Content Extractor

1.0a

12/28/99
Initial development. It drives the sample site on Linux just fine.

1.0b

12/29/99
Added error messages and development hints. Added delimiter variable for Mac compatibility. Changed PATH_INFO delimiter to $ for Mac compatibility and directory structure consistency.

1.0

1/17/00
Discontinued use of PATH_INFO -- the template is now the first argument. Fixed file reading so it will take any kind of line break. Cleaned up code, added comments, added error messages and other misc. improvements. Driving main Type A site on Linux, and also works on WebStar (Mac) server.

1.1

1/19/00
Added a registration checking routine. You can specify a delivery date, grace period, and registration code in the script when installing it on a client's web server, and it will stop functioning after the grace period expires, unless a file called registration.ini, containing the correct registration code, is added to the cgi-bin.
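
The check described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl used by the script, and the function name, parameter names, and the idea of passing the file contents in as a string are all assumptions made for the example:

```python
from datetime import date, timedelta
from typing import Optional


def registration_ok(today: date, delivery: date, grace_days: int,
                    expected_code: str, ini_text: Optional[str]) -> bool:
    """Return True if the script should keep functioning (hypothetical helper)."""
    # Inside the grace period, no registration file is needed.
    if today <= delivery + timedelta(days=grace_days):
        return True
    # After the grace period, registration.ini must contain the correct code.
    # ini_text is None when the file is absent from the cgi-bin.
    return ini_text is not None and ini_text.strip() == expected_code
```

Here ini_text stands in for the contents of registration.ini; how the real script reads and compares the code is not documented in this history.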

2.0b

1/25/00
Added editing mode to edit content live in the browser.

2.0

2/16/00
Cleaned up glitches in live editing mode. Added auto-save when leaving pages. Renamed script "content.cgi" -- content because it now does input and output; cgi to prevent hangs in MacPerl.

2.0.1

2/18/00
Fixed a line-spacing incompatibility between Netscape and IE during live content saves.

2.0.2

2/20/00
Allowed regex special characters during live content saves.

2.1

3/20/00
Changed the way arguments are passed to the script. You should now use two name/value pairs -- one for the template file, and one for the other arguments. So, a dynamic URL might look like this:

../cgi-bin/content.cgi?template=main.html&args=company.html,about

Also, the .html extension in your dynamic URL is now optional. If you don't specify an extension, Content Extractor will default to .html.
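
As a rough sketch of how such a URL might be taken apart, here is a Python illustration (the script itself is Perl). The helper names are hypothetical, and the sketch assumes that the first item in args is the content file, as in the example URL above:

```python
import posixpath
from urllib.parse import parse_qs


def with_default_ext(name, default_ext=".html"):
    """Append the default extension only when the name has none."""
    if name and not posixpath.splitext(name)[1]:
        return name + default_ext
    return name


def parse_dynamic_url(query):
    """Split a query string into (template, content_file, other_args)."""
    params = parse_qs(query, keep_blank_values=True)
    template = params.get("template", [""])[0]
    args = params.get("args", [""])[0].split(",")
    # Assumption: the first argument names the content file.
    return with_default_ext(template), with_default_ext(args[0]), args[1:]
```

With this sketch, template=main&args=company,about and template=main.html&args=company.html,about resolve to the same files.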

2.1.1

5/1/00
Oops, bug in the script which prevented the .html optional feature from working on some web servers. Fixed now.

2.1.2

8/17/00
Changed parsing routine so that if an arg is missing, that embed simply won't get processed; so you can parse an indeterminate number of embeds by including or leaving off your trailing args.

Added support for nested content files on multiple platforms (e.g. template=faculty&args=faculty/smith,bio,quotes).

Set the HTML header so that when running from a Mac server, the page is never cached. This will aid development within our office network.

2.2

10/26/00
Fixed bug that was interfering with live editing routines, and updated various pages in the edit directory.

Added content.php, a PHP wrapper that lets you embed PHP code in your templates or content files, or include PHP files as an argument to an <!--#embed file --> tag.

2.2.1

11/16/00
Returned backwards compatibility with old-style query strings, which was inadvertently lost with the bug fix in version 2.2.

2.5b

12/8/00
The extractor no longer chops the last character out of the files it processes, in case the last character is not a line break.

Added new embed type, random, which takes one argument, contentFile. The fields in contentFile must be named with consecutive numerals, but it can be a regular HTML pseudo-database file in every other respect.

Added new embed type, passthru, which takes one argument, value. Passthru simply writes the value of the argument into the HTML code. It can be used to display text, or to set the value of an HTML attribute, such as an image source.

Embed tags can now include hard-coded values for any or all of their arguments. For example, the tag <!--#embed field "my_content", arg1 --> will extract the field specified in the first URL argument, from the content file my_content.html.

Templates and content files can now be located in non-standard locations. For example, template=my_template will open the file ../templates/my_template.html, but template=../other/my_special_template will open the file ../other/my_special_template.

An alias file can now be used to shorten awkward URLs and help manage duplicate links throughout a site. If you create a pseudo-database file at ../content/alias.html, you can store a dynamic URL definition (template=xxx&args=y,z) in each field, with the field name serving as the alias to that URL. Then, you can open dynamic URLs by referencing just the alias. For example: http://www.mydomain.com/scripts/content.cgi?alias=mycomplicatedURL.

2.5b2

1/8/01
You can now enclose hard-coded embed tag values in either single or double quotes.

Also, the extractor recognizes other name/value pairs besides template= and args= in alias definitions.

2.5b3

1/28/01
You can now use a combination of hard-coded and variable values in embed tags, using the "+" symbol for concatenations. For example, with the URL ../scripts/content.cgi?template=default&args=main,joe, the tag <!--#embed field arg1,arg2+"_picture" --> will extract the field called "joe_picture" from the file main.html.
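
One way to picture the resolution of a single tag value is the Python sketch below. It is an illustration only: the function name is hypothetical, and it assumes that quoted literals never contain a "+" character, which the real parser may or may not require.

```python
import re


def resolve_value(expression, args):
    """Resolve one embed-tag value such as arg2+"_picture" (hypothetical helper)."""
    parts = []
    for piece in expression.split("+"):
        piece = piece.strip()
        match = re.fullmatch(r"arg(\d+)", piece)
        if match:
            # arg1 refers to the first URL argument, arg2 to the second, and so on.
            index = int(match.group(1)) - 1
            parts.append(args[index] if 0 <= index < len(args) else "")
        else:
            # Anything else is treated as a hard-coded value in single or double quotes.
            parts.append(piece.strip("\"'"))
    return "".join(parts)
```

For the URL in the example above, args would be ["main", "joe"], so arg1 resolves to main and arg2+"_picture" resolves to joe_picture.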

2.5b4

2/18/01
Embed tags that refer to blank arguments won't be parsed. This allows you to leave a field blank by leaving out an argument it uses, without affecting the positions of the other arguments. For example, in the URL ../scripts/content.cgi?template=default&args=content,field1,,field3, the first and third embed tags will be parsed, but the second one will simply be left blank.
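
The effect of a blank argument can be sketched like this in Python. The sketch assumes a template in which every embed tag uses arg1 as the content file and one later argument as the field; the function name is hypothetical.

```python
def embeds_to_parse(args_value):
    """Pair each field argument with the content file, or None when it is blank."""
    args = args_value.split(",")
    content_file, fields = args[0], args[1:]
    # A blank argument keeps its position, so later arguments do not shift.
    return [(content_file, field) if field else None for field in fields]
```

For args=content,field1,,field3 this yields a pair for the first and third tags and None for the second, which is left blank.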

Also, content.php will now work from any directory level.

2.5.1

3/5/01
Content.cgi now detects a "flash=1" flag in the URL, and if present, redirects to a standard Flash template movie in ../templates/templates.html, passing the template and args variables to the Flash templates.

Also, content.cgi gives a correct error message if it can't find the specified content file. Previously, the script failed with an Internal Server Error; now it prints "Error: Can't open the template file ../templates/nonexistent.html. No such file or directory."

2.5.2

3/29/01
The passthru embed type, used in conjunction with combination arguments (hard-coded plus variable), now works.

2.5.3

4/4/01
Content.cgi now looks for a replace property in embed tags, which triggers some HTML cleanup required for display within a Flash template. For example, <!--#embed field arg1,arg2 replace spaces,paragraphs -->.

The allowed replace values are:

spaces - removes all line breaks and double spaces
paragraphs - adds a <br> tag after every <p> tag
ampersands - replaces all ampersands with the ` character, which can be restored within Flash
links - changes the color of all links to the link color specified in the template's body tag,
and sets the target of all links to "_new"; link colors can also be overridden
in each embed tag: <!--#embed field arg1,arg2 replace link-000099 -->
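
The first three replace values can be approximated with the Python sketch below. It is not the script's own code: the function name is made up, "double spaces" is assumed to mean that runs of spaces collapse to a single space, and the links value is left out because its color handling depends on the template.

```python
import re


def apply_replace(html, options):
    """Apply the spaces, paragraphs, and ampersands cleanups in the order given."""
    for option in options:
        if option == "spaces":
            html = re.sub(r"[\r\n]+", "", html)   # remove all line breaks
            html = re.sub(r" {2,}", " ", html)    # assumption: collapse runs of spaces
        elif option == "paragraphs":
            # Add a <br> tag after every opening <p> tag.
            html = re.sub(r"(<p\b[^>]*>)", r"\1<br>", html, flags=re.IGNORECASE)
        elif option == "ampersands":
            html = html.replace("&", "`")
    return html
```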

Also, you can now include more than one embed tag on a single line of your template, and they will be correctly parsed.

2.6

4/17/01
Content.cgi now supports several new embed types, which can be used to quickly create linear navigation elements for presentations or e-learning products. See the extractor demo for examples. The new types, with the arguments they require, are:

nav_current - returns the position of the current field in the file (contentFile, contentField)
nav_count - returns the number of fields in the file (contentFile)
nav_next - returns the name of the next field in the file (contentFile, contentField)
nav_previous - returns the name of the previous field in the file (contentFile, contentField)
nav_first - returns the name of the first field in the file (contentFile)
nav_last - returns the name of the last field in the file (contentFile)

The next and previous tags loop automatically: next returns the first field when going beyond the last field, and previous returns the last field when going before the first field.

You can also skip fields with the next and previous tags, which is handy if your content file is organized like page1, page1_title, page2, page2_title. In this case, you can use <!--#embed nav_next2 arg1,arg2 --> on page1 to get to page2, because "nav_next2" tells the extractor to move ahead two fields.
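
The looping and skipping behavior amounts to modular arithmetic over the list of field names. Here is a small Python sketch; the function name and the idea of passing the fields as a list are assumptions for illustration.

```python
def nav_target(fields, current, step):
    """Return the field name 'step' positions from 'current', wrapping around."""
    position = fields.index(current)
    # The modulo makes next wrap to the first field and previous wrap to the last.
    return fields[(position + step) % len(fields)]
```

In this sketch nav_next2 corresponds to a step of 2 and nav_previous to a step of -1, so with fields page1, page1_title, page2, page2_title a step of 2 from page1 lands on page2.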

2.6.1

4/18/01
Ampersand replacement (for use with Flash templates) now uses the string "*amp*" rather than the character "`" as a delimiter.

3.0

5/14/01
The content extractor toolset now includes a major new feature: built-in logging of site statistics. Third-party stats packages don't work well with the content extractor, because they list most hits as content.cgi, giving no useful information about pages visited. But the new stats functionality logs the argument strings, along with the user's IP address, host, referrer, and user agent. To install the stats functionality, copy the stats folder to your site -- it contains stats.cgi, an empty data folder, plus template.html, which you can edit to match the rest of your site. You can also place an index.html page in stats, to redirect to stats.cgi, so you can view the stats at http://www.yourhost.com/stats. Better yet, use .htaccess in the stats folder to set stats.cgi as the index page, and to password protect the directory.

Content.cgi can now read templates over an http connection, rather than directly through the filesystem. This may slow down site performance a bit, but allows you to use CGI scripts or other dynamic pages as templates, because any server-side scripting in those files will execute before the files are parsed by content.cgi. If you've ever wished you could use PHP logic to construct an embed tag in a template, you'll appreciate this feature. Note: reading over http requires that LWP/Simple be installed on the web server; if not present, content.cgi will read files through the filesystem as in previous versions, and display a warning message.

Both the stats and the http reading can be turned on or off sitewide with another new feature, the content.ini configuration file. Content.ini is optional, but also provides settings for the default file extension (which still defaults to .html if not specified in this file) and the registration code (which makes the registration.ini file obsolete).

3.0.1

5/18/01
Content.cgi can now read any file, whether a template or a content file, over http, rendering any dynamic content in those files before merging content into templates.

Since this pre-rendering may decrease site performance, the content.ini file also gives you more control over which templates and content files are pre-rendered. The setting "pre-render files" (formerly "parse template") in content.ini can now take a comma-delimited list of match strings as well as a simple binary value. For example, if you specify "pre-render files: template, cgi," only files containing the strings "template" or "cgi" in their paths will be read over http. Entering a value of 1 for this setting will pre-render all files, while 0 will pre-render no files, as will omitting the setting from content.ini.
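
The three forms of this setting can be summarized in a short Python sketch. The function name is hypothetical, and the sketch assumes simple substring matching against the file path, as the description above suggests.

```python
def should_prerender(setting, path):
    """Decide whether a file is read over http, given the 'pre-render files' value."""
    value = (setting or "").strip()
    if value in ("", "0"):
        return False          # setting omitted or turned off
    if value == "1":
        return True           # pre-render every file
    # Otherwise treat the value as a comma-delimited list of match strings.
    patterns = [p.strip() for p in value.split(",") if p.strip()]
    return any(pattern in path for pattern in patterns)
```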

3.0.2

6/22/01
Content.cgi now passes any arguments through to files it prerenders, so rendered code in templates or content files can use those arguments in their logic.

Since the pre-rendering feature seems to be working fine, the PHP wrapper (content.php) is now officially obsolete and has been removed.

3.0.3

7/10/01
A few bug fixes and code cleanups, inspired by the development of the parallel Content Extractor for Director (content.dxr), were implemented in content.cgi. Most notably, a bug that caused the wrong pages to appear when looping backwards from the first to the last page using the nav_previous embed type plus an offset number has been fixed. Also, the random embed type no longer requires that fields be numbered sequentially -- any field names will work, as long as they are unique.

The stats script, like the Extractor for Director, has been given its own version number and Version History document. Stats.cgi is currently at version 1.0, while content.dxr is at 1.0b.

3.0.4

8/5/01
Added support for nav_current2 and nav_count2, for use with content files that group fields. These embed types function similarly to nav_previous2 and nav_next2, which were introduced in version 2.6.

3.0.5

8/22/01
Changed formatting and wording of extractor error messages to help clients distinguish between the different types of errors described in the Interpreting Error Messages document.

3.0.6

8/31/01
Enhanced page naming in the stats logging section. If the original request was for a page other than content.cgi, its request URL is displayed in parentheses. Previously, the content extractor would log the URL of the resultant error page.

3.0.7

9/11/01
A small bug which prevented fields from being found within a pre-rendered content file has been fixed.

However, the "PHP wrapper," content.php, has been reintroduced to the content extractor suite of tools, because attempts to replace the wrapper technique with pre-rendered content or template files have proved impractical in some cases. For example, putting a database query into a template and form fields in a content file wouldn't work because the form fields would render separately from the query that provided their data. Conversely, if the query was added to each field of the content file that needed it, then that query would run multiple times when the content file was rendered by content.cgi. And cookie-based sessions don't work, because Perl, not the client, is requesting the PHP session information.

Pre-rendering files might still be useful to build extractor tags with PHP, or to include small bits of PHP inside content files, but content.php will be a better solution for more complex PHP scripts.

3.1

9/11/01
Content.cgi has taken a conceptual leap forward with support for logic-based constructions. These are expressed with a new kind of tag, the conditional tag. For example:

<!--#if1 (arg1=="home") --><img src=logo_large.gif><!--#end if1 -->
<!--#if2 (arg1!="home") --><img src=logo_small.gif><!--#end if2 -->

would display a large logo on the home page and a smaller logo on all other pages. This effectively creates dynamic templates, not just dynamic pages, and allows multiple templates with relatively small differences to be combined into a single file.

The conditional tags can enclose any content, including embed tags, and can be nested. However, if you include more than one conditional tag, you must number them as in the example above. The tags will be processed by content.cgi in numerical order.

The expressions can theoretically contain any Perl code, for example:

<!--#if (substr(arg1,7) == arg2."_button") -->Hey, I'm one smart template!<!--#end if -->

3.1.1

9/23/01
Content.cgi no longer prints comments into your dynamic page to indicate the results of conditional constructions. The comments caused problems when embedding inside of HTML tags or JavaScript blocks.

3.1.2

10/7/01
An additional routine was added to the primary variable setup to determine the local path of the scripts directory on a Windows server. Some Windows servers were resolving relative paths based on the server root rather than the current directory, which was causing the getFile function to fail. This change should make content.cgi work more consistently across platforms.

3.1.3

11/11/01
Content.cgi now writes error events to the stats log in a more human-readable format. Assuming that any execution of content.cgi where "content.cgi" is not a part of the HTTP request was a redirect from another page, the extractor now writes "Redirect," then the redirect status code, if any, then the full path to the original request. This is an improvement on the logging functionality originally added in version 3.0.6.

3.2

11/25/01
Several changes were made to prevent content.cgi from being used to undermine the security of its host computer. Most significantly, nonstandard file locations must be declared in content.ini before content.cgi will read files from them. Anything included in ../content/ and ../templates/, including their subfolders, is allowed by default, but if you want to read content files, for example, from a content2 and content3 folder, add the following line to content.ini:

read files from: ../content2/, ../content3/

You could allow reading from your whole site with this line...

read files from: ../

...but that undermines the purpose of this security feature to some extent.

As before, references to files in nonstandard locations must begin with ../, otherwise content.cgi will look for them in ../content/ or ../templates/.

Another security enhancement is that content.cgi will not read any file whose name begins with a ".". This makes it impossible to display the contents of .htaccess or .htpasswd files, for example.
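
The two file-reading rules above could be combined along the lines of the following Python sketch. It is only an approximation of the idea: the function name is hypothetical, paths are normalized first so that sequences like ../content/../../ cannot slip through, and extra_dirs stands in for the "read files from" list in content.ini.

```python
import posixpath

DEFAULT_DIRS = ["../content/", "../templates/"]


def may_read(path, extra_dirs=()):
    """Return True if a file path passes both security rules (hypothetical helper)."""
    normalized = posixpath.normpath(path)
    # Rule 1: never read a file whose name begins with a ".".
    if posixpath.basename(normalized).startswith("."):
        return False
    # Rule 2: the file must sit inside an allowed directory or one of its subfolders.
    for directory in list(DEFAULT_DIRS) + list(extra_dirs):
        base = posixpath.normpath(directory)
        if normalized.startswith(base + "/"):
            rest = normalized[len(base) + 1:]
            if rest.split("/")[0] != "..":
                return True
    return False
```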

And developers can now choose to hide content extractor error messages, by adding the following line to content.ini:

show errors: 0

In that case, extractor errors will be sent to the browser as commented HTML, rather than visible HTML, but will be logged in the stats file as before.

A final minor change is that a leading "." is now optional in the default file extension setting of content.ini.

3.2.1

12/3/01
Fixed a bug in 3.2's nonstandard paths functionality which prevented path settings in content.ini from taking hold if ../templates/ and ../content/ weren't specified. Those two paths are now assumed and don't need to be specified in the ini file.

3.2.2

1/20/02
The stats routine of content.cgi now writes extractor errors and HTTP errors to the errors file, rather than including HTTP errors in the primary stats file. This change accommodates new functionality in stats.cgi version 1.2.2, and requires that version of the script or later for best results.

3.2.3

2/4/02
Previously, a request for a field called ##home would return a field called ##home_placeholder if ##home_placeholder appeared in the content file before ##home. The matching behavior now avoids this situation and only returns an exact match.
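
The difference between prefix matching and exact matching can be shown with a short Python sketch. The layout of content files is not spelled out in this history, so the sketch assumes that each field starts on a line consisting of the marker plus the field name and runs until the next marker line; the function name and marker parameter are hypothetical.

```python
def get_field(text, name, marker="##"):
    """Return the content of the field whose name matches exactly, else None."""
    collected = None
    for line in text.splitlines():
        if line.startswith(marker):
            if collected is not None:
                break                      # the next field begins, so stop collecting
            if line[len(marker):].strip() == name:
                collected = []             # exact match only, not a prefix match
        elif collected is not None:
            collected.append(line)
    return None if collected is None else "\n".join(collected).strip()
```

Because the comparison is against the whole field name, a request for home skips over home_placeholder even when that field comes first in the file.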

4.0

3/5/02
The most substantial update to the content extractor so far, version 4.0 uses a different format for dynamic URLs and embed tags and is not backwards-compatible with any previous version. It provides usability enhancements for content editors, a clearer architecture for developers, and an integrated site flattener for delivering web sites on local drives or servers without Perl.

Fundamental changes

Instead of two key-value pairs, template and args, dynamic URLs now consist of a template variable plus a variable for each embed tag in the template. Embed tag names correspond to variable names in the URL. For example, a template containing three tags...

<!--#embed env_http_user_agent -->
<!--#embed env_remote_addr -->
<!--#embed env_request_uri -->

In addition to the CGI-standard variables, one additional variable is available. This tag displays the modification date of the newest content file or template used to build the dynamic page:

<!--#embed env_last_modified -->

Environment variables, by the way, are also available in conditional tags using Perl syntax:

<!--#if1 ($ENV{'HTTP_USER_AGENT'} =~ "Mac") -->Hello, Macintosh user!<!--#end if1 -->

A few new options have been added to content.ini. "prevent page caching" modifies the header of each dynamic page to prevent the browser from caching it, which may be helpful during troubleshooting. "show expanded URLs" will instantly redirect any short-form dynamic URL to its full format, another troubleshooting aid. And "external link targets" allows you to set the target attributes of anchor tags for absolute links sitewide. If an anchor tag already contains a target attribute, this setting won't override it.

In addition to these new options, you can now set options temporarily on a page-by-page basis by including the setting in the dynamic URL. For example, this URL...

../scripts/content.cgi?page=home&show_expanded_URLs=1

...will redirect to a page that displays the full version of the home URL, even if "show expanded URLs" is turned off in content.ini. The only option that can't be overridden in this way is "read files from," because overriding it would undermine the server security provided by that option.
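
The override behavior might be pictured with the Python sketch below. The function names are hypothetical, the "name: value" line format is inferred from the content.ini examples in this document, and the list of overridable options is an assumption that includes only options named in this history.

```python
from urllib.parse import parse_qs

# Assumption: options mentioned in this history, minus "read files from".
OVERRIDABLE = {"prevent page caching", "show expanded urls",
               "external link targets", "show errors"}


def parse_ini(text):
    """Read 'name: value' lines into a dictionary keyed by lowercase name."""
    settings = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            settings[key.strip().lower()] = value.strip()
    return settings


def effective_settings(ini_text, query):
    """Apply per-page URL overrides on top of the sitewide settings."""
    settings = parse_ini(ini_text)
    for key, values in parse_qs(query).items():
        option = key.replace("_", " ").lower()
        if option in OVERRIDABLE:          # "read files from" is never overridden
            settings[option] = values[-1]
    return settings
```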

The PHP wrapper

Content.php has also been updated to better accommodate a variety of hosting situations. To use content.php with the shortened URLs described above, simply change...

../scripts/content.cgi?page=home to ../scripts/content.php?page=home

To use it in conjunction with .htaccess files, change...

../dynamic/home.html to ../dynamic/home.php

The site flattener will not render PHP that the PHP wrapper would normally render. If it encounters a URL that uses the PHP wrapper, it will save the file to filename.php so that the PHP will still be active in the flattened site.

Miscellaneous fixes

Content.cgi will now filter attempts to read files outside of the standard directories when the filenames begin with ./ as well as ../. This is an enhancement to the security functionality introduced in version 3.2.

If a variable referred to in a conditional tag expression is not available in the current URL, you can still achieve a true condition by testing for (!var[0]) or (var[0] != 1). Previously, the expression evaluated false no matter what you tested for.

4.0.1

3/18/02
Fixed a bug where content.cgi?page=test&test=foo would generate an invalid page error even if a page named test was defined in pages.txt. This was occurring because content.cgi was treating the second argument as an override for the value of test in pages.txt, when really, URL overrides shouldn't affect parsing of the pages.txt file. Now, the overrides will only affect parsing of the content.ini file.

4.0.2

5/16/02
Fixed a small bug that prevented the external link target setting from working when the external link value was quoted.

4.1

5/26/02
When creating templates, you no longer need to number conditional tags to show the content extractor which tags go together; you can just include if/end if tags, nested in any combination, and the extractor will parse them correctly. The extractor also supports the use of else tags now. For example:

<!--#if (main[1]=="home") -->
This is the home page
<!--#else -->
This is another page
<!--#end if -->

With these changes, conditional tags in the content extractor should behave just as they do in any C++-style programming language.
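
Matching unnumbered if/else/end if tags is a classic use of a stack. The Python sketch below shows one way to do it; it is not the extractor's own routine. The function name is made up, and because the real expressions are Perl, the sketch takes a caller-supplied evaluate function instead of evaluating anything itself.

```python
import re

TAG = re.compile(r"<!--#(if|else|end if)\s*(?:\((.*?)\))?\s*-->")


def render_conditionals(template, evaluate):
    """Keep only the text inside true branches; 'evaluate' judges each expression."""
    output = []
    stack = []        # one (parent_active, condition_result) pair per open if
    active = True     # whether text at the current position is being kept
    position = 0
    for match in TAG.finditer(template):
        if active:
            output.append(template[position:match.start()])
        position = match.end()
        kind, expression = match.group(1), match.group(2)
        if kind == "if":
            result = bool(evaluate(expression)) if active else False
            stack.append((active, result))
            active = active and result
        elif kind == "else":
            parent_active, result = stack[-1]
            active = parent_active and not result
        else:  # end if
            parent_active, _ = stack.pop()
            active = parent_active
    if active:
        output.append(template[position:])
    return "".join(output)
```

The stack remembers whether the enclosing block was active, so an inner else can never switch on text inside an outer block that is switched off.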

You can now strip all markup and trim surrounding line breaks from the content that you embed into your templates by adding the word "strip" to the end of your embed tags. For example, any content embedded with the tag <!--#embed main strip --> will have markup removed on the fly by the extractor. This is useful for embedding data into JavaScript or other code, where formatting that content editors add in the HTML content files might disrupt the operation of the code.

The content extractor no longer puts double-quotes around external link targets when writing them into your content files. It does, however, copy any double- or single-quotes that you include in the content.ini setting. For example, setting external link target: '_blank' will cause the extractor to write target='_blank' into your <a> tags. By the way, you can disable this feature without removing the setting from your content.ini file by setting external link target: 0.

Perhaps more importantly, the extractor no longer adds a target attribute to links if a target has already been set. This should improve compatibility with different web browsers by ensuring that only one target setting appears.
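
The target handling described in the last two paragraphs could look something like this Python sketch. The function name is hypothetical, and "absolute link" is assumed to mean an href beginning with http:// or https://, which this history does not actually define.

```python
import re

ANCHOR = re.compile(r"<a\b[^>]*>", re.IGNORECASE)


def add_link_targets(html, target_setting):
    """Add target=<setting> to absolute links that have no target yet."""
    if target_setting in ("", "0"):
        return html                        # feature disabled

    def fix(match):
        tag = match.group(0)
        is_absolute = re.search(r"href\s*=\s*[\"']?https?://", tag, re.IGNORECASE)
        has_target = re.search(r"\btarget\s*=", tag, re.IGNORECASE)
        if is_absolute and not has_target:
            # The setting is copied as written, including any quotes it contains.
            return tag[:-1] + " target=" + target_setting + ">"
        return tag

    return ANCHOR.sub(fix, html)
```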

Previously, the field marker used to separate fields in HTML content files was always ##. You can now use any field marker by specifying it with the "field marker" setting in content.ini. If you don't set this, the extractor will use ## as the default.

4.1.1

5/28/02
You can now load page definitions from a file other than ../content/pages.txt, if you specify an alternative file using the "read page definitions from" setting in content.ini. Besides offering the ability to customize the name and location of this file, this feature also allows multiple site structures (such as you might create when building a multilingual site) to share the same page definitions. For example, sites driven from /en/scripts/ and /es/scripts could share the same page definitions at ../../content/pages.txt. If you don't include this setting, the extractor will load ../content/pages.txt by default.

Another change intended to support multilingual site development regards the way that the extractor finds and loads external content and template files. The extractor now reads files relative to the directory location that appears in the browser window, rather than the physical location of the content.cgi script. This would allow you, for example, to create English content at /en/content and Spanish content at /es/content, and create symbolic links at /en/scripts and /es/scripts that both point to /scripts. Even though the English and Spanish sites both use the same content.cgi script, content.cgi will load the English or the Spanish content depending on the URL from which it was accessed.

Finally, this version fixes a bug that hung content.cgi in some cases if a conditional tag didn't contain a space between the text and the closing comment tag. For example, <!--#end if--> would hang the extractor, where <!--#end if --> would not. The latter style is preferred, but the content extractor was designed to allow some flexibility in the writing of embed tags, so the former should now also work.

4.1.2

6/4/02
Updated an internal path reference to ensure that the content extractor could read files on IIS servers.

5.0

8/13/02
The most substantial update since version 4.0, this version adds significant new features, new names and locations for many of the files, and a new name for this development tool: Contemplate.

Fundamental changes

All files associated with Contemplate are now located in a single directory called "contemplate" at the root level of your website. This organization should help distinguish the Contemplate files from your own site files. Here's a summary of the renamed files:

contemplate/assembler.cgi was: scripts/content.cgi
contemplate/assembler.ini was: scripts/content.ini
contemplate/assembler_wrapper.php was: scripts/content.php
contemplate/pages.txt was: content/pages.txt
contemplate/reporter/reporter.cgi was: stats/stats.cgi
contemplate/reporter/inspect.cgi was: stats/focus.cgi

Unfortunately, due to this broad reorganization, sites built with Contemplate 4.1.2 or earlier are not directly compatible with Contemplate 5.0 or later. Fortunately, updating the site to work with Contemplate 5.0 will be easier than updating a 3.x site to 4.0. In many cases, you need only search your site files and replace "scripts/content" with "contemplate/assembler."

One larger conversion task, though, is due to the fact that Contemplate no longer adds a default file extension to the file names you provide it. If your dynamic URLs, or the entries in your page definitions file, rely on a default file extension setting in assembler.ini, you'll need to edit these locations to specify the file extension. Contemplate 5.0 requires file extensions because they provide more reliable execution of the builder component and search routines.

Finally, Contemplate is now available in other languages for the first time. To address the performance concerns of Unix webmasters, a PHP version is available, and to address the compatibility needs of Windows webmasters, an ASP version is available. These new ports will behave identically to the original Perl version, except where stated in this documentation.

New functionality

The field and random embed types were enhanced to provide support for XML-based content files. Previously, content files had to be organized using HTML tables and the field marker. Now, you can organize your content using XML tags. Contemplate currently recognizes two XML tags: content and group.

For example, if the following content were saved into a file called sample.xml...

<content name="myname">John Doe</content>

<group name="measurements">
<content name="height">67</content>
<content name="weight">115</content>
</group>

...and you had a template called default.html that contained embed tags called main, height, and weight, you could access the relevant content with the following URL:

../contemplate/assembler.cgi?template=default.html&main=field,sample.xml,myname&height=field,sample.xml,measurements/height&weight=field,sample.xml,measurements/weight

Notice that in the case of standalone content tags, you can access the content using a field embed tag in the same way that you would access content in an HTML content file. In the case of grouped content tags, you can access the content by specifying a "path" to the desired content tag, separating all enclosing group names with slashes. With this technique, you can organize content by nesting it to as many levels as you wish.
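
The path lookup can be sketched in Python with the standard ElementTree parser. This is an illustration, not Contemplate's own parsing routine: the function name is hypothetical, the file is wrapped in a made-up root element because the sample has several top-level tags, and only plain text inside a content tag is returned.

```python
import xml.etree.ElementTree as ET


def xml_field(xml_text, path):
    """Look up a content tag by name, following group names separated by slashes."""
    # Assumption: the file has no XML declaration, so it can be wrapped in a root tag.
    root = ET.fromstring("<root>" + xml_text + "</root>")
    *group_names, content_name = path.split("/")
    node = root
    for group_name in group_names:
        node = next((g for g in node.findall("group")
                     if g.get("name") == group_name), None)
        if node is None:
            return None
    for content in node.findall("content"):
        if content.get("name") == content_name:
            return content.text
    return None
```

With the sample file above, the path myname finds the standalone tag, while measurements/height descends into the group first.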

The new XML parsing routines take effect when your content file has a .xml extension. You may mix HTML and XML content files freely throughout a project.

A completely new embed type provides another new way to organize content. The form embed type allows you to access default values of an HTML form in a content field. For example, if a field called "125" in the file "employees.html" contained the following content...

<form name=foo>

Name
<input type=text name=name value="George Jones">

Role
<select name="role">
<option value="Developer" selected>Developer</option>
<option value="Project Manager">Project Manager</option>
<option value="Instructional Designer">Instructional Designer</option>
</select>

Skills
<input type=checkbox name=skills value="HTML" checked>HTML 
<input type=checkbox name=skills value="JavaScript">JavaScript 
<input type=checkbox name=skills value="PHP" checked>PHP

</form>

...and you had a template called default.html that contained embed tags called name, role, and skills, you could access the relevant content with the following URL:

../contemplate/assembler.cgi?template=default.html&main=field,employees.html,myname&name=form,employees.html,125,name&role=form,employees.html,125,role&skills=form,employees.html,125,skills

Contemplate can automatically access values from text, radio, checkbox, select, and textarea form elements. When checkbox or select elements contain multiple values, Contemplate returns a comma-delimited list of values.
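
To make the idea concrete, here is a Python sketch that pulls default values out of a form using the standard html.parser module. It is not how Contemplate itself is written; the class and function names are hypothetical, and textarea elements are left out to keep the example short.

```python
from html.parser import HTMLParser


class FormDefaults(HTMLParser):
    """Collect the default values of text, radio, checkbox, and select elements."""

    def __init__(self):
        super().__init__()
        self.values = {}       # element name -> list of default values
        self._select = None    # name of the select element currently open

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("name")
        value = attributes.get("value") or ""
        if tag == "input":
            kind = (attributes.get("type") or "text").lower()
            if kind in ("checkbox", "radio"):
                if "checked" in attributes:
                    self.values.setdefault(name, []).append(value)
            elif kind == "text":
                self.values.setdefault(name, []).append(value)
        elif tag == "select":
            self._select = name
        elif tag == "option" and self._select is not None and "selected" in attributes:
            self.values.setdefault(self._select, []).append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def form_value(html, name):
    """Return the default value(s) of one form element as a comma-delimited string."""
    parser = FormDefaults()
    parser.feed(html)
    return ",".join(parser.values.get(name, []))
```

Multiple checked boxes or selected options are joined with commas, matching the comma-delimited list described above.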

The form embed type can be useful either for setting up database-like data structures, or for enforcing standard types and formats for your content. Developers can create a "shell" HTML form in one content field, and then content editors can duplicate that form and change the default values in each new instance.

And the search embed type now has a completely different meaning than before. Previously, you could specify two strings, and Contemplate would return any text occurring between those two strings. Now, you can use the search type to create standard site search mechanisms. Simply add a search form like this to any page...

<form name=search method=get action=../assembled/search_results.html>
<input type=text name=search_string> 
<input type=submit value=Submit>
</form>

...then, in the URL for the search_results page, set an embed tag to the type "search" with no other arguments. Contemplate will search all your content files for instances of the search string, rank the pages by frequency, and embed the results in the location of the search tag.
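
Ranking by frequency can be illustrated with a few lines of Python. The sketch assumes case-insensitive substring counting and a dictionary of page names mapped to their text; neither detail is confirmed by this history, and the function name is hypothetical.

```python
def search_pages(pages, query):
    """Return (page, hit_count) pairs for pages containing the query, best first."""
    needle = query.lower()
    if not needle:
        return []
    hits = []
    for name, text in pages.items():
        count = text.lower().count(needle)
        if count > 0:
            hits.append((name, count))
    # Most frequent first; ties are broken alphabetically by page name.
    return sorted(hits, key=lambda item: (-item[1], item[0]))
```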

Picking up where the strip attribute left off, you can now perform multiple search and replace operations on your content as you embed it into your templates. For example, if you write an embed tag like this...

<!--#embed main replace /"/&quot;/ -->

...assembler.cgi will replace all double-quotes with the &quot; entity. You can combine the strip and replace attributes, you can include multiple replace attributes in your embed tag, and you can use regular expressions in your search values. For example, this tag...

<!--#embed main strip replace /"/&quot;/ replace /\s+/ / -->

...will tell assembler.cgi to first strip all markup tags from the content, then replace all double-quotes with HTML entities, then replace all multiple spaces with a single space. The strip operation will always be performed before the replace operations, regardless of its position in the embed tag; but replace operations will be performed in the order in which they appear in the embed tag. If you need to do some replacements before stripping tags, you can use a regular expression to strip tags rather than the strip attribute. For example, the three replace attributes in this tag...

<!--#embed main replace /<p>/\*/ replace /<[^>]*>// replace /"/&quot;/ -->

...will replace all paragraph tags with asterisks, then remove all markup tags, then replace double-quote characters with their HTML-entity equivalent.
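
The ordering rule (strip first, then each replace in the order written) can be sketched in Python as follows. This is an approximation with a hypothetical function name: replacement text is treated literally, so back-references such as $1 are not handled, and search values cannot contain a forward slash.

```python
import re

REPLACE = re.compile(r"replace\s+/([^/]*)/([^/]*)/")


def apply_embed_options(tag, content):
    """Apply the strip and replace attributes of an embed tag to some content."""
    body = tag.replace("<!--#embed", "", 1).rsplit("-->", 1)[0]
    replacements = REPLACE.findall(body)            # kept in the order written
    remaining_words = REPLACE.sub(" ", body).split()
    if "strip" in remaining_words:
        # strip always runs first, wherever it appears in the tag.
        content = re.sub(r"<[^>]*>", "", content).strip("\r\n")
    for search, replacement in replacements:
        # A function is used so the replacement text is inserted literally.
        content = re.sub(search, lambda m, r=replacement: r, content)
    return content
```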

The replace attribute does have a couple of known limitations. Currently, you can't use a forward slash symbol (/) or the string "-->" in search and replace values.

Because of limited usage, the body tag type was removed. If you wish, you can achieve the same results using the field type and the replace attribute:

<!--#embed field replace /<body[^>]*>(.*)<.body>/$1/ -->

Because of limited usage and overlap with the new XML functionality, the tag embed type was also removed.

Miscellaneous fixes

Some navigation elements, titles, and the overall appearance of the Flattener were adjusted to conform to the other Contemplate utilities.

When you use the show_expanded_URLs option, the expanded URL won't include the show_expanded_URLs argument, which was redundant.

The PHP wrapper for assembler.cgi can't pass the show_expanded_URLs flag through to the assembler, and was displaying an error message when it encountered the flag. Now, it will reload the page without the flag, which allows the page to display without errors. Note that the PHP wrapper should no longer be needed for sites on PHP-capable servers, which can use the PHP port of Contemplate. However, we'll leave the wrapper file available in the Perl version to ease migration of older sites.

The PHP wrapper now passes environment variables on to assembler.cgi, so traffic reports will provide more detail. Previously, all requests to "wrapped" pages adopted the IP address, host, and user agent of the PHP server, rather than of the site visitor.

Previously, if your page definitions file contained a page called "abcde" followed by a page called "cde," a request for page cde would bring up page abcde. This has been fixed.

Previously, some of the navigation functions didn't work properly when files were saved with DOS line endings. Assembler.cgi now has a better routine for conforming line endings to ensure cross-platform compatibility.
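
Conforming line endings usually comes down to two replacements, shown here as a Python sketch (the function name is hypothetical, and the real routine in assembler.cgi may differ):

```python
def normalize_line_endings(text):
    """Convert DOS (CR LF) and old Mac (CR) line endings to Unix (LF)."""
    # Replace CR LF pairs first so they do not turn into two line breaks.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```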

In some cases, the navigation functions didn't work properly when the page definition file contained extra line breaks between sections. Assembler.cgi now has a better routine for splitting up sections and providing this functionality.