Rewriting returns to the sandbox

Friday, June 6, 2008 at 4:07 PM



One technique introduced recently to help combat latency is automatic URL rewriting -- during preprocessing, orkut automatically replaces the URLs for static content (JavaScript files, style sheets, images, and so on) with proxied URLs.

Yesterday, we pushed a new version of the rewriting mechanism to the sandbox, along with a more granular way to disable rewriting during testing (when these static resources are likely to be in flux). Once your application is polished and ready for deployment, you should re-enable rewriting so that your app can take advantage of the automatic proxying, which reduces both the load on your servers and the loading time for your users.

To customize rewriting for your test applications, place the following block in your app spec's ModulePrefs element, replacing the bracketed placeholders with your own regular expressions and tag names:

<Optional feature="content-rewrite">
<Param name="include-urls">[REGULAR_EXPRESSION]</Param>
<Param name="exclude-urls">[REGULAR_EXPRESSION]</Param>
<Param name="include-tags">[COMMA SEPARATED LIST OF HTML TAG NAMES]</Param>
</Optional>

The first parameter, include-urls, takes a regular expression (standard syntax) that matches the full or partial URLs to be automatically proxied. The second parameter, exclude-urls, takes a regular expression that matches URLs to be explicitly excluded from the automatic proxying. Finally, the third parameter, include-tags, takes a comma-separated list of the HTML tag names that the preprocessor targets for rewriting.

For example, to disable rewriting completely (for testing only!), you can use the following block. Notice the .* in the second parameter -- this regular expression matches any sequence of characters, so every URL is effectively excluded here:

<Optional feature="content-rewrite">
<Param name="include-urls"></Param>
<Param name="exclude-urls">.*</Param>
<Param name="include-tags"></Param>
</Optional>

After your application is polished, you should include all URLs that aren't already delivered via a CDN such as Akamai. Since the exclude-urls parameter is applied after include-urls, you can filter out these CDN-hosted resources by placing an appropriate expression in exclude-urls -- this will be honored even if you specified .* in include-urls.
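
As a rough sketch of that setup, an app that serves most of its static files itself but pulls a few from a CDN might use a block like the one below. The host name cdn.example.com is a made-up placeholder, and the tag list is only an illustrative choice; substitute your own values.

```xml
<Optional feature="content-rewrite">
<!-- Proxy every URL the preprocessor encounters... -->
<Param name="include-urls">.*</Param>
<!-- ...except URLs containing the CDN host. cdn.example.com is a hypothetical host name. -->
<Param name="exclude-urls">cdn\.example\.com</Param>
<!-- Illustrative tag list; adjust to the elements your app uses. -->
<Param name="include-tags">img,script,link,embed</Param>
</Optional>
```

Because exclude-urls is applied after include-urls, the CDN URLs are left alone while everything else is proxied.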

One final note: if your app spec doesn't contain a content-rewrite block at all, orkut will automatically rewrite all URLs that it encounters in script, link, img, embed, and style elements.
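
Written out explicitly, that default would presumably look something like the block below. Treat it as an assumption rather than a documented equivalent: in particular, this post doesn't say how an empty exclude-urls value is interpreted, so the sketch assumes it excludes nothing.

```xml
<Optional feature="content-rewrite">
<!-- Match every URL. -->
<Param name="include-urls">.*</Param>
<!-- Assumption: an empty value here excludes no URLs. -->
<Param name="exclude-urls"></Param>
<!-- The elements this post lists as rewritten by default. -->
<Param name="include-tags">script,link,img,embed,style</Param>
</Optional>
```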

Updated 7/16/2008: Corrected the "include-tags" parameter description. Instead of being a regular expression, this should just be a comma-separated list of tags that should be rewritten. For example:

<Param name="include-tags">img,script,link,embed</Param>