Compressed HTML makes your pages zippy

Filed: Fri, Jan 05 2007 under Programming | Tags: .htaccess rewrite compressed mod_rewrite

What if I were to tell you that, for a few lines of code in your .htaccess file, you could shave off at least 50% of your non-binary bandwidth usage and make your pages load noticeably faster, all without changing a single line in any of your pages? It's crazy, but it's true!

This hack is especially useful for external JavaScript and CSS files. The popular Prototype library, for instance (version 1.4.0), weighs in at a hefty 47k, but if you gzip it and use this hack you're only going to be sending 10k down the tubes (roughly a fifth the size of the original). And that's just one file: you can zip up your custom JS files, external CSS files, and basically any static, non-scripted file (anything that isn't PHP, ASP, etc.). Do the math and it really adds up to major bandwidth savings, all while benefiting your visitors (the slower their connection, the bigger the benefit). And best of all, this hack costs virtually nothing to implement.
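If you want to "do the math" for your own site, here's a quick sketch in Python (my pick for the examples, nothing the hack depends on). The 47k and 10k figures are the Prototype numbers from above; the function names and the 1,000 requests are made up for illustration.

```python
def bandwidth_saved_kb(original_kb: float, compressed_kb: float, requests: int) -> float:
    """Kilobytes not sent when every request gets the compressed copy."""
    return (original_kb - compressed_kb) * requests


def percent_saved(original_kb: float, compressed_kb: float) -> float:
    """Share of the original size that compression removes, in percent."""
    return round(100 * (1 - compressed_kb / original_kb), 1)


# Figures from the Prototype example; 1,000 requests is a made-up number.
print(percent_saved(47, 10))             # share of each download saved
print(bandwidth_saved_kb(47, 10, 1000))  # total kilobytes saved
```

With those inputs that works out to about 78.7% saved per download, or 37,000k over a thousand requests for that one file.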

So let's get started!

In your web directory there's a file called .htaccess. This file lets you tweak the server settings without having to touch the REAL server configuration files. One of the things you can control from .htaccess is a system known as the rewrite engine (Apache's mod_rewrite). Basically, this lets you use regular expressions to test and modify the URL before the web server ever gets around to actually serving the file.

The trick is that with just a few lines in our .htaccess file we can check whether the browser accepts compressed files (almost all of them do, including Firefox and IE). If it does, and there's a gzipped copy of the requested file, we can serve the compressed file instead of the uncompressed one, automatically and invisibly. It's completely transparent to your HTML and to the user's browser.

Step 1 -- GZip your files.

First take a common static file, like an external JavaScript (.js) file or an external stylesheet (.css). Create a gzipped copy (you can get a free compressor at 7-Zip), then upload both the original and the copy to your web server. Remember, this only works with STATIC files -- PHP, CGI, ASP, Python, Perl, Ruby and whatnot are all off-limits! If you were working on toolbox.js, your web server should now have a toolbox.js file AND a toolbox.js.gz file (and the .gz file should be dramatically smaller).
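If you have more than a handful of files, you may prefer to script this step instead of zipping by hand. Here's a minimal sketch using Python's standard gzip module. The list of extensions and the function names are my own assumptions, so adjust them to whatever counts as static on your site.

```python
import gzip
import shutil
from pathlib import Path

# Assumed set of static extensions; adjust to match your own site.
STATIC_EXTENSIONS = {".js", ".css", ".html", ".htm", ".txt", ".xml"}


def gzip_copy(source: Path) -> Path:
    """Write a gzipped copy (name + '.gz') next to source and return its path."""
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as src, gzip.open(target, "wb", compresslevel=9) as dst:
        shutil.copyfileobj(src, dst)
    return target


def gzip_static_files(root: Path) -> list:
    """Create a .gz copy of every static file found under root."""
    created = []
    # sorted() collects the file list first, so the new .gz files are not revisited.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in STATIC_EXTENSIONS:
            created.append(gzip_copy(path))
    return created
```

Calling gzip_static_files(Path("public_html")) on a local copy of your site (the folder name is just an example) leaves the originals untouched and drops a .gz twin beside each one, ready to upload.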

Step 2 -- Modify .htaccess

After you've created gzipped copies of the static files you want to send compressed, make a backup copy of the .htaccess file in your web server's home directory (or, if it doesn't exist, create it). Next, edit the .htaccess file and add the following lines.

RewriteEngine on
RewriteOptions Inherit

#Check to see if browser can accept gzip files.
RewriteCond %{HTTP:accept-encoding} (gzip.*)

#make sure there's no trailing .gz on the url
RewriteCond %{REQUEST_FILENAME} !^.+\.gz$

#check to see if a .gz version of the file exists.
RewriteCond %{REQUEST_FILENAME}.gz -f

#All conditions met so add .gz to URL filename (invisibly)
RewriteRule ^(.+) $1.gz [L]

The first line turns the rewrite engine on, and the second tells it to keep any rules passed down from the parent configuration (if any). Next we check whether the browser accepts gzip-compressed files; if so, we make sure the user isn't already requesting a .gz file; finally we check that a .gz copy of the file actually exists. If all three conditions are met (browser support, no .gz on the URL, compressed file exists), the rewrite engine silently adds .gz to the filename. The .gz exists on the server side only, while the file is being sent; it won't show up in the user's location bar or anywhere else. One thing to note: the browser can only unpack the file if the server labels the response with a gzip Content-Encoding header, which Apache does when .gz is mapped to an encoding (an AddEncoding line in the server configuration). The rules above don't set that header themselves.
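If the chain of conditions is easier to follow as ordinary code, here's a rough Python model of the decision the rules make. It's only a sketch of the logic, not how Apache works internally, and the function name is made up for illustration.

```python
import os
import re


def rewritten_path(request_path: str, accept_encoding: str) -> str:
    """Return the file path that would be served under the three conditions."""
    # Condition 1: the browser's Accept-Encoding header mentions gzip.
    if not re.search(r"gzip", accept_encoding):
        return request_path
    # Condition 2: the request does not already end in .gz.
    if re.fullmatch(r".+\.gz", request_path):
        return request_path
    # Condition 3: a .gz copy of the requested file exists on disk.
    if not os.path.isfile(request_path + ".gz"):
        return request_path
    # All conditions met, so quietly serve the compressed copy.
    return request_path + ".gz"
```

Any failed condition leaves the path alone, which is why files without a .gz twin, and browsers that don't advertise gzip, keep getting the plain version.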

Step 3 -- Profit!

Once you're done, you can test whether everything is working by uploading a .gz file without an uncompressed equivalent (test.html.gz but no test.html). Now ask for test.html: even though it doesn't exist on your server, you should still see a good web page, because the server sent you the gzipped copy.
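You can also check from a script rather than a browser. The sketch below uses Python's standard urllib, which doesn't decompress responses on its own, so the raw bytes can be inspected directly. The function names are mine, and the URL in the comment is a placeholder for a file on your own site.

```python
import urllib.request


def fetch_with_gzip(url: str):
    """Request url while advertising gzip support; return (encoding header, raw body)."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Encoding"), response.read()


def looks_gzipped(content_encoding, body: bytes) -> bool:
    """True when the header says gzip and the body starts with the gzip magic bytes."""
    header_ok = content_encoding is not None and "gzip" in content_encoding.lower()
    return header_ok and body[:2] == b"\x1f\x8b"


# Example use (placeholder URL, swap in a file that has a .gz copy):
#   encoding, body = fetch_with_gzip("http://www.example.com/test.html")
#   print("served compressed:", looks_gzipped(encoding, body))
```

If the body is gzipped but the header is missing, the rewrite is working but the server isn't labelling the file, and browsers will show garbage instead of your page.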

There are a few caveats. The rewrite engine puts a load on your server. If you have tons of bandwidth but your server is trying to outdo Chernobyl, then this isn't a good solution for you. Also, you'll have to remember to re-zip and upload your gzipped files every time you make a change to the original uncompressed file. But if you can live with that, you'll find this mod is blazing fast for your users and a godsend to your bandwidth quotas.
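Forgetting to re-zip is the easy mistake here, since the server will happily keep sending the old .gz. One way to catch it, sketched in Python under the assumption that you keep a local copy of the site and that file modification times are trustworthy, is to list every original that is newer than its .gz twin. The function name is made up for illustration.

```python
from pathlib import Path


def stale_gzip_copies(root: Path) -> list:
    """Return originals under root that were modified after their .gz copy."""
    stale = []
    for gz_path in sorted(root.rglob("*.gz")):
        # toolbox.js.gz sits next to toolbox.js, so strip the trailing ".gz".
        original = gz_path.with_name(gz_path.name[:-3])
        if original.is_file() and original.stat().st_mtime > gz_path.stat().st_mtime:
            stale.append(original)
    return stale
```

Anything this returns needs to be compressed and uploaded again before your visitors see the change.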

This is definitely something you can file away under "cool tools".


Addendum

This addendum was filed after the article was published.

I'd like to answer a few questions about this article. The most serious was whether this just duplicates Apache's mod_gzip module, which compresses static pages in real time. It's a valid question, but most shared web hosts don't enable mod_gzip. You can check whether your site is using it at whatsmyip.org (if you've already implemented .htaccess compression, make sure you test with a file that does not have a .gz copy). Even if your site is using mod_gzip, you can take a load off your CPU AND create tighter compressed files by pre-zipping your content. The effect won't be as noticeable or dramatic, but you will still achieve some savings.

The second issue was the suggestion to use MultiViews. While it may be possible to use MultiViews to do the same thing as rewrite, it's my view that this approach will be neither more efficient nor as transparent to the existing system.