Posts filed under 'Apache'
(This article is Part 2 in a series about Apache’s loadable modules. The previous topic was mod_ext_filter).
As a former Microsoft-ite who made the leap to OSS, one of the things that took a little getting used to with Apache was that filenames are case-sensitive. While that’s long since been hammered into my soul, a great many other people browsing the web aren’t aware that some webservers are case-sensitive while others are not. Sometimes I’ll see “http://www.example.com/signup” in my access logs when what the person really wanted was “http://www.example.com/Signup”. I could easily catch these simple typos with a mod_rewrite rule, but what if the user entered “http://www.example.com/Sign-up” or “http://www.example.com/Signpu”? Trying to plan for every way a URL could be misconstrued would make for a lot of ugly mod_rewrite rules. A better choice is to load Apache’s mod_speling module.
Mod_speling is very easy to implement. What’s more, unlike mod_ext_filter, it can be used from within an .htaccess file, so shared hosting users can rejoice! The module is normally loaded by default (though spell-checking is disabled), but you might want to double-check Apache’s httpd.conf:
LoadModule speling_module modules/mod_speling.so
Enabling mod_speling is as easy as:
<IfModule mod_speling.c>
CheckSpelling on
</IfModule>
A couple of points to remember:
- mod_speling tolerates at most one misspelling in the requested name: a single inserted, omitted, or wrong character, or two transposed characters. For true fuzzy matching you’ll have to use a more advanced solution.
- If Apache finds several likely matching URLs, the user will be presented with a list to pick from.
- Renaming files without updating the links that point to them, and relying on mod_speling to automagically fix things, isn’t a great idea. The module does a directory scan each time it intercepts a 404. If you have a lot of misspelled files and a lot of pageviews, you will experience a noticeable performance degradation.
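One way to find the links worth fixing is to look for the corrections in the access log. The sketch below is only that: it assumes the Common or Combined Log Format (request path in field 7, status code in field 9) and assumes that mod_speling answers a corrected request with a 301 redirect. Other rules on your server may produce 301s too, so treat the output as a list of candidates. The function name is made up for this example.

```shell
# List request paths that were answered with a 301, most frequent first.
# Assumption: Common/Combined Log Format, so $7 is the path and $9 the status.
redirected_paths() {
    awk '$9 == 301 { print $7 }' "$1" | sort | uniq -c | sort -rn
}
```

Call it with the path to your own access log, for example redirected_paths access_log, and fix the links behind the paths at the top of the list first.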
For more information, check out the official Apache docs.
September 22nd, 2006
I recently discussed mod_ext_filter, a loadable module used with Apache. It allows you to easily process your content through an external program before it’s passed on to the user’s web browser. I mentioned using mod_ext_filter as a way to automatically add watermarks to images on a website. After receiving a bit of interest in how to do this, I decided to write a step-by-step HOWTO.
Before you start
First determine whether dynamically adding watermarks using mod_ext_filter is really the best option for your project. If you have a fixed set of images that never change, it would probably make more sense to add the watermark using a photo editing program before you upload them to the site. This would eliminate the overhead of having to run the filter every time someone requests the image. If you’re interested in marking images that are being uploaded to the site by users via a web interface, using an image manipulation library within your scripting environment would be more efficient. (For PHP and most other languages, the obvious choice is GD.) Having said that, some cases where using mod_ext_filter would be appropriate are:
- Users directly upload images via FTP, but the images are accessed via HTTP (Apache)
- A webcam or IP security camera regularly uploads images to the server and people view the images via a website.
- Non-technical people upload images via a CMS that need to be automatically watermarked, and you have no way/desire to modify the CMS.
With that in mind, the only two technical requirements for this project are root access on your Apache server and “composite”. Unlike mod_rewrite, mod_ext_filter cannot be run from within an .htaccess file. Composite is a small utility that is part of the ImageMagick graphics manipulation suite. Most distros ship with this installed, but if yours doesn’t and it isn’t available via your favorite package manager, you can always download and compile the source.
Manipulating the images
Composite simply overlays one image on top of another, allowing the transparency to be adjusted. You could get fancy and create a dynamically generated watermark from scratch by using the other binaries in the ImageMagick suite. However, for this project I’m just going to use something I put together in Photoshop in 30 seconds.
The ultra-fancy watermark is simply a GIF on a transparent background:

And our obligatory kitten picture (It’s Maynard, of course! All together now….awwww):

Having both these images on the webserver, we could now easily use composite like so:
/usr/bin/composite -gravity SouthEast -watermark 20.0 watermark.gif maynard.jpg watermarked.jpg
Which gives us the watermark in the lower-right corner at 20% opacity:

Looks good, but what we actually want to do is add the watermark to the image in real time, when it’s sent to the web browser. For that, we need to use some mod_ext_filter voodoo.
Using composite with mod_ext_filter
Now that I’ve figured out how I want the watermark to look, I need to create a new directory to store the images that will be automatically watermarked. Since I normally store images in an “/images” directory off the root, we’ll just keep things simple and create a new subdirectory off of “/images” named “/watermarked”. Into this directory I’ll upload all the files that I want the watermark automatically applied to.
Keep in mind that the image files themselves on the server aren’t affected, since Apache applies the watermark on the fly when the user’s browser requests the image. Depending on what you’re trying to accomplish, this can be good or bad. It increases the server load and, depending on how much traffic you get, may add a noticeable delay. However, this is also a very flexible solution that doesn’t modify any of the original images. A good compromise would be to implement some type of caching system and/or write a custom module in C using the Apache API, but that’s a bit beyond the scope of this project.
Next I’ll modify Apache’s config file, httpd.conf, to define the filter:
<IfModule mod_ext_filter.c>
ExtFilterDefine watermark mode=output intype=image/jpeg cmd="/usr/bin/composite -gravity SouthEast -watermark 20.0 /var/www/html/mysite/images/watermarked/watermark.gif - -"
</IfModule>
(Make sure that the “ExtFilterDefine” directive is all on one line in the config file, or Apache will bomb when you try to restart it.) You’ll notice that this is pretty much the same command that I used earlier when I was testing the watermark. The only differences are that I’m using an absolute filename for my watermark image and that my input and output filenames have been replaced with two dashes (“- -”). The dashes are the key to this whole trick. They tell composite to use STDIN for the input file and STDOUT for the output. The “intype” is also important. It specifies which MIME type Apache will use with this filter. Trying to stuff a “text/html” document through composite isn’t likely to work all that well.
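Before wiring the command into Apache, it can help to check from a shell that composite really does read from STDIN and write to STDOUT with your own images. A minimal sketch, assuming composite is on your PATH; the function name and the file names in the usage comment are placeholders, not anything Apache requires:

```shell
# Run the same composite command the filter uses, but fed from the shell.
# $1 is the watermark image; the JPEG arrives on STDIN and leaves on STDOUT.
watermark_stream() {
    composite -gravity SouthEast -watermark 20.0 "$1" - -
}

# Example use (placeholder file names):
#   watermark_stream watermark.gif < maynard.jpg > preview.jpg
```

If preview.jpg opens and shows the watermark, the same command should behave the same way when Apache pipes the image through it.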
Next, we need to tell Apache the location of the files that we want to be processed by the filter. The most common way to do this is with a Location container:
<Location /images/watermarked>
SetOutputFilter watermark
</Location>
This simply directs Apache to run any files located in “/images/watermarked” through the “watermark” filter that was defined earlier. The filter will then check that the MIME type matches, process the requested JPEG through composite, and output the result to the browser. That’s about all there is to it… restart Apache and you should be good to go!
If you found this tip useful, please add a comment below. It would be interesting to know how people are using this idea. Thanks!
P.S. Make sure that your installation of Apache is configured to load mod_ext_filter. Otherwise the filter will silently fail and probably drive you nuts for an hour or two. The line “LoadModule ext_filter_module modules/mod_ext_filter.so” needs to be present somewhere near the top of httpd.conf.
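A quick way to check for that line without opening an editor is a small grep wrapper. This is a rough sketch: the function name is invented for this example, and it only looks at the one file you give it, so it will miss modules that are compiled in statically or loaded from an included file.

```shell
# Succeed if the config file has an uncommented LoadModule line for the module.
# $1 = module identifier (e.g. ext_filter_module), $2 = path to the config file.
has_loadmodule() {
    grep -Eq "^[[:space:]]*LoadModule[[:space:]]+$1[[:space:]]" "$2"
}

# Example use (adjust the path to wherever your httpd.conf lives):
#   has_loadmodule ext_filter_module /path/to/httpd.conf && echo "loaded"
```

On Apache 2.2 and later, running apachectl -M should also list the modules the server actually loaded, which covers the cases this grep misses.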
September 19th, 2006
Almost every experienced web developer who works with Apache is at least vaguely familiar with mod_rewrite. It’s most frequently used to reformat ugly URLs containing a lot of query values into nice, search-engine-friendly URLs. Implementing mod_rewrite is generally one of the first steps in an SEO overhaul on a website with a lot of dynamic content; hence its popularity. However, Apache is the Swiss Army knife of HTTP servers and was one of the first to offer a whole gamut of external, loadable modules (e.g., “mod_”…). These little tidbits of compiled-C goodness act as plugins that extend the functionality of Apache. In Part One of a finite-but-unknown-length series about these useful tools, I take a look at mod_ext_filter and some of its practical uses.
Introduction
At the most basic level, mod_ext_filter is simply a way to run content through an external program before it is sent to the client. Think of it as a filter sitting between Apache and the user’s web browser. It intercepts the output that Apache would normally send back to the user, does something (generally modifying or logging the content), and then passes the result along to the browser. In fact, you could, in theory, sniff web browsers’ language settings and dynamically translate all of your webpages into foreign languages using mod_ext_filter. However, practically speaking, it would be horribly inefficient and probably too slow to be usable. Where the module really shines is making small changes to content. And since it’s an Apache module, it will work with anything Apache serves: HTML documents, PHP/Perl/Ruby/Python scripts, images, PDFs, etc.
Example
Assume that you’ve been told by the VP of Marketing that they want to “increase brand impact” by adding a trademark notice (a “TM”) after each mention of your company’s flagship product, SuperWidget. You could simply do a search-and-replace on all the files on the web server. However, this wouldn’t prevent an intern who didn’t get the memo from uploading a new document sans-TM. What we could do is use mod_ext_filter and sed to perform a search-and-replace every time a document is requested from the web server.
First, inside Apache’s config file (“httpd.conf”) we define the filter:
<IfModule mod_ext_filter.c>
ExtFilterDefine add-tm mode=output intype=text/html cmd="/bin/sed 's|SuperWidgets|SuperWidgets<sup>TM</sup>|g'"
</IfModule>
“add-tm” is the name of our filter, which will be referenced a bit later.
“mode=output” tells Apache that this filter should be applied to content that’s going out to the web browser. “output” is the default; Apache 2.1 and later also accept “mode=input” for filtering request bodies.
The “intype” parameter specifies which MIME type this filter will be applied to. In this example, the filter is only run on “text/html” documents. This is important because it keeps the “TM” from being added inside other content such as PDF files, where it would most likely corrupt the document when a user attempted to download or view it.
“cmd” is simply the command that is run when the filter is activated. The intercepted content is piped to this command via STDIN, and the command writes its result to STDOUT. Since almost all Linux commands support standard streams, this makes things quite handy.
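Because the filter is just a command reading STDIN and writing STDOUT, you can try it from a shell before touching Apache. Here is a small sketch with a made-up line of HTML standing in for a real page; note that it uses “|” as the sed delimiter, since the “/” inside the closing sup tag would otherwise end the replacement early:

```shell
# Sample HTML standing in for what Apache would hand to the filter.
sample='<p>SuperWidgets are great.</p>'

# "|" is the sed delimiter so the "/" in </sup> needs no escaping.
filtered=$(printf '%s\n' "$sample" | sed 's|SuperWidgets|SuperWidgets<sup>TM</sup>|g')

printf '%s\n' "$filtered"
```

If the printed line has the superscript in the right place, the expression is safe to paste into the “cmd” parameter.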
Next we add:
<Location />
SetOutputFilter add-tm
</Location>
While the opening “Location” tag looks like a self-closing tag (it ends with a “/>”), the slash actually specifies that this filter be applied at the root of this website. You could also use <Location /products/superwidget> to only filter files in that directory. Alternatively, you can use “Directory” (for absolute paths) or “Files” containers to specify where the filter is run. See the Apache docs for more info.
Using mod_ext_filter and sed to automatically add a trademark notice.

Other Applications
That’s just one, simple example of the power of mod_ext_filter. Odds are there are more robust methods of implementing this solution for the marketing dept. Other, more practical, uses include:
- Dynamically add a copyright footer, even on plain ol’ static HTML pages. (Though mod_include would be better for this unless you’re doing something tricky that requires the extra logic of an external program. I’ll be covering this in an upcoming blog entry.)
- A profanity filter that can easily be used with any blog or messageboard software.
- Dynamically adding a watermark to images using “composite” (Step-by-step HOWTO).
- And my favorite… Automatically add a wrapper around external links for tracking purposes. This could be done by piping the output through a search-and-replace regex using Perl, sed, or awk. This is great with a CMS where end users self-publish content and may add their own links.
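The link-wrapping idea in the last bullet could be prototyped with a one-line sed filter. Treat this as a rough sketch: the “/out?url=” tracking endpoint and the function name are invented for this example, the target URL is not URL-encoded, and the pattern wraps every absolute http or https link, including absolute links back to your own site.

```shell
# Rewrite href="http(s)://..." so the link goes through a tracking URL first.
# "/out?url=" is a hypothetical redirect script, not something Apache provides.
wrap_external_links() {
    sed -E 's|href="(https?://[^"]+)"|href="/out?url=\1"|g'
}

# Example use:
#   printf '%s\n' '<a href="http://example.org/page">x</a>' | wrap_external_links
```

Relative links such as href="/about.html" are left alone because they don’t start with a scheme.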
Conclusion
One thing to keep in mind is that mod_ext_filter’s flexibility comes at a price: it’s not a speed demon because the filter is shelling out every time it’s called. For most websites, though, the performance hit is negligible if you keep the filter command simple. Even if the module is too pokey for your site, it’s a great way to prototype an idea before you sit down and crank out a custom module written in C using the Apache API.
For more information on this great little tool, check out the official docs.
One additional caveat: while mod_rewrite can be used in either a .htaccess file (per directory) or within httpd.conf (per server), mod_ext_filter only works from within httpd.conf. This means that those of you out there who use a shared webhost are most likely out of luck. If this really burns you, keep in mind that you can lease a dedicated server for less than $70/mo nowadays. It’s something to consider if you have the systems admin skills (or are willing to learn) and a few paying web design/development clients that you could move over.
September 17th, 2006