hosted services

RewriteMaps are extremely useful, especially if you have a large database of links that you're moving from an old site structure. RewriteMaps allow indexed (hashed) lookups of data through DBM files or external programs.

introduction

This page documents how to implement a DBM RewriteMap in the simplest form that still allows regular modifications (at least, this is the way I have found simplest).

The rewrite map has to be defined first in your main server configuration and cannot be defined within .htaccess files. I find it easiest to reference a DBM file that is in user space, but you might want to limit this yourself to an area that is writeable only by a super user.

RewriteMap redirects-map dbm:/var/www/sites/example.test/maps/redirects

The above is telling Apache's configuration that the rewrite map we're using is named redirects-map, in DBM format and is located in the path /var/www/sites/example.test/maps/redirects.
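
If lookups come back empty even though the key is in the file, one thing worth checking is that Apache and httxt2dbm agree on the DBM flavour. A minimal sketch, assuming the map was built as SDBM (for example with httxt2dbm's -f SDBM option); the choice of SDBM is only an illustration:

```apache
# Assumption: the file was built with "httxt2dbm -f SDBM ...".
# dbm=sdbm pins the format instead of relying on the build default.
RewriteMap redirects-map dbm=sdbm:/var/www/sites/example.test/maps/redirects
```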

The map is built from a text file with one key and value pair per line, separated by whitespace, like so:

main /index.php?article=main
search /index.php?article=search
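
While testing, the same text file can also be read directly with the txt: map type, which skips the DBM build step. This is only a sketch; the redirects.txt file name and its location next to the DBM file are assumptions:

```apache
# Assumption: redirects.txt holds the key/value lines shown above.
# txt: reads the plain-text file directly, with no DBM conversion.
RewriteMap redirects-map txt:/var/www/sites/example.test/maps/redirects.txt
```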

updating

To make frequent updates easy, I use a Makefile, which also lets me define which httxt2dbm binary to run from the command line:

HTTXT2DBM=/usr/sbin/httxt2dbm

all:
        $(HTTXT2DBM) -i links.txt -o links
        $(HTTXT2DBM) -i redirects.txt -o redirects

In this example two maps are built: links and redirects.

.htaccess

To use the map within .htaccess configuration it can be referenced like so:

RewriteCond     ${redirects-map:$1}     >""     [NC]
RewriteRule     ^(.*)   ${redirects-map:$1}     [R=301,L]

This tells Apache to perform the RewriteRule only if the map lookup for the requested URI returns a non-empty value. A key that is not in the map expands to an empty string, so the condition fails. The RewriteRule then redirects to the value stored for the matched key, with status 301 (permanent redirect), and the L flag makes it the last rule processed.

It's possible to redirect to the value of the key that matches the URI in a single line (ignoring the RewriteCond), but that's only good if there is a match for the key.
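
One way to make the single-line form safe is the map's default value syntax, where the text after "|" is used when the key is missing. A rough sketch, assuming every unmatched request that reaches this rule should land on the /update_your_bookmarks.html page:

```apache
# Skip the fallback page itself so the rule cannot redirect in a loop.
RewriteCond     %{REQUEST_URI}  !^/update_your_bookmarks\.html$
# The part after "|" is the default used when the key has no map entry.
RewriteRule     ^(.*)$  ${redirects-map:$1|/update_your_bookmarks.html} [R=301,L]
```

Note that this sends every request without a map entry to the fallback, so it only suits a site where nothing else should be served.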

It's also possible to use the same logic to redirect all matching URIs to a single location:

RewriteCond     ${redirects-map:$1}     >""     [NC]
RewriteRule     ^(.*)   /update_your_bookmarks.html     [R=301,L]

The condition above only requires that the key's value is not blank; when that holds, the RewriteRule redirects to the fixed location.

lowercase keys

Should you want to store only lowercase keys (which might be advisable for ease of maintenance), you can use the internal tolower function. Just add the line below to your main Apache configuration:

RewriteMap      lowercase       int:tolower

This is then accessible in the rewrite functions, such as in the example below:

RewriteCond     ${lowercase:$1}         ^(.+)$  [NC]
RewriteCond     ${redirects-map:%1}     >""     [NC]
RewriteRule     ^(.*)   ${redirects-map:%1}     [R=301,L]

The first line of the above puts the result of the lowercase map into %1, which is referenced in the second line. The %1 value is also used in the third and final line of this block.

trailing slash

Rather than inserting double the number of URIs to account for a trailing slash, you can compensate for it in the RewriteCond:

RewriteCond     ${lowercase:$1}         ^(.+)/$ [NC]
RewriteCond     ${redirects-map:%1}     >""
RewriteRule     ^(.*)   ${redirects-map:%1}     [R=301,L]

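In the lowercase condition we add a / after the capture group, so the pattern only matches URIs that end in a slash and %1 holds the URI without it. Requests without a trailing slash are not matched by this block and still need the rules from the previous section.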
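
If a single block should cover both forms, the slash can be made optional. A sketch, assuming the keys are stored without a trailing slash:

```apache
# "/?" makes the trailing slash optional; the lazy "+?" keeps it out of %1.
RewriteCond     ${lowercase:$1}         ^(.+?)/?$
RewriteCond     ${redirects-map:%1}     >""
RewriteRule     ^(.*)   ${redirects-map:%1}     [R=301,L]
```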

block lists

After receiving a lot of spam to a site I decided to start blocking those requests and storing them in a log.

This evolved into a RewriteMap:

RewriteMap  ipblock-map dbm:/var/www/sites/ip_block

The block text list should contain the following key value pair syntax:

127.0.0.1   block

RewriteCond ${ipblock-map:%{REMOTE_ADDR}}   =block
RewriteRule ^   -           [F,L]

This block rule just checks for a key with the word 'block' as the value and responds with a forbidden notice.

This is great for single IP addresses. The following snippet works for a /24 CIDR range (if you want more complex rules, take a look at mod_cidr):

RewriteCond %{REMOTE_ADDR}          ^(\d+)\.(\d+)\.(\d+)\.
RewriteCond ${ipblock-map:%1.%2.%3}     =block
RewriteRule ^   -           [F,L]

The entries in the block list need to be stored in the following format for /24 addresses:

127.0.0     block

host redirects

Another useful feature you can make use of is simple redirects based on the HTTP_HOST variable. The first VirtualHost configuration will be the 'catch all' container, so if you add a rewrite map such as this:

RewriteMap      lowercase       int:tolower
RewriteMap      host-map        dbm:/var/www/sites/maps/host-map

RewriteCond     ${lowercase:%{HTTP_HOST}}       ^(.+)$ [NC]
RewriteCond     ${host-map:%1}                  >""
RewriteRule     ^                               ${host-map:%1} [R=301,L]

All you need to do now is add host redirects to the host-map DBM file. If you're running a very large redirect server, you probably don't want many VirtualHost containers, as each one consumes some memory. If the redirect is instead handled by a dynamic script, that script has to be invoked for every request that goes through the redirect server. A RewriteMap avoids both costs.
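
The rules above send every request for a mapped host to one fixed URL. If the requested path should be carried over to the new host, something like the following might work. It is a sketch only; the old.example.test entry is a made-up example and the map values are assumed to have no trailing slash:

```apache
# Assumed entry in the host-map source text (key, whitespace, value):
#   old.example.test    https://www.example.test
RewriteCond     ${lowercase:%{HTTP_HOST}}       ^(.+)$
RewriteCond     ${host-map:%1}                  >""
# In VirtualHost context $1 begins with "/", so it appends cleanly.
RewriteRule     ^(.*)$                          ${host-map:%1}$1 [R=301,L]
```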

rewritemap programs

We can take the above example and improve it somewhat by using an external program to take the lookup key and respond with a value. This is really handy, since external programs can perform computations that would be near impossible with mod_rewrite alone. Here is a rather simple example:

RewriteMap  beanmap         prg:/home/perl/rewritemaps/beanmap.pl
RewriteCond ${beanmap:%{REMOTE_ADDR}} ^(\S+)\s(\S+)\s(\S+)\s(\S+)
RewriteRule ^   -       [E=V1:%1,E=V2:%2,E=V3:%3,E=V4:%4]

In the above example, we expect the program to return four values separated by whitespace; otherwise the RewriteCond does not match, the values are discarded and the RewriteRule has no effect. The remote IP address of the connection is sent to the program as the input (key).

The program which responds to this input can be as simple as the following:

#!/usr/bin/perl

use strict;
use warnings;

# it is essential to turn off buffering, otherwise the output from this program will not be flushed
$|=1;

while( my $line = <STDIN> ) {
    chomp $line;
    # four groups of hex digits separated by "." (IPv4) or ":" (IPv6)
    if( $line =~ /^([0-9a-f]+)[.:]([0-9a-f]+)[.:]([0-9a-f]+)[.:]([0-9a-f]+)/i ) {
        print "Fi\tFy\tFo\tThumb\n";
        next;
    }
    print $line, "\n";
}

In the above, if the input looks like an IPv4 or IPv6 address, the program writes four tab-separated values. These are matched by the RewriteCond and the environment variables are set. Any other input is echoed back unchanged, which the RewriteCond will not match.

The program is a bit silly and doesn't do very much at all of value other than serve as an example to show how the output from this program can become environment variables through RewriteCond matching and RewriteRule setting.
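
To check that the variables are really being set, one option is to write them to a log. A minimal sketch, assuming mod_log_config is loaded; the beanlog format name and the log file path are made up for illustration:

```apache
# Assumption: "beanlog" and logs/beanmap.log are hypothetical names.
# %{NAME}e prints the environment variable NAME, or "-" when it is unset.
LogFormat "%h %{V1}e %{V2}e %{V3}e %{V4}e \"%r\"" beanlog
CustomLog logs/beanmap.log beanlog
```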