Fun with Regular Expressions (RegEx)

I’ve been having fun with Regular Expressions today!

I was after a solution for one of my clients who posed this challenge…

When someone clicks on my ads, I want to send them to my index page (index.php), and I want to pass in information about the campaign as well as the keyword.

At the moment, these are parameters, so I do this with (for example) ““, but it looks ugly in their browser URL bar.

I’d really rather the whole thing was a url, but I don’t want to set up copies of my index page for every campaign and keyword combination.

Is there a way to do this?

I suggested that we could send them through to a url that looked more like this:

Which I accomplished with some regular expressions in the .htaccess file.

While I was doing this I found a really useful (and free!) online Regular Expression testing tool – saved me a bunch of time with uploading .htaccess files and testing them.

Here’s how I did it…

.htaccess changes

First, we need to ensure that their web-server’s rewrite engine is switched on:

RewriteEngine on

Then we need to replace anything that looks like this:

with this:

Regular Expressions (RegEx) to the rescue:

RewriteRule ^([^/\.]+)/([^/\.]+)/?$ /index.php?campaign=$1&keyword=$2 [L]

A RewriteRule takes the general form of:

RewriteRule <search-string> <replace-string> [FLAGS]

Let’s take our solution apart one piece at a time…

[^/\.]+ – matches one or more (+) characters from the set ([...]) of everything EXCEPT (^) a forward-slash (/) or a full-stop (.)

(…) – round brackets identify a “group” and save whatever it matched for use later on. The first group is called $1, the second is called $2 and so on.

^ – anchors the match at the start of the line (for a RewriteRule, the start of the requested URL path)

/?$ – matches an OPTIONAL (?) forward-slash (/) at the end of the line ($)

Putting it all back together, this will match (campaign)/(keyword-phrase) as well as (campaign)/(keyword-phrase)/. The campaign and keyword-phrase are saved as groups $1 and $2 respectively, ready to be used in the replacement url: /index.php?campaign=$1&keyword=$2
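If you'd rather check the pattern locally than upload .htaccess files, here is a minimal sketch in Python. It assumes Python's re module behaves like Apache's regex engine for this simple pattern (it should, but treat that as an assumption), and the rewrite() helper is my own illustration, not part of Apache. The test paths have no leading slash because Apache strips it before matching rules in a .htaccess file.

```python
import re

# The same pattern as the RewriteRule above.
PATTERN = re.compile(r"^([^/\.]+)/([^/\.]+)/?$")

def rewrite(path):
    """Return the rewritten URL, or None if the rule would not fire."""
    match = PATTERN.match(path)
    if match is None:
        return None
    campaign, keyword = match.groups()  # these play the role of $1 and $2
    return f"/index.php?campaign={campaign}&keyword={keyword}"

samples = [
    "EDUC1/some-keyword",         # campaign/keyword
    "EDUC1/some-keyword/",        # trailing slash is optional
    "index.php",                  # contains a full-stop, so no match
    "EDUC1/images/some-pic.jpeg"  # three segments, so no match
]
for path in samples:
    print(path, "->", rewrite(path))
```

The first two samples come back as /index.php?campaign=EDUC1&keyword=some-keyword; the last two return None, which means the rule leaves real files such as index.php alone.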

Now, the only problem is that relative references to images and other files from within index.php will point to the wrong place. The user’s browser thinks it is looking at /EDUC1/some-keyword, so a reference to images/some-pic.jpeg is resolved as /EDUC1/images/some-pic.jpeg. The file really lives at /images/some-pic.jpeg.
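To see the browser's side of this, here is a small sketch using Python's urllib.parse.urljoin, which resolves relative references the way a browser does. The example.com hostname is just a placeholder I've assumed for illustration.

```python
from urllib.parse import urljoin

# The address the visitor sees in the URL bar (hostname is hypothetical).
page_url = "http://example.com/EDUC1/some-keyword"

# A relative reference is resolved against the "directory" of the page URL.
relative = urljoin(page_url, "images/some-pic.jpeg")
print(relative)  # http://example.com/EDUC1/images/some-pic.jpeg

# A root-relative reference (leading slash) ignores the page's path.
root_relative = urljoin(page_url, "/images/some-pic.jpeg")
print(root_relative)  # http://example.com/images/some-pic.jpeg
```

So one alternative is to write every reference in index.php with a leading slash. That means touching the page itself, though, which is why I went for a rewrite rule instead.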

We fix this with another bit of RegEx magic:

RewriteRule .+/images/(.*) /images/$1 [L]
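Here is a rough Python sketch of what that rule does, again assuming Python's re module is a fair stand-in for Apache's regex engine on a pattern this simple. The rewrite_image() helper is hypothetical. Note that the pattern is not anchored, so I use search() rather than match(), and the whole path is replaced when the rule fires.

```python
import re

# The same pattern as the images RewriteRule above.
IMAGES_RULE = re.compile(r".+/images/(.*)")

def rewrite_image(path):
    """Return the rewritten path, or None if the rule would not fire."""
    match = IMAGES_RULE.search(path)
    if match is None:
        return None
    return "/images/" + match.group(1)  # group(1) plays the role of $1

print(rewrite_image("EDUC1/images/some-pic.jpeg"))               # /images/some-pic.jpeg
print(rewrite_image("EDUC1/some-keyword/images/some-pic.jpeg"))  # /images/some-pic.jpeg
print(rewrite_image("images/some-pic.jpeg"))                     # None
```

The last case matters: .+ needs at least one character in front of /images/, so a request for the real file is left untouched and the rule does not keep rewriting its own output.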

So all together the changes look like this:

RewriteEngine on
RewriteRule .+/images/(.*) /images/$1 [L]
RewriteRule ^([^/\.]+)/([^/\.]+)/?$ /index.php?campaign=$1&keyword=$2 [L]

Check out this webmasterworld page for more information on using mod_rewrite, regular expressions and .htaccess.
