Apache mod_rewrite example: Redirecting and rewriting URLs
In this post, we feature a comprehensive Apache mod_rewrite example. This article is part of our Academy Course titled Apache HTTP Server Tutorial.
In this course, we provide a compilation of Apache HTTP Server tutorials that will help you get started with this web server. We cover a wide range of topics, from installing the server and performing a basic configuration, to configuring Virtual Hosts and SSL support. With our straightforward tutorials, you will be able to get your own projects up and running in minimum time.
In a previous article we mentioned that one of Apache’s most distinguishing features is its extensibility via modules, which we defined as “independent, separate pieces of software that provide specific functionality”.
Some modules are built into Apache as part of the core functionality and are present when the web server is installed, as we explained in “How to install the Apache web server”.
Others, such as mod_bw (which we covered in “Apache name-based Virtual Host Configuration Example”), can be installed using your distribution’s package management system.
In this article we will explain how to use mod_rewrite (a well-known and widely used module) to dynamically map incoming HTTP requests targeting arbitrary URLs to specific documents in the internal structure of your web server or virtual host, or to another external URL.
In other words, this module allows you to rewrite (as the name suggests) a URL such as http://www.example.com/scg/results.php?country=Argentina&province=Cordoba into a more user- and SEO-friendly one such as http://www.example.com/scg/Argentina/Cordoba, which can help your pages rank higher in search engines and, ultimately, attract more visitors.
To accomplish this purpose, mod_rewrite relies heavily on PCRE (Perl Compatible Regular Expressions) vocabulary, which we will introduce next. Please be advised, however, that this topic can be a little burdensome until you start reaping the benefits of it.
1. Introducing regular expressions (regexes)
In simple words, a regular expression is a text string that represents a search pattern. The following list, adapted from the Apache documentation on PCRE, shows the most common characters used in regular expressions, their meaning, and an example:
- . (a dot) matches any single character. Thus, b.t will match bat, bet, bit, bot, and but.
- + (the plus sign) repeats the previous match one or more times. For example, o+ matches o, oo, ooo, etc.
- * (star) repeats the previous match zero or more times. Because zero repetitions are allowed, a* also matches an empty string. In other words, the matches returned by a+ are a subset of the matches of a*.
- ? (question mark) makes the previous character optional, so colou?r will match both color and colour.
- ^ (caret) matches the beginning of the string. For example ^a matches a string that begins with a.
- $ (dollar sign) matches the end of the string, so a$ matches a string that ends with a.
In addition, you can group characters or define character classes:
- A set of parentheses ( ) is used to group several characters into a single unit. You can then apply the above regex characters to the group as if it were a single character. Thus, (ab)+ matches ab, abab, ababab, etc. Keep in mind that the + here applies to the group of characters surrounded by parentheses.
- A character class [ ] matches exactly one of the characters in the set inside square brackets. For example, c[uoa]t matches cut, cot, and cat, and [[:alnum:]] matches any single letter of the alphabet or numerical digit. Character classes are well explained in the PCRE regex syntax for PHP.
- Conversely, a negative character class [^ ] matches any single character not specified. Thus, c[^/]t matches cat or c2t but not c/t.
Finally, in mod_rewrite you can prefix a pattern with the exclamation mark (!) to negate it, so that the rule or condition applies when the pattern does not match.
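The constructs listed above can be tried outside Apache before they go into a rule. The following is a minimal sketch that uses Python's re module as a stand-in for PCRE; the two engines agree on these basic constructs, but they are not identical (Python's re does not support POSIX classes such as [[:alnum:]], for instance). The helper names full_match, found and negated are made up for this illustration.

```python
import re

def full_match(pattern: str, text: str) -> bool:
    """True when the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

def found(pattern: str, text: str) -> bool:
    """True when the pattern matches somewhere in the string (unanchored)."""
    return re.search(pattern, text) is not None

def negated(pattern: str, text: str) -> bool:
    """Emulates mod_rewrite's ! prefix: true when the pattern does NOT match."""
    return not found(pattern, text)

# (pattern, candidate string, whether the whole string should match)
EXAMPLES = [
    (r"b.t", "bit", True),
    (r"b.t", "boot", False),      # . stands for exactly one character
    (r"o+", "o", True),           # + needs at least one occurrence
    (r"o+", "", False),
    (r"o*", "", True),            # * also accepts zero occurrences
    (r"colou?r", "color", True),
    (r"colou?r", "colour", True),
    (r"(ab)+", "abab", True),     # + applies to the whole group
    (r"(ab)+", "aba", False),
    (r"c[uoa]t", "cot", True),    # one character out of the set
    (r"c[uoa]t", "cit", False),
    (r"c[^/]t", "c2t", True),     # any one character except /
    (r"c[^/]t", "c/t", False),
]

for pattern, text, expected in EXAMPLES:
    assert full_match(pattern, text) is expected, (pattern, text)

# Anchors only matter when the search is unanchored
assert found(r"^a", "apple") and not found(r"^a", "banana")
assert found(r"a$", "banana") and not found(r"a$", "apple")
```

Running the script silently means every example behaves as described in the list; a failing assertion would name the offending pattern and string.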
Now we are ready to discuss the RewriteRule and RewriteCond directives, which are essential to the operation of mod_rewrite.
2. Introducing RewriteRule and RewriteCond
The RewriteRule directive, as its name implies, rewrites a URL that matches a given pattern into one of three things:
- A file-system path to a local resource
- A URL path to a local web resource
- An absolute URL
Its basic syntax is:
RewriteRule [Pattern] [Substitution] [Flags, optional]
You can place this directive in the main configuration file, inside a virtual host definition, or inside a Directory block. You can use multiple RewriteRule directives in the same context, each one with its own [Pattern], [Substitution] and (optionally) [Flags].
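The RewriteCond directive introduces a condition that must be met before the RewriteRule that immediately follows it is “activated”. Several RewriteCond directives can precede a single rule, in which case all of them must be met by default.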
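To make the interplay between the two directives more concrete, here is a deliberately simplified model written in Python. It is not how Apache is implemented: it ignores flags and backreferences and only illustrates that, in Apache, a condition gates the single rule it is attached to (the RewriteRule that follows it), while other rules are evaluated on their own. The Rule class, the rewrite function, and the two sample rules are hypothetical and exist only for this sketch.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Rule:
    pattern: str        # regex tested against the URL path
    substitution: str   # replaces the whole path when the rule fires
    # (variable name, regex) pairs, like RewriteCond "%{NAME}" "regex"
    conditions: List[Tuple[str, str]] = field(default_factory=list)

def rewrite(path: str, env: Dict[str, str], rules: List[Rule]) -> str:
    """Run the path through each rule in order and return the result."""
    for rule in rules:
        if re.search(rule.pattern, path) is None:
            continue  # pattern does not match, so this rule is skipped
        # Every condition attached to this rule must hold (logical AND)
        if all(re.search(rx, env.get(var, "")) for var, rx in rule.conditions):
            path = rule.substitution
    return path

# Hypothetical rules: only the first one carries a condition
RULES = [
    Rule(r"^/private", "/secret.html",
         conditions=[("REMOTE_ADDR", r"^10\.0\.0\.5$")]),
    Rule(r"^/old\.html$", "/new.html"),
]

if __name__ == "__main__":
    for addr in ("10.0.0.5", "10.0.0.9"):
        env = {"REMOTE_ADDR": addr}
        print(addr, rewrite("/private", env, RULES), rewrite("/old.html", env, RULES))
```

In this model the /private rule fires only for the client at 10.0.0.5, whereas /old.html is rewritten for every client because no condition is attached to that rule.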
To better understand this topic, let’s illustrate with a few examples.
3. Examples
Before we proceed with some examples, there are some considerations that we must take into account. In order to actually use mod_rewrite, we need to add the directives listed below in the context where we will use this module. Additionally, we need to make sure the module is loaded. We do this by running (in Ubuntu)
sudo a2enmod rewrite
In CentOS, it is enabled by default, which you can confirm with
httpd -M | grep rewrite
You should get the following output:
rewrite_module (shared)
If not, check that the module file (mod_rewrite.so) is present in /etc/httpd/modules and that Apache reads the configuration files that load the modules. Look for the following line in the main configuration file:
Include conf.modules.d/*.conf
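(By the way, /etc/httpd/modules is actually a symbolic link to the directory that holds the module files, typically /usr/lib64/httpd/modules on 64-bit CentOS, while the LoadModule directives live in the files under /etc/httpd/conf.modules.d)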
If it is not there, add it before proceeding.
Suppose we want to enable mod_rewrite in the context of www.example1.com. We need to modify its configuration file and add RewriteEngine on (to enable the rewriting engine). Additionally, we need to set the Options directive to allow FollowSymLinks:
<VirtualHost *:80>
    DocumentRoot "/var/www/example1.com/public_html/"
    ServerName www.example1.com
    ServerAlias example1.com
    ErrorLog /var/www/example1.com/error.log
    LogLevel info
    CustomLog /var/www/example1.com/access.log combined
    BandwidthModule On
    ForceBandWidthModule On
    Bandwidth all 20480
    MinBandwidth all -1
    MaxConnection all 5
    <Directory "/var/www/example1.com/public_html/media">
        LargeFileLimit * 1024 10240
    </Directory>
    RewriteEngine on
    Options FollowSymLinks
</VirtualHost>
With that in place, also add the following lines inside the virtual host definition given above:
RewriteCond "%{REMOTE_ADDR}" "^192\.168\.0\.104"
RewriteRule "^/vhosterrors" "/var/www/example1.com/error.log"
RewriteRule "^/default\.aspx$" "index.html" [R]
RewriteRule "^/go/to/example2$" "http://example2.com" [R]
RewriteRule "^/writer/(.*)/view$" "/var/www/example1.com/$1"
(Make sure your configuration is similar to that shown in Fig. 1)
Let’s see what is happening here:
RewriteCond "%{REMOTE_ADDR}" "^192\.168\.0\.104"
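indicates that the rule immediately following it (and only that rule) is applied only if the remote address is 192.168.0.104. A RewriteCond does not carry over to later RewriteRule directives.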
RewriteRule "^/vhosterrors" "/var/www/example1.com/error.log"
If you browse to http://example1.com/vhosterrors, a non-existent directory (note that the pattern is a regular expression that matches any path beginning with /vhosterrors, as indicated by the caret sign), you will view the error log for the virtual host (/var/www/example1.com/error.log), as seen in Fig. 2.
RewriteRule "^/default\.aspx$" "index.html"
If you go to http://example1.com/default.aspx, you will be taken to the index.html of the virtual host. Refer to Fig. 3 for details.
RewriteRule "^/go/to/example2$" "http://example2.com" [R]
Browse to http://example1.com/go/to/example2 and you will be redirected to http://example2.com. By the way, the R inside square brackets stands for Redirect. This rule, as opposed to the previous one (which does a URL rewrite in the full sense of the word), performs a redirect to an external site. You may want to keep in mind that example1.com and example2.com are two different, separate sites even though they are hosted on the same machine.
Finally,
RewriteRule "^/writer/(.*)/view$" "/var/www/example1.com/$1"
says that if you go to http://example1.com/writer/gabriel/view, the request will be mapped to /var/www/example1.com/gabriel. Here the $1 is a placeholder for whatever matches the group (.*). As explained earlier, the dot stands for any character, and the star sign represents zero or more occurrences of that character. In other words, .* is the regular expression for “match everything”. Since this file does not exist, in Fig. 4 we can see a portion of the error log that says so:
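The way $1 picks up the captured text can be reproduced with a short Python sketch. It assumes Python's re module as a stand-in for PCRE and translates mod_rewrite's $N notation into Python's own backreference syntax; the helper names to_python_template and substitute are invented for this example and are not part of Apache.

```python
import re
from typing import Optional

def to_python_template(substitution: str) -> str:
    """Translate $N backreferences into the form Python's re module expects."""
    return re.sub(r"\$(\d)", r"\\g<\1>", substitution)

def substitute(pattern: str, substitution: str, path: str) -> Optional[str]:
    """Return the rewritten path, or None when the pattern does not match."""
    match = re.search(pattern, path)
    if match is None:
        return None
    # expand() fills each backreference with the text captured by its group
    return match.expand(to_python_template(substitution))

PATTERN = r"^/writer/(.*)/view$"
SUBSTITUTION = "/var/www/example1.com/$1"

if __name__ == "__main__":
    for path in ("/writer/gabriel/view", "/writer/a/b/view", "/writer/gabriel/edit"):
        print(path, "->", substitute(PATTERN, SUBSTITUTION, path))
```

Note that (.*) is greedy and also matches slashes, so /writer/a/b/view captures a/b. A stricter group such as ([^/]+) would accept only a single path segment.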
If you request http://example1.com/vhosterrors from a machine other than 192.168.0.104, you will see that the rule does not take effect, because the RewriteCond placed before it only lets it fire when the remote address is 192.168.0.104, as you can see in Fig. 5:
With a slight change in the RewriteCond directive, you could allow access from the 192.168.0.0/24 network. Replace
RewriteCond "%{REMOTE_ADDR}" "^192\.168\.0\.104"
with
RewriteCond "%{REMOTE_ADDR}" "^192\.168\.0"
Then test again (see Fig. 6):
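The difference between the two conditions can also be checked outside Apache. The sketch below, again using Python's re module as an approximation of PCRE, tests a few sample client addresses against both patterns; the addresses and the helper name allowed are chosen only for illustration.

```python
import re

EXACT_HOST = r"^192\.168\.0\.104"   # the original condition
WHOLE_NETWORK = r"^192\.168\.0"     # the relaxed condition

def allowed(pattern: str, remote_addr: str) -> bool:
    """True when the client address satisfies the condition pattern."""
    return re.search(pattern, remote_addr) is not None

if __name__ == "__main__":
    for addr in ("192.168.0.104", "192.168.0.105", "192.168.1.105", "10.0.0.1"):
        print(addr, allowed(EXACT_HOST, addr), allowed(WHOLE_NETWORK, addr))
```

Neither pattern is anchored at the end, so both accept any string that merely starts with the given prefix. That is harmless for well-formed IPv4 addresses, but a stricter variant could end with \. (for the network) or $ (for the single host).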
Please note that you should consider creating custom 404 error pages to display when the visitor attempts to access a resource that does not exist.
For example, copy the following code into /var/www/example1.com/public_html/error.html:
<!DOCTYPE html>
<html>
<head>
    <title>Not found</title>
</head>
<body>
    <h1>Not found :(</h1>
    <h3>The page you requested has not been found.</h3>
    <p>Perhaps you would like to go to our <a href="index.html">home page</a>?</p>
</body>
</html>
Now add the following line inside the virtual host definition:
ErrorDocument 404 /error.html
Then browse to a non-existent resource (http://example1.com/hello, for example) and you will see your personalized error page. See Fig. 7 for details:
As you can see, a custom error page looks much better than Apache’s default. In addition, you can use the error page to provide instructions (such as the suggestion to go to the home page in Fig. 7).
4. Summary
In this article we have explained how to use mod_rewrite, definitely one of Apache’s most versatile modules, to perform URL rewriting and redirecting. As it is a vast topic, we cannot adequately cover it in a single article, so you are highly encouraged to check out the documentation linked in this tutorial, along with the Redirecting and Remapping guide. This last resource provides lots of other examples of what you can do with mod_rewrite.