By spencer, Wednesday, November 14, 2007 · 5:59 pm

I just finished moving the last of my content into WordPress pages. I got tired of my old theme and decided to give up on integrating WordPress into my existing site. I was spending a significant amount of time just merging in the changes from WordPress releases without clobbering all of my customizations. It had officially become a pain in my ass. School and Projects are now available in the Other… section at the top. I might move them up to the top level, but for now they live there.

Aside from moving the content and fixing it to adhere to my theme and WordPress’ requirements, I had to come up with some magic mod_rewrite rules to redirect the old pages to the new pages, including query strings. I hadn’t used QUERY_STRING in redirects before, so it was a learning experience that quickly reminded me of the love/hate relationship I have with mod_rewrite. Here are the new components:

RewriteCond %{REQUEST_URI}  ^/wordpress.*$ [NC]
RewriteRule wordpress/(.*)  http://beyondabstraction.net/$1 [R=301,L]

# legacy url rewriting
RewriteCond %{REQUEST_URI}  ^/school\.php$ [NC]
RewriteRule ^school\.php$   http://%{SERVER_NAME}/stuff/school/ [R=301,L]
RewriteCond %{REQUEST_URI}  ^/school-stuff/?$ [NC]
RewriteCond %{QUERY_STRING} class=([^&;]*)? [NC]
RewriteRule .*              http://%{SERVER_NAME}/stuff/school/%1/? [R=301,L]

RewriteCond %{REQUEST_URI}  ^/sonyfs660/?$ [NC]
RewriteRule sonyfs660/?     http://beyondabstraction.net/stuff/vaio-fs660/ [R=301,L]

RewriteCond %{REQUEST_URI}  ^/projects\.php.*$ [NC]
RewriteRule projects\.php(.*)? http://beyondabstraction.net/stuff/projects/$1 [R=301,L]

Since I’ve never posted these in their entirety before, I’m attaching my top-level .htaccess and my 403 page [1] [2]. The 403 page uses flushes to display content to humans while staying in a loop to waste spammers’ time.

I had to add a .htaccess rule to disable output compression on the 403 directory. I have Apache set up to use mod_deflate to compress content, so mod_deflate buffers output in order to compress it before sending it to the client. Additionally, PHP buffers output, and so do client-side browsers. PHP is an easy fix via ob_flush() followed by flush(). Client-side buffering is a little trickier. Most browsers start displaying content as soon as it can be rendered. Safari, and perhaps other KHTML/WebKit-based browsers, buffer until the first 1K is seen or the connection is closed. There is some magic in the 403 file to skirt around this problem.
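
The compression rule itself is not shown in this post. One common way to switch off mod_deflate for a single directory is to set the no-gzip environment variable, which mod_deflate checks before compressing a response. The following is a minimal sketch, assuming mod_env is available and that the file lives in the directory serving the 403 page (the location is an assumption):

```apache
# .htaccess inside the directory that serves the 403 page (location assumed)
<IfModule mod_env.c>
    # mod_deflate skips compression for requests where no-gzip is set
    SetEnv no-gzip 1
</IfModule>
```

With compression out of the way, only the PHP and browser buffers are left to deal with.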

Updated: Forgot to describe the rest of the .htaccess file. Those of you not interested in mod_rewrite should not read on as the boredom may kill you. Those of you that are interested:

  1. Find a new hobby and don’t read this drivel
  2. Perhaps even move and start a new life
  3. At the very least, get a cup of coffee and a porno to avoid the same boredom-related death as the non-interested

I’m going to pretend I didn’t post that clip from the .htaccess file above and just start over from the beginning. Please click any of the links below to see a demonstration of the rewrites.

# spammers and scanners
Order Allow,Deny
Deny from 69.13.156.208
Deny from 208.57.118.100
Deny from 216.229.143.241
Deny from 12.178.36.25
Deny from 81.95.146.227
Deny from 68.83.37.146
Deny from 80.58.205.34
Allow from all

The next section starts the rewrite engine and ensures that the site is only reachable via certain host names [3]. It also provides a shorthand for my security blog that can easily be written on a matchbook ;)

<IfModule mod_rewrite.c>
RewriteEngine On

# host rewrite rules
RewriteCond %{HTTP_HOST}   !^beyondabstraction\.net [NC]
RewriteCond %{HTTP_HOST}   !^security\.beyondabstraction\.net [NC]
RewriteCond %{HTTP_HOST}   !^$
# in .htaccess context the leading slash is stripped before the pattern is matched
RewriteRule ^(.*)$         http://beyondabstraction.net/$1 [R=301,L]

# security sub-blog
RewriteRule ^security/?$ /category/security?title=off [R=301,L]

The next chunk is the meat of what I call “legacy support”. These are efforts a maintainer makes to avoid dead links as their site changes. They were all required when I moved to a pure WordPress site. Come to think of it, that is how this whole tangent got started. Oh well, now that you’re screwed and I’m stuck finishing…

The wordpress/ rewrite shows exactly what I mean when I say WordPress was integrated into my site; the rewritten rule shows that WordPress is now the top level of my site. For the sake of explanation, let’s say WordPress used to be at the second level. I had other content at the second level, such as my school work. This meant all of that content had to be moved from the second level of my old site to the second level of my new site, inside WordPress. These rewrites are the equivalent of setting a forwarding address for snail mail or email. The hardest one is the rewrite of a query string, that is, the arguments of a GET request. The old school pages took the class number as a GET argument to determine which page to display. I used the QUERY_STRING variable and converted it into a sub-directory. /school-stuff/?class=cs331 will now redirect to /stuff/school/cs331/.

RewriteCond %{REQUEST_URI}  ^/wordpress.*$ [NC]
RewriteRule wordpress/(.*)  http://beyondabstraction.net/$1 [R=301,L]

# legacy url rewriting
RewriteCond %{REQUEST_URI}  ^/school\.php$ [NC]
RewriteRule ^school\.php$   http://%{SERVER_NAME}/stuff/school/ [R=301,L]
RewriteCond %{REQUEST_URI}  ^/school-stuff/?$ [NC]
RewriteCond %{QUERY_STRING} class=([^&;]*)? [NC]
RewriteRule .*              http://%{SERVER_NAME}/stuff/school/%1/? [R=301,L]

RewriteCond %{REQUEST_URI}  ^/sonyfs660/?$ [NC]
RewriteRule sonyfs660/?     http://beyondabstraction.net/stuff/vaio-fs660/ [R=301,L]

RewriteCond %{REQUEST_URI}  ^/projects\.php.*$ [NC]
RewriteRule projects\.php(.*)? http://beyondabstraction.net/stuff/projects/$1 [R=301,L]
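
The query-string rule above is the least obvious of the bunch, so here is the same idea again with each line annotated. This is a sketch rather than a drop-in replacement: the (?:^|&) prefix and the + quantifier are assumptions added here so that a parameter such as subclass= or an empty class= does not trigger the redirect.

```apache
# Only act on the legacy path
RewriteCond %{REQUEST_URI}  ^/school-stuff/?$ [NC]
# Capture the value of class= into %1; (?:^|&) requires a parameter boundary
RewriteCond %{QUERY_STRING} (?:^|&)class=([^&;]+) [NC]
# %1 refers to the capture from the last matched RewriteCond;
# the trailing ? on the target drops the old query string
RewriteRule .* http://%{SERVER_NAME}/stuff/school/%1/? [R=301,L]
```

Without the trailing ?, mod_rewrite would append the original query string to the new URL, so the class argument would show up twice.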

The final set of rules addresses bandwidth leechers, eBay image leechers and spam referrers. The final four lines are used to pass requests for non-existent files and directories on to WordPress as path information. WordPress then uses this information to generate dynamic pages.

# bandwidth leechers
RewriteCond %{HTTP_REFERER} ^https?://.*\.ebay\.com/.*$
RewriteRule .*\.(gif|GIF|jpg|JPG|png|PNG).*$ - [G,L]

# spammers
RewriteCond %{HTTP_REFERER} poker [NC,OR]
RewriteCond %{HTTP_REFERER} medicine [NC,OR]
RewriteCond %{HTTP_REFERER} pills [NC,OR]
RewriteCond %{HTTP_REFERER} diet [NC,OR]
RewriteCond %{HTTP_REFERER} viagra [NC,OR]
RewriteCond %{HTTP_REFERER} mortgage [NC,OR]
RewriteCond %{HTTP_REFERER} casino [NC,OR]
RewriteCond %{HTTP_REFERER} insurance [NC,OR]
RewriteCond %{HTTP_REFERER} loan [NC,OR]
RewriteCond %{HTTP_REFERER} xanax [NC,OR]
RewriteCond %{HTTP_REFERER} meridia [NC,OR]
RewriteCond %{HTTP_REFERER} incest [NC,OR]
RewriteCond %{HTTP_REFERER} lesbian [NC,OR]
RewriteCond %{HTTP_REFERER} viagra [NC,OR]
RewriteCond %{HTTP_REFERER} adult [NC,OR]
RewriteCond %{HTTP_REFERER} hentai [NC,OR]
RewriteCond %{HTTP_REFERER} tramadol [NC,OR]
RewriteCond %{HTTP_REFERER} phentermine [NC,OR]
RewriteCond %{HTTP_REFERER} gambling [NC,OR]
RewriteCond %{HTTP_REFERER} texas- [NC,OR]
RewriteCond %{HTTP_REFERER} holdem [NC,OR]
RewriteCond %{HTTP_REFERER} pharmacy [NC,OR]
RewriteCond %{HTTP_REFERER} ultram [NC,OR]
RewriteCond %{HTTP_REFERER} levitra [NC,OR]
RewriteCond %{HTTP_REFERER} phentermine [NC,OR]
RewriteCond %{HTTP_REFERER} cialis [NC,OR]
RewriteCond %{HTTP_REFERER} payday [NC,OR]
RewriteCond %{HTTP_REFERER} bargains [NC,OR]

# WARNING: any inserted lines need NC,OR… only the last line should be NC
RewriteCond %{HTTP_REFERER} tramadol [NC]
RewriteRule .* - [F,L]

# pass everything non-file/dir as a parameter to index
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
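
The long chain of referrer conditions above could also be collapsed into a single alternation, which removes the NC,OR bookkeeping the warning comment describes. A minimal sketch, with the keyword list abbreviated for illustration:

```apache
# One condition replaces the chain of [NC,OR] lines (keyword list shortened here)
RewriteCond %{HTTP_REFERER} (poker|casino|viagra|tramadol|payday) [NC]
# F sends 403 Forbidden and implies L
RewriteRule .* - [F]
```

The trade-off is readability: one keyword per line is easier to diff and to comment out than a single long pattern.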

[3] I use this to ensure my site isn’t reachable via www.beyondabstraction.net, and really as a catchall for any other host name. As my subdomains come and go, this also plays a role in supporting “legacy” portions of the site.
