Philadelphia Reflections is built on PHP, MySQL and DHTML, and is hosted by Internet Planners. Developing the website has been an opportunity to learn new technology, to try out different techniques for getting noticed by the search engines, and to experience the trials and tribulations of dealing with malicious hackers and spammers, who range from the annoying to the abusive. This collection of articles documents some of our experiences, and we hope it will benefit people searching the web for solutions to the problems we've encountered.
George and Computers(1)
I got him into computers around 1960. He soon far surpassed me.
It used to be that no spiders or search engines could index a dynamic URL, namely one that contained a "?" followed by parameters to be used by PHP, ASP or other server-side scripting languages to drive a website using a database.
Nowadays, Google and Yahoo seem to do a perfectly fine job of indexing dynamic URLs. However, Google still warns that it may encounter problems with them, and the SEO literature remains full of warnings that other spiders and search engines may be blind to everything to the right of the "?".
Furthermore, a *.php extension is an invitation to bad guys to try to break in and wreak many sorts of havoc: this site was hacked by Nigerians a few years ago using PHP tricks and they managed to use it as an email factory until our ISP shut us down. I came on the scene at that point and implemented every safeguard I could find, but the concern still lingers.
Finally, dynamic URLs are not user-friendly: human beings generally do not know what to make of long strings of obscure parameters.
Apache has a feature called "mod_rewrite" that lets you specify, via regex, how incoming URLs should be transformed. Apache's instructions on this subject are here: URL Rewriting Guide. I have used that facility at Philadelphia Reflections to present static URLs to the public while still using parameters internally to drive the website from the database.
Two excellent articles on this got me started:
Here's what I did:
Step 1: htaccess
I added these lines to the htaccess file:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^(blog|topic)/([0-9]+)\.html?$ reflections.php?type=$1&key=$2
The result is that
is transformed into
The latter is what is passed in to me in the reflections.php routine, which tells me to pull up blog #906 from the database.
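The effect of the rule can be checked outside Apache. What follows is a minimal sketch in Python, not the site's own code: it applies the same regular expression with Python's re module. The function name rewrite and the sample paths are assumptions made for illustration (906 is the blog number mentioned above). In an htaccess file, mod_rewrite matches the pattern against the path with the leading slash removed, which the sketch imitates.

```python
import re

# Same pattern as the RewriteRule. Apache uses PCRE, but nothing in this
# pattern behaves differently under Python's re module.
RULE = re.compile(r"^(blog|topic)/([0-9]+)\.html?$")

def rewrite(path):
    """Return the internal URL for a static path, or None if the rule does not match."""
    match = RULE.match(path)
    if match is None:
        return None
    # $1 and $2 in the RewriteRule correspond to groups 1 and 2 here.
    return "reflections.php?type={}&key={}".format(match.group(1), match.group(2))

print(rewrite("blog/906.html"))   # reflections.php?type=blog&key=906
print(rewrite("topic/12.htm"))    # reflections.php?type=topic&key=12 (the "l" is optional)
print(rewrite("blog/abc.html"))   # None: the key must be all digits
```

Because the key is restricted to digits and the type to two fixed words, anything else falls through the rule untouched.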
Both of these URLs are equivalent to the old, ugly dynamic URL
which still works, in case there are any legacy bookmarks or links out there, but going forward the new, simple, static URL is the face we will present to the world.
Step 2: SMOP
After the htaccess regex was debugged, all that was left was a simple matter of programming. In fact, I had to completely rewrite the driver script, reflections.php, and the XML creation script which creates the RSS, sitemap, etc. files; plus a lot more besides. It was a lot of work but the breakthrough was in figuring out the htaccess trick; everything else was just work.
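Part of that work is the inverse of the rewrite: any script that emits links, such as the RSS or sitemap generator, has to produce the static form of a URL rather than the dynamic one. A minimal sketch of that mapping follows, written in Python rather than the site's PHP. BASE_URL and static_url are hypothetical names chosen for illustration and are not taken from the site's scripts.

```python
# Hypothetical base address; substitute the real site root.
BASE_URL = "https://www.example.com"

# The two page types accepted by the RewriteRule.
ALLOWED_TYPES = ("blog", "topic")

def static_url(kind, key):
    """Build the public static URL that the RewriteRule maps back to reflections.php."""
    if kind not in ALLOWED_TYPES:
        raise ValueError("unknown page type: {!r}".format(kind))
    # int() rejects non-numeric keys, matching the [0-9]+ in the rule.
    return "{}/{}/{}.html".format(BASE_URL, kind, int(key))

print(static_url("blog", 906))  # https://www.example.com/blog/906.html
```

Keeping this in one function means the public URL format is defined in a single place if it ever changes again.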
In July 2008, after Volumes were implemented, another RewriteRule was added:
RewriteRule ^volumes?/([0-9]+)\.html?$ volume.php?table_key=$1
This converts a static URL of the form volumes/N.html into volume.php?table_key=N.
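The two "?" quantifiers in the volumes pattern make both the plural "s" and the final "l" optional, so four spellings reach the same script. The Python sketch below illustrates this by applying the same regex with the re module. The function name rewrite_volume and the volume number 7 are assumptions made for the example.

```python
import re

# Same pattern as the volumes RewriteRule.
VOLUME_RULE = re.compile(r"^volumes?/([0-9]+)\.html?$")

def rewrite_volume(path):
    """Return the internal URL for a static volume path, or None if there is no match."""
    match = VOLUME_RULE.match(path)
    if match is None:
        return None
    return "volume.php?table_key=" + match.group(1)

# All four spellings map to volume.php?table_key=7.
for path in ("volume/7.html", "volumes/7.html", "volume/7.htm", "volumes/7.htm"):
    print(path, "->", rewrite_volume(path))
```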