PHILADELPHIA REFLECTIONS
Musings of a Philadelphia Physician who has served the community for six decades

Related Topics

Website Development
The website technology supporting Philadelphia Reflections is PHP, MySQL and DHTML. The web hosting service is Internet Planners. Developing this website has been an opportunity to learn new technology, to try out different techniques for getting noticed by the search engines, and to experience the trials and tribulations of dealing with malicious hackers and spammers, who range from the annoying to the abusive. This collection of articles documents some of our experiences, and we hope it will benefit people surfing the web for solutions to problems we've encountered.

RSS, Atom, Syndication, etc.

The world is full of XML and XML-like file formats for syndication purposes.

Here's the list of files we generate automatically for submission to search engines and such.

(For right now, things are a bit abbreviated)

http://www.philadelphia-reflections.com/reflectionsRSS.xml (RSS Syndication file)
http://www.philadelphia-reflections.com/reflectionsATOM.xml (Atom Syndication file)
http://www.philadelphia-reflections.com/sitemap.xml (Google sitemap)
http://www.philadelphia-reflections.com/siteinfo.xml (A9/Amazon siteinfo.xml)
http://www.philadelphia-reflections.com/reflectionsIDIF1.xml (Yahoo IDIF file 1)
http://www.philadelphia-reflections.com/reflectionsIDIF2.xml (Yahoo IDIF file 2)
http://www.philadelphia-reflections.com/reflectionsIDIF3.xml (Yahoo IDIF file 3)
http://www.philadelphia-reflections.com/reflectionsIDIF4.xml (Yahoo IDIF file 4)
http://www.philadelphia-reflections.com/reflectionsIDIF5.xml (Yahoo IDIF file 5)
http://www.philadelphia-reflections.com/IDIFpointer.txt (Yahoo IDIF pointer file)
http://www.philadelphia-reflections.com/urllist.txt (Yahoo urllist.txt)

Validate Short RSS | The Short RSS File itself
Validate Short rss (lower case) | The Short RSS File itself (lower case)
Validate Short ATOM | The Short ATOM File itself

Weblogs.com extended successfully pinged
Weblogs.com successfully pinged
blo.gs successfully pinged
Technorati successfully pinged
Ping-O-Matic successfully pinged
Syndic8 successfully pinged (Feed ID 477463)

Ping Blogroller manually
Ping MyYahoo manually



The RSS and Atom validator (http://feedvalidator.org/) appears to have a length restriction. I don't know exactly what it is, but it fails if your file is "too long". Since many syndication readers seem to run the validator before they'll accept a feed, I have resorted to creating a short file, which is what I point to in my meta tags.
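One way to produce such a short file is to keep only the newest few items when writing the feed. The following is a minimal sketch, not the code this site actually runs: the function name short_rss(), the item keys ('title', 'link', 'mod'), the limit of 15 and the channel description text are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: build a short RSS 2.0 feed from the newest $limit items.
// Each item is assumed to look like:
//   array('title' => ..., 'link' => ..., 'mod' => Unix timestamp)
function short_rss(array $items, $limit = 15)
{
    // Sort newest first, then keep only the first $limit entries.
    usort($items, function ($a, $b) {
        return $b['mod'] - $a['mod'];
    });
    $items = array_slice($items, 0, $limit);

    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<rss version="2.0"><channel>' . "\n";
    $xml .= '<title>Philadelphia Reflections</title>' . "\n";
    $xml .= '<link>http://www.philadelphia-reflections.com/</link>' . "\n";
    // Placeholder description, not taken from the real feed.
    $xml .= '<description>Short feed: most recent articles only</description>' . "\n";

    foreach ($items as $item) {
        $xml .= '<item>'
              . '<title>' . htmlspecialchars($item['title']) . '</title>'
              . '<link>' . htmlspecialchars($item['link']) . '</link>'
              . '<pubDate>' . date('r', $item['mod']) . '</pubDate>'
              . "</item>\n";
    }

    return $xml . "</channel></rss>\n";
}
```

The limit is a guess; since the validator's real cutoff is unknown, it may need to be lowered until the short file passes.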



Here's how I provide change frequency and priority for our Google sitemap (in PHP; $mod is the variable containing the date last modified, as a Unix timestamp):

$GOOGLEpriority = "0.0"; $GOOGLEfreq = "yearly";	// default

if ($mod > mktime(0,0,0) - 86400*210)	{$GOOGLEpriority = "0.1"; $GOOGLEfreq = "monthly";}	// past 210 days
if ($mod > mktime(0,0,0) - 86400*180)	{$GOOGLEpriority = "0.2"; $GOOGLEfreq = "monthly";}	// past 180 days
if ($mod > mktime(0,0,0) - 86400*150)	{$GOOGLEpriority = "0.3"; $GOOGLEfreq = "monthly";}	// past 150 days
if ($mod > mktime(0,0,0) - 86400*120)	{$GOOGLEpriority = "0.4"; $GOOGLEfreq = "monthly";}	// past 120 days
if ($mod > mktime(0,0,0) - 86400*90)	{$GOOGLEpriority = "0.5"; $GOOGLEfreq = "monthly";}	// past 90 days
if ($mod > mktime(0,0,0) - 86400*60)	{$GOOGLEpriority = "0.6"; $GOOGLEfreq = "monthly";}	// past 60 days
if ($mod > mktime(0,0,0) - 86400*30)	{$GOOGLEpriority = "0.7"; $GOOGLEfreq = "monthly";}	// past 30 days
if ($mod > mktime(0,0,0) - 86400*7)	{$GOOGLEpriority = "0.8"; $GOOGLEfreq = "weekly";}	// past 7 days
if ($mod > mktime(0,0,0) - 86400)	{$GOOGLEpriority = "0.9"; $GOOGLEfreq = "daily";}	// yesterday
if (date("Y-m-d", $mod) == date("Y-m-d"))	{$GOOGLEpriority = "1.0"; $GOOGLEfreq = "hourly";}	// today
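
The ladder above can also be written as a lookup table, which makes the cutoffs easier to adjust. This is only a sketch under stated assumptions: sitemap_hints() is a hypothetical name, the optional $now parameter exists purely to make the function testable, and the example page URL is made up. One small difference from the ladder: any timestamp at or after midnight today counts as "today".

```php
<?php
// Hypothetical helper: map a last-modified Unix timestamp to
// array(priority, changefreq) using the same cutoffs as the ladder.
function sitemap_hints($mod, $now = null)
{
    $now      = ($now === null) ? time() : $now;
    $midnight = mktime(0, 0, 0, date('n', $now), date('j', $now), date('Y', $now));

    if ($mod >= $midnight) {
        return array('1.0', 'hourly');          // modified today
    }

    // days back from midnight => array(priority, changefreq), most recent first
    $ladder = array(
        1   => array('0.9', 'daily'),
        7   => array('0.8', 'weekly'),
        30  => array('0.7', 'monthly'),
        60  => array('0.6', 'monthly'),
        90  => array('0.5', 'monthly'),
        120 => array('0.4', 'monthly'),
        150 => array('0.3', 'monthly'),
        180 => array('0.2', 'monthly'),
        210 => array('0.1', 'monthly'),
    );
    foreach ($ladder as $days => $hint) {
        if ($mod > $midnight - 86400 * $days) {
            return $hint;                       // first (most recent) match wins
        }
    }
    return array('0.0', 'yearly');              // default: older than 210 days
}

// Usage: build one <url> entry for the sitemap (example.html is made up).
$mod = time() - 86400 * 10;
list($priority, $freq) = sitemap_hints($mod);
$entry = sprintf(
    "<url><loc>%s</loc><lastmod>%s</lastmod><changefreq>%s</changefreq><priority>%s</priority></url>\n",
    htmlspecialchars('http://www.philadelphia-reflections.com/example.html'),
    date('Y-m-d', $mod),
    $freq,
    $priority
);
```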



IDIF is a stupid format: it includes the entire blog_contents, so the files are huge. In the process of setting this up, I ran into a maximum file size of 1.4 megs or so (about the size of an old floppy disk), so I had to create more than one file.

That explains the stupid concept of a "pointer file": instead of just giving Yahoo the IDIF file itself, you give it a pointer file containing the URLs of the multitude of IDIF files. Really stupid.
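
For anyone facing the same split, here is a rough sketch of the bookkeeping. It deliberately says nothing about the IDIF format itself: the entries are assumed to be strings that have already been serialized, the function names are hypothetical, and the "one URL per line" pointer layout is a guess rather than anything taken from Yahoo's documentation.

```php
<?php
// Hypothetical helper: group already-serialized entries into chunks whose
// combined size stays at or under $maxBytes. A single oversized entry still
// gets a chunk of its own.
function chunk_by_size(array $entries, $maxBytes)
{
    $chunks  = array();
    $current = array();
    $size    = 0;

    foreach ($entries as $entry) {
        $len = strlen($entry);
        // Start a new chunk if adding this entry would exceed the limit.
        if ($current && $size + $len > $maxBytes) {
            $chunks[] = $current;
            $current  = array();
            $size     = 0;
        }
        $current[] = $entry;
        $size     += $len;
    }
    if ($current) {
        $chunks[] = $current;
    }
    return $chunks;
}

// Hypothetical helper: one URL per chunk, numbered from 1, following the
// reflectionsIDIF<n>.xml naming used on this site.
function pointer_lines($chunkCount, $base = 'http://www.philadelphia-reflections.com/reflectionsIDIF')
{
    $lines = array();
    for ($i = 1; $i <= $chunkCount; $i++) {
        $lines[] = $base . $i . '.xml';
    }
    return implode("\n", $lines) . "\n";
}
```

The size check counts only the entries, so any header or footer each file needs should be subtracted from $maxBytes first.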

News flash: after finding the Journal Of Ovid on the web, I learned about length restrictions for the input fields (listed below). This information was not on the Yahoo web site describing their file format. Applying the limits considerably reduced the file sizes, but I retained the structure of multiple files because who knows what I'll learn next?

IDIF title must be a maximum of 80 characters
IDIF description must be a maximum of 180 characters
IDIF body must be a maximum of 1000 characters
I'm only guessing about keywords; the code below caps them at 79 characters plus a trailing space.

Thanks to the Journal Of Ovid on the web for this secret information.

Working from the inside out: trim, collapse runs of whitespace into a single space (thanks to the PHP manual for this), then shorten to the maximum length.

$IDIFtitle		= substr( preg_replace ('/\s\s+/', ' ', trim($title) ), 0, 80 );
$IDIFdescription	= substr( preg_replace ('/\s\s+/', ' ', trim($description) ), 0, 180 );
$IDIFkeywords		= substr( preg_replace ('/\s\s+/', ' ', trim($keywords) ), 0, 79 ) . " ";
$IDIFblog_contents	= substr( preg_replace ('/\s\s+/', ' ', trim($blog_contents) ), 0, 1000 );
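
The same three steps can be wrapped in a small helper so each field is a single call. This is a sketch, not the site's code: idif_field() is a hypothetical name, and whether the limits count characters or bytes is unknown. The helper uses mb_substr() when the mbstring extension is loaded, so a multibyte UTF-8 character is not cut in half, and falls back to substr() otherwise.

```php
<?php
// Hypothetical helper: trim, collapse whitespace runs, then truncate to $max.
function idif_field($text, $max)
{
    // Same pattern as above: two or more whitespace characters become one space.
    $text = preg_replace('/\s\s+/', ' ', trim($text));

    return function_exists('mb_substr')
        ? mb_substr($text, 0, $max, 'UTF-8')   // counts characters
        : substr($text, 0, $max);              // counts bytes
}

$IDIFtitle = idif_field("  Philadelphia \n\n  Reflections  ", 80);
```

Note that the pattern '/\s\s+/' leaves a lone tab or newline untouched because it needs at least two whitespace characters to match; '/\s+/' would normalize those as well.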

Yahoo is said to support a simple text file list of URLs, "urllist.txt". Documentation, of course, is scarce.
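
Given how little is documented, the sketch below simply assumes the file is plain text with one absolute URL per line. The function name urllist_text() is hypothetical, and skipping blanks and duplicates is a precaution of mine, not a known requirement.

```php
<?php
// Hypothetical helper: render a URL list as text, one URL per line,
// skipping blank entries and duplicates.
function urllist_text(array $urls)
{
    $clean = array();
    foreach ($urls as $url) {
        $url = trim($url);
        if ($url !== '' && !in_array($url, $clean, true)) {
            $clean[] = $url;
        }
    }
    return implode("\n", $clean) . "\n";
}

// e.g. file_put_contents('urllist.txt', urllist_text($urls));
```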


(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

