PHILADELPHIA REFLECTIONS
The musings of a Philadelphia Physician who has served the community for nearly six decades


Computers and Websites

Much of the early development of the electronic computer took place in Philadelphia. We lost the lead, but it might return.

Computer Adjectives

ENIAC museum

We are indebted to Paul W. Schaffer, the curator of the ENIAC museum, for the novel concept that much of the complexity of modern computers can be reduced to a few adjectives. Before we get to that, let's explain how "computing" was done before the University of Pennsylvania revolutionized it.

We used calculating machines, which are sort of overgrown calculators. As big as baby grand pianos, bearing no resemblance at all to those things which sit on top of desks, noisy as all get-out. A typical "calculating shop" used to contain eight or ten machines, each with a specialized function. Key-punch machines, usually several of them, to put holes in cards to be fed into the machines. A couple of sorting machines, to count the holes in the cards and shuffle them into pockets for specified holes. A collator, which was capable of more complicated sorting and sequencing. And the calculator itself, which was able to count various combinations of holes, and even print out the calculations on very large rolls of tape with perforated holes along the edges of the paper.

Five or six trained operators would move the piles of cards around, feeding them into the appropriate machines in a prescribed order. Sitting off in a corner was the super-operator, whose job it was to design the sequences of manipulation, and string wires around a wiring board at the back of the machines. His were the brains of the calculating system, and the wires were strung around in accordance with his design. My recollection is that IBM refused to sell these machines, and a typical cluster rented for $1100 a month in 1955.

The Pennsylvania Hospital was considered very advanced for having this arrangement as its billing system, but its primitive quality can be seen in the system architecture. The whole system revolved around the concept of producing a bill for every patient in the hospital, every day. If the patient went home, he was given the latest bill. If he remained another day, the old bill was discarded and a new updated one made. It was simple, it was clever, and please don't tell Jefferson.
(On the other hand, something appears very wrong about the fact that fifty years later, the same hospital with hundreds of computers, today cannot produce a hospital bill within a month of patient discharge.)

John Mauchly

Well, back to adjectives. John Mauchly the mathematician came to the fundamental recognition that just about everything in mathematics and calculating could be done by "iteration", and re-iteration. Don't be afraid of the words. They only mean you take a small piece of arithmetic, and perform it over and over, millions of times. You really don't need to redesign new machines for each new process, since anything you might want to do could be done by reducing it to the same sort of iteration. So there's the first adjective: Mauchly's iterative design concept amounted to a "general purpose" computer. Many different patterns, perhaps, but all performed on the same machine, just as many different pieces of music are performed on the same piano.
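
For readers who like to see such things spelled out, here is a minimal sketch of "iteration" in PHP. It is only an illustration of the idea, not anything ENIAC actually ran, and the function name multiplyByIteration is made up for this example: multiplication is reduced to one small piece of arithmetic (addition) performed over and over.

```php
<?php
// Multiply using nothing but repeated addition.
// $times is assumed to be zero or a positive whole number.
function multiplyByIteration(int $a, int $times): int
{
    $total = 0;
    for ($i = 0; $i < $times; $i++) {
        $total += $a;   // the same small step, performed over and over
    }
    return $total;
}

echo multiplyByIteration(7, 6), "\n";   // prints 42
```

The same loop, run a few million times with a different small step inside it, is the "general purpose" part of the story.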

John Presper Eckert

His graduate student, John Presper Eckert, eliminated the moving parts. Instead of metal hammers and prongs moving around, Eckert moved electrons. This step vastly increased the speed of the processing, and even decreased the effective maintenance. The early computers required a man to go around with a wheel-barrow, constantly replacing vacuum tubes as they burned out. But by moving electrons instead of mechanical parts, the iterative speed was so great that overall maintenance, per million calculations, was less. Eckert gave us the "electronic" computer. Together with Mauchly, the two ideas blended into the electronic, general purpose, computer.

John von Neumann

So then along came John von Neumann, observing this thing at work. Millions of punched cards were fed into the machine, but the holes in the cards represented data; the instructions were still wired in by physically connecting one contact to another, which had to be changed when the instructions changed. Von Neumann immediately saw how to get rid of half of this non-electronic effort. His contribution was to punch the instructions into program cards and feed them into the machine when the program instructions changed. So now, we had "stored instruction sets". As we still have today, the trio created the idea of a stored-instruction, general purpose, electronic computer, and actually made a working model of it. That's what promoted it ahead of the Mark I and other mechanical computers that had been developed in Europe. Vastly increased speed, vastly decreased costs -- and lots of big bucks for the manufacturer.

So, off to court, to sue for patent protection. Thousands of patents have been granted for various small innovations in the system, but who was entitled to claim ownership of the basic idea? Who invented the general purpose, electronic, stored-instruction calculator? Some puzzled judge finally worked his way out of that jig-saw puzzle by declaring that no one owned the right to have an overall patent. His reasoning was that since von Neumann had rushed to publish his work rather than rushing to the patent office first, it had become the property of the public and no longer belonged to the inventor. Compared with the contributions of a great many present computer billionaires, it really seems as though Mauchly, Eckert and von Neumann were conservatively entitled to a trillion dollars apiece. But life is not fair, and the law is an ass. Or is that so?

In later lawsuits, of which there were a great many, it came out that Mauchly and Eckert were employees of the University of Pennsylvania. They did what they were told to do and were paid for doing it. Maybe the University is entitled to trillions, thereby allowing them to pay their humanities professors better. But then, one final idea. The University accepted government money to do the job. Maybe all us citizens are entitled to trillions, since we collectively commissioned and paid for this work. Where do we sue?

RSS

WHAT IS RSS?

RSS is a collection of several things.

WHAT IS A RSS FEED?

A feed is a stream of information (in an agreed format), broadcast on the Internet.

WHAT IS A RSS READER?

A reader is a program on the user's machine, that picks out pre-selected feeds, and displays them for the user to browse.

WHAT PROTOCOL IS BEST?

Obviously, the feeds and the readers must speak the same language. After a period of development, there are only two main protocols, and there isn't much advantage between them. The arguments are mostly commercial, like the arguments between IE and Netscape.

WHAT GOOD IS RSS?

Privacy. Although developed for other purposes, the main function is to combat SPAM. The consumer can choose what he wants to get, and can exclude other things.

CAN THE FEEDER PICK AND CHOOSE AMONG CONSUMERS?

Yes, but this is much harder. It probably will involve some sort of encryption system. But it opens up the ability to charge a fee for the information, and so it is probably the main goal of the Killer App.
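
To make the feed-and-reader idea above a little more concrete, here is a rough sketch of the reader's side in PHP. It assumes the feed is in RSS 2.0 format and that PHP's SimpleXML extension is available; the sample feed, its example.com addresses, and the function name readFeed are all invented for illustration. A real reader would fetch the XML from the feed's URL instead of using an inline sample.

```php
<?php
// A made-up two-item RSS 2.0 feed, inlined so the sketch runs without a network.
$sampleFeed = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Example Club News</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Spring luncheon scheduled</title>
      <link>http://example.com/luncheon</link>
    </item>
    <item>
      <title>Minutes posted</title>
      <link>http://example.com/minutes</link>
    </item>
  </channel>
</rss>
XML;

// Return the headline and link of every item in an RSS 2.0 document.
function readFeed(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return [];                      // not well-formed XML
    }
    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }
    return $items;
}

// Display the headlines for the user to browse.
foreach (readFeed($sampleFeed) as $entry) {
    echo $entry['title'], ' -> ', $entry['link'], "\n";
}
```

The "agreed format" is what lets a dozen lines like these work against any feed that follows it.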

Here's ours:

rss logo

RSS, Atom, Syndication, etc.

The world is full of XML and XML-like file formats for syndication purposes.

Here's the list of files we generate automatically for submission to search engines and such.

(For right now, things are a bit abbreviated)

http://www.philadelphia-reflections.com/reflectionsRSS.xml (RSS Syndication file)
http://www.philadelphia-reflections.com/reflectionsATOM.xml (Atom Syndication file)
http://www.philadelphia-reflections.com/sitemap.xml (Google sitemap)
http://www.philadelphia-reflections.com/siteinfo.xml (A9/Amazon siteinfo.xml)
http://www.philadelphia-reflections.com/reflectionsIDIF1.xml (Yahoo IDIF file 1)
http://www.philadelphia-reflections.com/reflectionsIDIF2.xml (Yahoo IDIF file 2)
http://www.philadelphia-reflections.com/reflectionsIDIF3.xml (Yahoo IDIF file 3)
http://www.philadelphia-reflections.com/reflectionsIDIF4.xml (Yahoo IDIF file 4)
http://www.philadelphia-reflections.com/reflectionsIDIF5.xml (Yahoo IDIF file 5)
http://www.philadelphia-reflections.com/IDIFpointer.txt (Yahoo IDIF pointer file)
http://www.philadelphia-reflections.com/urllist.txt (Yahoo urllist.txt)

Validate Short RSS | The Short RSS File itself
Validate Short rss (lower case) | The Short RSS File itself (lower case)
Validate Short ATOM | The Short ATOM File itself

Weblogs.com extended successfully pinged
Weblogs.com successfully pinged
blo.gs successfully pinged
Technorati successfully pinged
Ping-O-Matic successfully pinged
Syndic8 successfully pinged (Feed ID 477463)

Ping Blogroller manually
Ping MyYahoo manually



The RSS and Atom validator (http://feedvalidator.org/) has a length restriction. I don't know what it is, exactly, but it bombs if your file is "too long". Since most syndication readers run the validator before they'll accept a feed, I have resorted to creating a short file, which is what I point to in my meta tags.
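
A rough sketch of the short-file idea, for anyone in the same boat. This is not the code that runs this site; the function name buildShortFeed, the shape of the $entries array, the default of ten items, and the channel title and description are all assumptions made for the example. It keeps only the newest few entries and writes them out as RSS 2.0.

```php
<?php
// Build a short RSS 2.0 document from only the newest $limit entries.
// $entries is assumed to be a list of arrays shaped like
//   ['title' => ..., 'link' => ..., 'modified' => unix timestamp]
function buildShortFeed(array $entries, int $limit = 10): string
{
    // Newest first, then keep only the first $limit.
    usort($entries, function ($a, $b) {
        return $b['modified'] <=> $a['modified'];
    });
    $entries = array_slice($entries, 0, $limit);

    // Placeholder channel details; substitute your own.
    $xml  = "<rss version=\"2.0\">\n<channel>\n";
    $xml .= "<title>Short feed</title>\n";
    $xml .= "<link>http://www.philadelphia-reflections.com/</link>\n";
    $xml .= "<description>Most recent entries only</description>\n";

    foreach ($entries as $e) {
        $xml .= "<item>\n";
        $xml .= "<title>" . htmlspecialchars($e['title'], ENT_XML1 | ENT_QUOTES) . "</title>\n";
        $xml .= "<link>" . htmlspecialchars($e['link'], ENT_XML1 | ENT_QUOTES) . "</link>\n";
        $xml .= "<pubDate>" . gmdate(DATE_RSS, $e['modified']) . "</pubDate>\n";
        $xml .= "</item>\n";
    }

    $xml .= "</channel>\n</rss>\n";
    return $xml;
}
```

Since the validator's exact limit is unknown, the number of items is a parameter; lower it until the file validates.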



Here's how I provide change frequency and priority for our Google sitemap (in PHP ... $mod is the variable containing the date last modified)

$GOOGLEpriority = "0.0"; $GOOGLEfreq = "yearly";	// default
$GOOGLEmoddate  = date("Y-m-d", $mod);			// assumption: the calendar date of $mod (set elsewhere in the real script)

// each later test overrides the earlier ones, so the most recent window that matches wins
if ($mod > mktime(0,0,0) - 86400*210)	{$GOOGLEpriority = "0.1"; $GOOGLEfreq = "monthly";}	// past 210 days
if ($mod > mktime(0,0,0) - 86400*180)	{$GOOGLEpriority = "0.2"; $GOOGLEfreq = "monthly";}	// past 180 days
if ($mod > mktime(0,0,0) - 86400*150)	{$GOOGLEpriority = "0.3"; $GOOGLEfreq = "monthly";}	// past 150 days
if ($mod > mktime(0,0,0) - 86400*120)	{$GOOGLEpriority = "0.4"; $GOOGLEfreq = "monthly";}	// past 120 days
if ($mod > mktime(0,0,0) - 86400*90)	{$GOOGLEpriority = "0.5"; $GOOGLEfreq = "monthly";}	// past 90 days
if ($mod > mktime(0,0,0) - 86400*60)	{$GOOGLEpriority = "0.6"; $GOOGLEfreq = "monthly";}	// past 60 days
if ($mod > mktime(0,0,0) - 86400*30)	{$GOOGLEpriority = "0.7"; $GOOGLEfreq = "monthly";}	// past 30 days
if ($mod > mktime(0,0,0) - 86400*7)	{$GOOGLEpriority = "0.8"; $GOOGLEfreq = "weekly";}	// past 7 days
if ($mod > mktime(0,0,0) - 86400)	{$GOOGLEpriority = "0.9"; $GOOGLEfreq = "daily";}	// yesterday
if ($GOOGLEmoddate == date("Y-m-d"))	{$GOOGLEpriority = "1.0"; $GOOGLEfreq = "hourly";}	// today
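
Once the priority and frequency are chosen, they have to be written into the sitemap. The following is a sketch of that step, not the site's actual generator: the function name sitemapEntry and the example URL are made up, and it assumes the standard sitemap elements loc, lastmod, changefreq and priority. It uses gmdate so that the date does not depend on the server's time zone setting.

```php
<?php
// Format one <url> entry for a sitemap file.
// $mod is the last-modified time as a unix timestamp;
// $priority and $freq are strings such as "0.8" and "weekly".
function sitemapEntry(string $url, int $mod, string $priority, string $freq): string
{
    return "<url>\n"
         . "  <loc>" . htmlspecialchars($url, ENT_XML1 | ENT_QUOTES) . "</loc>\n"
         . "  <lastmod>" . gmdate("Y-m-d", $mod) . "</lastmod>\n"
         . "  <changefreq>" . $freq . "</changefreq>\n"
         . "  <priority>" . $priority . "</priority>\n"
         . "</url>\n";
}

// Hypothetical page, modified a couple of days ago.
echo sitemapEntry("http://www.philadelphia-reflections.com/example.php?id=1&x=2",
                  time() - 2 * 86400, "0.8", "weekly");
```

Note the escaping of the URL: an ampersand in a query string has to appear as &amp; in the XML file or the sitemap will not parse.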



IDIF is a stupid format: it includes the entire blog_contents, so the files are huge. In the process of setting this up, I learned that flat files have a maximum size of 1.4 megs or so (the size of an old floppy disk), so I had to create more than one.

Which explains the stupid concept of a "pointer file"; instead of just giving Yahoo the IDIF file itself, you give it a pointer file with URLs pointing to the multitude of IDIF files. Really stupid.

News flash, after finding the Journal Of Ovid on the web, I learned about length restrictions for the input fields (described below). This information was not contained on the Yahoo web site describing their file format. It considerably reduced the file sizes but I retained the structure of multiple files because who knows what I'll learn next?

IDIF title must be a maximum of 80 characters
IDIF description must be a maximum of 180 characters
IDIF body must be a maximum of 1000 characters
I'm only guessing about keywords

Thanks to the Journal Of Ovid on the web for this secret information

From the inside out: trim, replace whitespace (thanks to the PHP manual for this), shorten to maximum length

$IDIFtitle		= substr( preg_replace ('/\s\s+/', ' ', trim($title) ), 0, 80 );
$IDIFdescription	= substr( preg_replace ('/\s\s+/', ' ', trim($description) ), 0, 180 );
$IDIFkeywords		= substr( preg_replace ('/\s\s+/', ' ', trim($keywords) ), 0, 79 ) . " ";
$IDIFblog_contents	= substr( preg_replace ('/\s\s+/', ' ', trim($blog_contents) ), 0, 1000 );
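
One refinement worth considering, sketched here as an assumption rather than as what this site does: plain substr can cut the text in the middle of a word. The hypothetical helper idifField below backs up to the previous space when that happens. It collapses every run of whitespace (including a single newline) to one space, which is slightly more aggressive than the pattern above, and like substr it counts bytes, not characters.

```php
<?php
// Collapse whitespace, then shorten to at most $max bytes
// without splitting the last word (when a space is available).
function idifField(string $text, int $max): string
{
    $text = preg_replace('/\s+/', ' ', trim($text));
    if (strlen($text) <= $max) {
        return $text;                       // already short enough
    }
    $cut = substr($text, 0, $max);
    // If the cut landed mid-word, back up to the previous space, if there is one.
    if ($text[$max] !== ' ' && ($space = strrpos($cut, ' ')) !== false) {
        $cut = substr($cut, 0, $space);
    }
    return rtrim($cut);
}

echo idifField("The quick   brown fox", 12), "\n";   // prints "The quick"
```

The limits themselves (80, 180, 1000) would be passed in exactly as in the lines above.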

Yahoo is said to support a simple text file list of URLs, "urllist.txt". Documentation, of course, is scarce.
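
Given the scarce documentation, the following is only a guess at the format: it assumes the file is nothing more than one URL per line. The function name buildUrlList and the example page addresses are invented for the sketch.

```php
<?php
// Turn a list of page URLs into the text of a urllist.txt file,
// one URL per line (assumed format), dropping blanks and duplicates.
function buildUrlList(array $urls): string
{
    $clean = array_unique(array_filter(array_map('trim', $urls), 'strlen'));
    return $clean ? implode("\n", $clean) . "\n" : "";
}

// Hypothetical pages on this site.
$pages = [
    "http://www.philadelphia-reflections.com/",
    "http://www.philadelphia-reflections.com/example1.htm",
    "http://www.philadelphia-reflections.com/example2.htm",
];
echo buildUrlList($pages);

// To write the file instead of printing it:
// file_put_contents("urllist.txt", buildUrlList($pages));
```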


(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

Use the Internet for Your Club (1)

Most clubs, family groups, or neighborhood associations are held together by one loyal volunteer who does all the work. This limits the scope of the club to what one person is able to do in spare time. When that central person gets tired of it or moves away, things tend to fall apart. In the spirit of encouraging more volunteerism, this article suggests some ways the home computer can easily automate the normal drudgery of running a club. Having just performed this task for the local computer society, I can report it takes about two hours to put it together. If I did it three times, it would take forty-five minutes. A rank beginner, who doesn't even know what the words mean, might take all day to do it, but no more than that.

Most of the programs a club would need were first developed for people on the go, like a salesman who visits several cities, or a college student who commutes. It's an easy step to imagine different club members in different places instead of one person in several places. Electricity travels so fast that connecting computers together with the whole world's Internet can be thought of as essentially all one big computer. For practical purposes, it doesn't matter whether a piece of information is in two parts of one computer or in two different computers hooked together by the Internet. The whole process is so cheap it might just as well be free.

The new Macintosh Mac Mini
is designed for people who already have a keyboard and monitor,
such as existing PC users, who might want to switch.

Selection of Computer and Operating System. Over ninety percent of the world's home computers are based on the Windows operating system, but Windows is having a lot of trouble right now with viruses and spam. Right now is Apple's big chance, because the Apple OS X operating system, based on Unix, seems to be immune to viruses and spam. So, if you are buying a new computer, I suggest you look at Apple's headless version. That's a little six-inch box to which you attach the monitor, keyboard and printer that presumably you have left over from some Windows system. Times will change, but right now this five hundred dollar little headless job is worth the money. That's for the club secretary; all the club members can use any kind of machine they happen to have, for read-only use.

Router. If you have several computers on one telephone line, you need a router to send the right signals to each machine. Because the router changes the identification numbers every time it is restarted, it tends to foil the buccaneers out there who are trying to find your credit card. Therefore, it's not a bad idea to have a router attached, even if you only have it connected to a single computer. Security folks say it takes about fifteen minutes for some buccaneer to find a newly installed computer, and most banks get several hundred break-in attempts every hour. That's because everybody is getting automated these days, including criminals.

Choice of Browser. After you get set up and organized and all, you need to download the Firefox browser, which right now is faster and more spam-proof than either Internet Explorer or Netscape. Go to some other browser and enter http://www.mozilla.org/products/firefox/ . There's no harm in having several browsers sitting on your computer, including Opera if you like, but right now Firefox is the one to use. A browser, in case you care, is a program that takes a stream of Internet data and translates it into the image on your screen, sort of like translating Morse code into a telegram. Some browsers are lean, mean and fast, while others are loaded with a lot of bells and whistles that slow them down. If you can't see any difference by trying them, go with the one that gives you the most spam protection.

You can have a personal calendar by clicking on
http://calendar.yahoo.com

Yahoo Calendar. There are lots of computer calendars, but right now Yahoo offers one that is somewhat better for public use by clubs. For an illustration, take a look at the Philadelphia Orchestra calendar that can be located on Philadelphia Reflections in the lower left column, by first clicking the Philadelphia Calendars button, and then clicking the link to the Orchestra's schedule. Naturally, the Orchestra doesn't want people changing their public schedule, so the calendar is read-only. You can create a calendar like this for your club or organization by going to http://calendar.yahoo.com and entering an identifier and password. You can only change the calendar if you have the password, so be careful who is allowed to have it. If you make a misjudgment about this, just abandon the calendar and start a new one. You can of course create a personal calendar for yourself; it would be nice to merge your calendar with organization calendars. Calendar-merge programs do exist, but presently are a little primitive. Even nicer would be the ability to drag and drop individual events from one calendar to the other, but that's mostly on the wish list.

Yahoo Address Book. There are zillions of address books, but Yahoo provides a public one, if you allow club members to know the password. On the one hand, it's a big convenience for the secretary to have everybody fill in his own data. It can take ten or fifteen minutes apiece to complete all that information. On the other hand, if just anybody can have all this data, you can expect to get lots of unwanted solicitations. Naturally, you want to keep intruders from altering the data, but whether or not you make your membership list public is your own decision. So, probably you want to transfer the data to a list that you keep private, using a system of letting people enter data, and then erasing it after it is transferred.

Listserv. A very handy tool is a listserv, which is a system of e-mail that is sent to everyone on the list, and everyone can chime in with comments. It makes for a lot of local excitement, and it keeps families together, including reunion classes from all the schools you went to, 'way back then. If the Rs and the Ds get to bashing each other on the Listserv, you will learn the value of designating some sober soul to be list master, given the power to exile people whose mouths get too noisy.

Minutes and History as Blogs. Most clubs keep minutes, and after a while they start to record their history. It's a lot of work, and often gets lost; furthermore, it's hard for anyone but the author to read. We suggest you create a blog, and hang it on the Internet.

While there are a dozen programs and systems for creating blogs (that's short for Web logs), Google has bought blogger.com, and has pepped it up quite a lot. Like the rest of these ideas, this one is free, and there are several million of these in existence. Sometimes people write poetry in the form of blogs, and some other people put up some pretty raunchy pictures or commentary. Apparently Google doesn't care, so they shouldn't mind if you publish the minutes of the East Whip switch Cooking Society as blogs. It's very easy to do, and their canned templates produce some pretty elegant web sites in minutes. That's right, minutes in minutes.

Finances and Newsletters. Clubs typically collect dues or charge for luncheons, but financial stuff on the Internet is more complicated and must be dealt with in a later article. Similarly, you can publish a newsletter using RSS that is very spiffy indeed, but that's really hard to explain, and must be described in a separate article, too. Anyway, these preliminary items are enough to keep a new club busy for a few months.

Fast User Switching. Other operating systems will surely imitate it, but Apple is at present where you have to go to make a separate computer section for your club. Apple originally had the idea that several people would use the same machine, and want to keep their data secret from each other. So, they have a system in which you can click the upper right corner of the screen, and you can place yourself in a secret room with its own password. We suggest that it would be better to see this as a new desktop. All graphical interfaces of all computer operating systems use the metaphor of a desktop, which is what suggested to me that the club needs a desktop like my own. That is, it's littered with half-finished business of a dozen sorts, suddenly abandoned when the phone rings or a visitor arrives. You would like to be able to come back to your desktop and take up your work where you left off. For that, you probably need several desktops, and that's what fast user switching provides you. Not vitally essential, but very convenient.

Favicons. Especially if you have fast user desktops specially designated by work topics instead of people, you can really use the favicon, or favorite icon, feature. A favicon is the little miniature do-hickey to the left of the webpage URL in the URL box. Maybe you never noticed it, but it's usually there. If you take your mouse and drag the favicon onto the desktop (you may have to shift something to create some blue sky desktop room) a new icon will appear on the desktop. Close up and click on that new icon, and you will open up a browser and go right to the page you were using when you created the icon. This is such a real neat feature that your desktop is apt to fill up quickly with a lot of web pages you happened to come across. It doesn't take long for the favicons to choke the desktop into uselessness, so this feature is at its best in a system where the topics of general utility to the user are sub-set by fast user switching.

Apache share of market

Apache has the largest share of the market and is available for most computers.

Your Own Website. Apache. Your club will soon get the idea that you need your own website, but in fact you already have several of them. Your calendar, address book, club minutes blog, club history blog already add up to four websites. To most people, having their own website means consolidating all this material into one elegant page, with photos and artwork. You can do that, but it's much harder, and you first need to see if you really have a need for that.

If you do, and particularly if your club runs a little on the snooty side and highly prizes its privacy, you might want to consider going all the way and becoming your own Internet provider. That brings us back to Apple, since the OS X system includes a free copy of Apache, the program for running your own site on your own computer. Now, that's really a big undertaking, far beyond the average club. So if privacy of that order is mandatory, you may have to hire someone to do it for you. But Apache sure makes it possible, if that's where you feel you want to go.

Use the Internet for Your Club (2)

Yahoo

First, take a look at what you are trying to achieve, and a handy example would be http://www.yahoo.com/. You will see it is not a daily or a weekly, it is continuous. The page of the newspaper is a montage of ten or twelve blocks on a page. For example, one block might give you the month's schedule, another shows the sports scores, another shows the stock market, etc. Each one of those blocks is probably updated at a different time, making this a continuous newsletter, and of course there is a way provided to individualize the blocks of space, change the color schemes, etc. Since this newsletter is on the Internet, anyone can read it from anywhere in the world, at any time. That is, they can read it if they know the password, which some clubs want to keep private, and others prefer to skip because it is a nuisance when people forget what the password is.

What underlies this process is a technique known as RSS. Each block of space in the newspaper is operating on a different schedule, and each block "polls" a donor site every so often, for example every fifteen minutes. The polling program calls the URL of each donor site at a preset time, where a record is kept of the last time the site was modified. If the site has been changed since the last visit of the polling program, the new site is downloaded to the newsletter page. If there has been no change, the polling program simply goes on to the next-scheduled site. In effect, the polling program is acting as a "robot". Modifications of this system, with considerable elaboration, are at the heart of the Google robot and other robots for other purposes. Generally speaking, the ordinary user doesn't have to know how to construct one of these robots, or modify one. No doubt, there will be extensive elaboration of this concept in the near future, but that's essentially how you can construct a usable newsletter in short order.
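
For the curious, the decision the polling robot makes can be sketched in a few lines of PHP. This is a simplified illustration, not how any particular reader is built: the function name sitesToRefresh and the example.com addresses are made up, and the two arrays stand in for what a real robot would learn by asking each donor site when it was last modified and by keeping its own log of visits.

```php
<?php
// Decide which donor sites need to be downloaded again.
// $lastModified: url => time the site reports it was last changed (unix timestamp)
// $lastVisit:    url => time of our previous poll (missing if never polled)
function sitesToRefresh(array $lastModified, array $lastVisit): array
{
    $changed = [];
    foreach ($lastModified as $url => $modified) {
        $visited = $lastVisit[$url] ?? 0;   // never visited: treat as very old
        if ($modified > $visited) {
            $changed[] = $url;              // changed since our last visit
        }
        // otherwise: no change, go on to the next site
    }
    return $changed;
}

// Hypothetical donor sites with made-up timestamps.
$lastModified = ['http://example.com/a' => 200, 'http://example.com/b' => 100, 'http://example.com/c' => 50];
$lastVisit    = ['http://example.com/a' => 150, 'http://example.com/b' => 150];

echo implode("\n", sitesToRefresh($lastModified, $lastVisit)), "\n";
// prints http://example.com/a and http://example.com/c, one per line
```

Run on a timer every fifteen minutes or so, this is the whole "robot"; everything else is downloading and display.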

Web hosting providers

Choosing a web hosting service provider is difficult. There's no brand to rely on and an Internet search turns up confusing claims and offers. We have used two providers:

  • Internet Planners is the service currently running this web site. They have provided what they promised. They do not run the latest versions of MySQL or PHP which means certain newer features are not available but it probably improves their stability and what they offer is, in fact, provided. What was missing that we noticed included
    • RSS parsing (Magpie was a good substitute)
    • and HTML Tidy, which we've just lived without, substituting our own Regular Expressions.

  • Network Solutions, on the other hand, has provided very poor service and has high fees. They run later versions of PHP and MySQL than Internet Planners, but their implementation is poor and the result is that critical functions are not available: user authentication for example (how can an Internet service provider not support user authentication?).

Based on our experience, Internet Planners is a reasonable choice for web hosting; Network Solutions is a bad choice.

Webpage Printing

This site offers a Print button for all Reflections and Topics. Formatting the text on the pages to print nicely works quite well; how to specify what to do with images remains a bit unclear (as of August 2006).

The "trick", if it can be called that, to special print formatting is the media attribute for CSS styling. The main stylesheet for this website is called in a LINK statement as follows:

<link rel="stylesheet" type="text/css" media="all" href="stylesheets/reflectionsLayout.css">

The media attribute tells the browser to use this stylesheet for all media types, i.e., for screens and printers. In the pages that are formatted to print is a stylesheet that cascades below the main stylesheet and therefore supersedes it. This stylesheet controls the printing.

The specification of

<body onload="window.print()"> (all lower case for XHTML purposes)

is what forces the print dialog to appear.

The remaining problem is how to specify CSS formatting for images so that text flows around them as we want. The formatting seems to work on screen for all browsers but only on some browsers for printing.


<style type="text/css" media="print">

/*
	Style Sheet to print Philadelphia Reflections
*/

body {
	margin: 0;
	padding: 0;
	width: 100%;
}

#wrapper {
	margin: 0;
	padding: 0;
	width: 100%;
}

#center {
	margin: 0;
	padding: 0;
}

#content {
	font-size: 120%;
}

</style>


(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

The Web As Investigative Reporter

Reporters

Freedom of the Press seems a tiresome, old topic, until the Internet gets considered. What's fundamentally always been at issue is the election process, useless without people knowing what they are voting on. Freedom of the Press has smaller value, the day after election.

The point here is that the Internet has added considerable speed to the spread of public information, and its two-way character also speeds up the process of reporting falsehoods. Everyone understands politics can get dirty, and it is most important to discourage lies and discredit liars, in time for election day.

Newspapers are only a part of the process. Investigative reporters actually investigate very little; they sit about the newsroom hoping for someone to bring in news of a scandal. Because informants usually have some self-serving motive, a responsible editor will not permit such a story to be printed without independent verification. If the election comes and goes before the story is verified, it's too bad for democracy; why bother with a useless exposé? The traditional way to slow down publication is to threaten a libel suit. In this way, libel suits, investigative reporting, editorial courage, and political campaigns are all one big ball of wax, different parts of the same game. Protection of anonymous press reports accelerates publication, while libel suits retard publication. Early in November, time matters, so enter the Internet.
In a funny sort of way, the Internet tends to diminish the injury of libeling someone, just because it lacks much restraint. Websites have a smaller audience than newspapers, and their audience is more specialized. Therefore, collective injury to an innocent person's reputation is greater where the audience is also more innocent, as it is when a whole city picks up the morning paper. Furthermore, the Internet audience can react. They can pummel the reporter's boss, the editor. They can pummel the editor's boss, the publisher. Hit and run dirty politics will always be with us. But with the web there's getting to be less time to run -- after the hit, but before the exposure.

The Beginnings of E-Mail

Dan Rottenberg, who wrote an outstanding book about Anthony Drexel called The Man who Made Wall Street, had access to many private papers that had to be omitted from that book because of space limitations. He tells an interesting tale about telegrams between Drexel and his bulbous-nosed protege at the New York office, J.P. Morgan.

Around 1880, Morgan put AT&T together, but before the telephone came into being, most high-speed communication was by telegraph. Naturally, Drexel and Morgan could afford to have a private telegraph line going between them. It would have been a bit much for them to use Morse Code themselves, so the scraps of conversation were written down and some have been preserved.

{J.P. Morgan}
J.P. Morgan

Spam, of course. If you want to avoid hackers, intruders, and unwanted advertisements, then as now, you have to be a zillionaire. Since, however, Morgan's private library on Madison Avenue had lots and lots of pornography hidden away, it does almost boggle the mind to imagine what might have been accomplished with a telegraphic wire tap.

XHTML vs. HTML

The markup language used by web browsers continues to evolve. The most current version (as of August 2006) is XHTML 1.1, an XML version of HTML.

Many browsers, most particularly IE, do not support XHTML. Technically speaking, they support only the "text/html" MIME type, not "application/xhtml+xml". Lots of web developers have gone to the trouble of putting self-closing slashes ( />) in their BR, HR, META and INPUT tags and a DOCTYPE at the top, but then serve the code as "text/html".

This produces a syntactic mishmash which is almost certainly worse than using strict HTML 4.01.

Why "worse"? Because of the possibility of unintended results from providing incorrect instructions to the browser. If you care about the output produced by the browser, which most developers and content providers emphatically do, then you have to be careful about what instructions you give the browser. You simply cannot count on getting what you want if what you're telling the browser to do is syntactically incorrect.

However, it's a little difficult to see just what good XHTML is:

  • There are rumors that it renders the non-image portion of a page as much as 50% faster than HTML, but what with gzip and broadband being pretty common these days, it's hard to see that as an especially compelling reason to be bothered.
  • Furthermore, those browsers that do render XHTML (Mozilla, Firefox) are very picky about syntax and blow up much too easily.
  • And the claim that XHTML is the way to get your web pages onto cell phones and toaster ovens leaves me cold. It's just not believable that the format required for these special devices will be the same as for a computer monitor.

Internet cognoscenti speak disparagingly of "tag soup" but the Internet is a lot more about content than it is about syntax, so who really cares?

Well, somehow, I do. A little. Since we use PHP on this site, we have the opportunity to figure out what features are supported by a browser and render the correct types of tags, mime-types, etc.

Check out the headers and the page source in Mozilla to see it in action:

  1. It renders XHTML 1.1 correctly whenever it encounters a browser that can support it
  2. It uses output buffering (which demonstrably if illogically improves rendering response time)
  3. It sends the whole thing using gzip compression if the browser will support it
<?php

#
# Both http://www.workingwith.me.uk/articles/scripting/mimetypes
# and http://keystonewebsites.com/articles/mime_type.php
#
# ... show how to serve the correct mime type and HTML prologue
#
# ... I prefer http://www.workingwith.me.uk/articles/scripting/mimetypes
#     because it serves XHTML 1.1 instead of XHTML 1.0 Transitional
#     and HTML 4.01 Loose
#
# http://www.hixie.ch/advocacy/xhtml Discusses using the wrong mime type.
#
# http://www.goer.org/ is a very interesting site on this subject
#
#
# I also include the privacy header created for me by http://www.p3pwiz.com/
# and I modified the fix_code function to include gzip.
#
# It is possible that we can get the server to do all our compression for us automatically.
# At this point I have not tested this but here are two references:
#
# http://elliottback.com/wp/archives/2006/01/12/http-gzip-compression-in-php/
# http://lists.evolt.org/archive/Week-of-Mon-20050228/170256.html
#

$charset = "iso-8859-1";
$mime = "text/html";

function fix_code($buffer)
	{
	#
	# I modified this for gzip
	#
	# convert XHTML-style " />" endings to plain HTML 4.01 ">"
	$html = str_replace(" />", ">", $buffer);

	# the browser may send no Accept-Encoding header at all
	if (isset($_SERVER['HTTP_ACCEPT_ENCODING']) &&
	    substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip'))
		{
		header("Content-Encoding: gzip"); // required to un-gzip to output
		return (gzencode($html,6,FORCE_GZIP));
		}
	else
		{
		return ($html);
		}
	}

if(isset($_SERVER["HTTP_ACCEPT"]) &&
   stristr($_SERVER["HTTP_ACCEPT"],"application/xhtml+xml")) {
   # if there's a Q value for "application/xhtml+xml" then also 
   # retrieve the Q value for "text/html"
   # (the pattern accepts any decimal digits, so q=0.05 matches too)
   if(preg_match("/application\/xhtml\+xml;q=0(\.[0-9]+)/i",
                 $_SERVER["HTTP_ACCEPT"], $matches)) {
      $xhtml_q = $matches[1];
      if(preg_match("/text\/html;q=0(\.[0-9]+)/i",
                    $_SERVER["HTTP_ACCEPT"], $matches)) {
         $html_q = $matches[1];
         # if the Q value for XHTML is greater than or equal to that 
         # for HTML then use the "application/xhtml+xml" mimetype
         if($xhtml_q >= $html_q) {
            $mime = "application/xhtml+xml";
         }
      }
   # if there was no Q value, then just use the 
   # "application/xhtml+xml" mimetype
   } else {
      $mime = "application/xhtml+xml";
   }
}

# special check for the W3C_Validator
if (isset($_SERVER["HTTP_USER_AGENT"]) && stristr($_SERVER["HTTP_USER_AGENT"],"W3C_Validator")) {
   $mime = "application/xhtml+xml";
}

# set the prolog_type according to the mime type which was determined
if($mime == "application/xhtml+xml")
	{
	#
	# I added this, G4
	#
	ob_start("ob_gzhandler");
	#
	#
	$prolog_type = "<?xml version='1.0' encoding='$charset' ?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>
<html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'>\n\n";
	}
	else
		{
		ob_start("fix_code");
		$prolog_type = "<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 
		'http://www.w3.org/TR/html4/strict.dtd'>
		<html lang='en'>\n\n";
		}

# finally, output the mime type and prolog type
header("Content-Type: $mime;charset=$charset");
header("Vary: Accept");

// privacy header created at http://www.p3pwiz.com/
header("P3P: policyref=\"http://www.philadelphia-reflections.com/w3c/p3p.xml\", " .
       "CP=\"NID DSP NOI COR\"");

print $prolog_type;
?>
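
For readers who do not use PHP, the negotiation performed by the script above can be restated as a minimal Python sketch. It is not the code running on this site. The function names (parse_accept, choose_mime, to_html4, encode_body) are invented for illustration, and the sketch assumes that a missing q parameter means q=1 and that a type absent from the header is not acceptable, which differs slightly from the regular expressions in the PHP.

```python
import gzip
import re


def parse_accept(header):
    """Return {name: q} for an Accept or Accept-Encoding header value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        name = parts[0].lower()
        if not name:
            continue
        q = 1.0  # assumption: no q parameter means full preference
        for param in parts[1:]:
            match = re.fullmatch(r"q\s*=\s*([0-9.]+)", param, re.IGNORECASE)
            if match:
                try:
                    q = float(match.group(1))
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[name] = q
    return prefs


def choose_mime(accept_header):
    """Pick application/xhtml+xml only when the browser rates it at least as
    highly as text/html; otherwise fall back to text/html."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


def to_html4(markup):
    """Turn XHTML-style ' />' endings into HTML 4.01 '>' endings."""
    return markup.replace(" />", ">")


def encode_body(markup, accept_encoding, charset="iso-8859-1"):
    """Return (body_bytes, extra_headers), gzip-compressed if the browser allows."""
    data = markup.encode(charset)
    if parse_accept(accept_encoding or "").get("gzip", 0.0) > 0:
        return gzip.compress(data, compresslevel=6), {"Content-Encoding": "gzip"}
    return data, {}
```

Given an Accept header that lists application/xhtml+xml with no q value alongside text/html;q=0.9, choose_mime picks application/xhtml+xml; given only */*, it falls back to text/html, which is the conservative choice the PHP also makes.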

Here's an interesting article on Doctype Switching: http://gutfeldt.ch/matthias/articles/doctypeswitch.html

The Philadelphia Reflections webmaster: George IV

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

2007 PS:
It turns out that Firefox has a number of intolerable quirks in the way it displays pages presented to it using XHTML 1.1 and the application/xhtml+xml mime type. I was unable to figure out a satisfactory way of circumventing these bugs and so I have reverted to XHTML 1.0 Strict and the text/html mime type, which solves all the problems but annoys me quite a lot.

Quakerism and the Industrial Revolution

{Richard Arkwright}
Richard Arkwright

had a lot to do with manufacturing cotton cloth by religious dissenters in the neighborhood of Manchester, England in the Eighteenth Century. What needs more emphasis is the remarkable fact that Quakerism and the Industrial Revolution both originated at about the same time, and in about the same place. True, the industrializing transformation can be seen in England as early as 1650 and as late as 1880. The Industrial Revolution thus extended before Quakerism was even founded, as well as long after most Quakers had migrated to America. No Quaker names are much mentioned except perhaps for Barclay and Lloyd in banking and insurance, and Cadbury in candy. As far as local history in England's industrial midlands is concerned, the name mentioned most is Richard Arkwright, whose behavior, demeanor and beliefs were anything but Quaker.

It is instructive, however, to examine the nature of Arkwright's achievement.

{Karl Marx}
Karl Marx

He seems to have invented nothing, stealing the patents and ideas of others freely, while disgustingly boasting about his rise from rags to riches. Some would say his skill was in organization, others would say he imposed an industrial dictatorship on a reluctant agricultural community. He grew rich by coercing orphans, convicts and others he obviously disdained into long, unpleasant, boring and unwelcome labor that largely benefited him, not them. In the course of his strivings he probably forced Communism to be invented. It is no accident that Karl Marx wrote the Communist Manifesto while in Manchester visiting his friend Friedrich Engels, representing reasonably well the probable attitudes of Arkwright's employees. What Arkwright recognized and focused on was that enormous profits could flow from bringing piecework weaving into factories where machines could do most of the work. Until his time, clothing was mostly made by piecework at home, with middlemen bringing it all together. The trick was to make clothing cheaper by making a lot of it, and making a bigger profit from a lot of small profits. Since the main problem was that peasants intensely disliked indoor confinement around dangerous machines, the industrial revolution in the eyes of Arkwright and his ilk translated into devising ways to tame such semi-wild animals into submission. For their own good.

{Charles Babbage}
Charles Babbage

The Quakers in the region, however, taught that it was an enjoyable experience to sit indoors in quiet contemplation. Their children were taught to submit to it at an early age, and their elders frequently exclaimed that it was a blessing when everyone remained quiet, enjoying the silence. Out of the multitude of religious dissenters in the first half of the Seventeenth century, three main groups eventually emerged, the Quakers, the Presbyterians, and the Baptists. Only the Quakers taught that silence was productive and enjoyable; the Calvinist sects leaned toward the idea that sitting on hard English oak was good for the soul, and that training and discipline were what kept 'em in line.

{babbagemaq.jpg}
babbagemaq.jpg

The Quaker idea of fun through day dreaming was peculiarly suitable for the other important feature of the Industrial Revolution that Arkwright and his type were too money-centered to perceive. If workers in a factory were accustomed to sit for hours, thinking about their situation, someone among them was bound to imagine some small improvement to make life more bearable. If such a person was encouraged by example to stand up and announce his insight, eventually the better insights would be adopted for the benefit of all. Two centuries later, the Japanese would call this process one of continuous quality improvement from within the Virtuous Circle. In other cultures, academics now win professional esteem by discovering "win-win behavior", which displaces the zero sum, or win/lose route to success. The novel insight here was that it has become demonstrably possible to prosper without diminishing the prosperity of others. In addition, it was particularly fortunate that many Quaker inhabitants of the Manchester region happened to be watch makers, or artisans of similar trades that easily evolved into the central facilitators of the new revolution -- becoming inventors, machine makers and engineers.

The power of this whole process was relentless, far from limited to cotton weaving. When Charles Babbage sufficiently contemplated the punched cards carrying the simple instructions of the knitting machines, he made an intellectual leap to the underlying concept of the tabulating machine. Using what were later called IBM cards, he had the forerunner of the stored-program computer. There were plenty of Arkwrights getting rich in the meantime, and plenty of Marxists stirring up rebellion with the slogan that behind every great fortune is a great crime. But the quiet folk were steadily pushing ahead, relentlessly refining the industrial process by welcoming the suggestions of everyone.

Computerizing Medical Care

{First Computer}
First Computer

My first encounter with a computer was in 1958, and I have loved them ever since. As president of what called itself the Delaware Valley Hospital Computing Society, I remember giving a dinner speech concluding as follows: "If you want to be happy for a day, get drunk. If you want to be happy for a week, get married. But if you want to be happy for a lifetime, get a computer!" After fifty years, my affection continues. But to be candid, the billions of dollars about to be spent on computers in medical care will mostly be wasted. Even worse, like malpractice suits, computers will induce behavioral changes in the system costing far more than the directly visible costs.

That's unpopular news at present, since the National Business Coalition for Health has launched a major lobbying campaign to persuade Congress to spend an initial billion dollars inducing physicians to maintain an electronic medical record. Various health insurance companies already provide financial incentives to doctors to file electronic claims forms, eventually threatening to reject any claim submitted on paper. The American College of Physicians has established a rather large department to develop programs for physicians to use in their practices; twenty years ago Indiana University started much the same thing. The College of Physicians of Philadelphia has spent close to a million dollars on such a project. It is reported that Microsoft Corp. has a massive project underway to supply electronic medical records. It sounds fairly easy to obtain large research grants from the government to devise something, anything, useful in this area. In my own case, training funds really weren't necessary, since I eagerly got into the field when everybody was a beginner. I was just as good a beginner as any other beginner. But let me repeat: the electronic medical record has been in the past, and will be for decades, an expensive digression. In health care, creating more administrative work isn't the solution, it is the problem.

For fifty years the problem with an electronic medical record was that it took too much of the doctor's time to complete his part of the input, and then cost him too much to pay employees to do the rest. Presumably, automatic voice recognition and dictation will soon make it possible to record doctor's notes without handwriting or typing. Since, however, the elimination of current paper forms and check-off boxes will create a major problem in organizing the dictation verbiage, it could add five or ten additional years before programmers manage to rearrange dictation material and effectively integrate it into organized form, complete with laboratory results, dictated x-ray and EKG reports, even small images of the original material. Temperature, blood pressure, weight, photographs and the like can all be readily integrated into the stored electronic record, but to do so usefully is an expensive programming project. Doctors are quite right to be anxious they will lose control of the usefulness of their records in order to ease the task of programmers, speed up the sluggish pace of development, and reduce what will surely be an unexpected cost overrun. Storage and retrieval of such records is known to be an achievable but expensive task, which however also risks sacrificing the speed and ease requirements of the medical task it is supposed to serve -- in the name of cost effectiveness.

Computers are no longer an unfamiliar tool; physicians have altogether too much experience with "vaporware", unrealized promises of convenience, and the damaging effect on medical quality of the philosophy of Quick and Dirty. To respond to their resistance to design blunders with an accusation of undue conservatism is to provoke an icy stare and gritted teeth. Inevitably, the effective use of automation will require a redesign of workflow with major disintermediation of "gofer" staff; after all, that is how cost savings are to be achieved. That will provoke outcry that physician time is the most expensive component in the process, but unfortunately physicians will discover that Information Specialists with a business background will brush that argument aside. The most overpaid people on the face of the earth are investment bankers, but information consultants have persuaded business executives that inefficiency of the investment process is more expensive than even an investment banker's time. Having been through this themselves, insurance executives are unlikely to pay the slightest attention to physicians dancing to a familiar old tune.

For all that, data input is not the real problem; it's just the first problem. It's in a class with data storage and retrieval, which is expensive and cumbersome when you add a need for instant access and total privacy. But costs will come down steadily, and eventually we can expect automated fingerprints or other biological identification, and cheap instant retrieval. Doctors will be able to make rounds in the hospital with a computer in their pocket, record telephone calls in their entirety, dial automatically and whatnot. There are problems with wireless transmission inside buildings with steel girders, and legal requirements for signatures on narcotic orders, but if we are determined, these problems can be overcome as easily as they were with electronic check writing and stock brokerage. Cost may top twenty billion dollars in twenty years, but it all can be done if we insist.

But then you encounter the real problem. Information will accumulate in these records in staggering amounts. Even if you resolutely resist demands to have the nurses record every groan, and the orderlies file every laundry slip, the legitimately important medical information will be exposed as the massive heap of transients that they really are. Plaintiff lawyers will insist no scrap of data may be deleted, hospital administrators will insist on compliance, when in fact most of a doctor's concentrated effort is devoted to brushing aside momentarily distracting data in order to see what's going on, and react to it instantly. When a quick look doesn't solve the problem, the doctor goes back for additional data. If you disrupt these skills and traditions of coping with information overload, evolved over centuries, you will at best impose frustrating delays on a complex system under pressure, and ultimately inspire elaborate systems of short-cuts. The Armed Forces are famous for paperwork, but even they know better than to ask a pilot for his Social Security number as he starts a bombing run. The hospital nursing profession has already just about collapsed under paperwork pressure. If you see five nurses in a hospital, three of them will be sitting down writing something. The terrible truth is that no one reads it, no one checks it, and ultimately it sits in the record room waiting for a plaintiff lawyer with unlimited time to sieve out some misrecorded misconception or uninformed conclusion. My faith in the computer is such that I feel sure that methods can be devised to produce periodic summaries, automatic alarm signals, and mostly effective prioritization of data elements. Unfortunately, medical care is changing at such a rapid rate that ad hoc automation of physician thought processes cannot keep up with the current pace of change in medical progress. 
You would think some things would be unthinkable, but since I can remember the organized campaign to suppress the CAT scan as an unnecessary expense, I confidently predict that programmer inability to keep up with some advance in medical care will at times lead to organized outcry that we should slow down the pace of improving medical care, so that computer clerks can keep up with it. But that is only a small part of the issue, which at its center is that physician time will be dissipated and his attention distracted by presenting him with unwieldy amounts of neatly printed, spell-checked, encrypted and decrypted, biometrically secure, hierarchically prioritized -- avalanches of data which are irrelevant to the issues of the moment. The goal is not, after all, an electronic record. The local goal is to decrease the cost of medical care by increasing the productivity of the physician, and the overarching goal is high quality patient care at a reasonable price. Behind all that, the fact that the impetus comes from NBCOH -- the ones paying the insurance premiums -- suggests that the local goal is not so much the improvement of care as oversight reassurance that care provided has been as good and as cheap as possible. The goal is legitimate, but this cybernation approach looks to be self-defeating by being overly specific.

If the reader has the patience for it, let me now cite a historical example of the third-party tail wagging the medical dog. In this case, third-party health insurance similarly overextended its reach by imposing internal health system changes, trying to facilitate the role of monitoring it externally. Specifically, the system of diagnostic code numbers was changed from one devised by the medical profession for its purposes, into a different coding system devised outside medical profession sponsorship, which seemed to suit the needs of payment agencies better even though it suited medical purposes less. After twenty-five years, it is now clear that third-party payers have shot themselves in the foot on this matter, and everyone is worse off. The topic, please pardon the obscurity, is the diagnostic coding system.

To go back to beginnings, the American Medical Association perceived a need for a diagnostic coding system in the 1920s. Organizing or even merely indexing vast amounts of information about a disease required more specificity than free style verbal nomenclature could provide. Quite a distinguished panel of specialists and consultants then produced the Standard Nomenclature of Diseases, which in time became the Standard Nomenclature of Diseases and Operations (SNODO). In order to reduce ambiguity, this system developed a branching-tree code design for anatomy, linked to a branching-tree for causes of disease, ultimately linkable to a branching tree of procedures. These three sets of three-digit codes linked the components together with hyphens (000-000-000). The first digit of each was the most general, as in Digestive, Musculo-skeletal, etc. and subsequent digits were progressively more specific and detailed, as in "Digestive, large intestine, sigmoid colon". The causes of disease would resemble "Infections, bacterial, streptococcal". An example of Procedures would be "Incision, incision and drainage, drainage and insertion of drain". In nine digits, it was thus possible to represent "incision, drainage and insertion of a drain into a streptococcal infection of the sigmoid colon". After a while, the codes grew from three to five and six digits, again repeated three times, so an immensely detailed, unambiguous description might be coded in fifteen digits by a physician who knew the rules, but didn't own a code book. This code was ultimately taken over by the Academy of Pathology, expanded, and is now called SNOP. The pathologists absolutely refused to give it up.
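
The branching-tree idea is easier to see in a small sketch than in a description. The Python below is only an illustration: the three tables and every digit in them are invented for this example and are not taken from the real nomenclature; only the three example phrases come from the paragraph above.

```python
# Hypothetical miniature code tables. The digits are invented for
# illustration and are NOT the real Standard Nomenclature codes.
ANATOMY = {
    "6": "Digestive",
    "66": "large intestine",
    "664": "sigmoid colon",
}
CAUSE = {
    "1": "Infections",
    "10": "bacterial",
    "102": "streptococcal",
}
PROCEDURE = {
    "0": "Incision",
    "01": "incision and drainage",
    "012": "drainage and insertion of drain",
}


def expand(code, table):
    """Walk one branch: each longer prefix of the code is more specific."""
    return [table[code[:n]] for n in range(1, len(code) + 1)]


def decode(full_code):
    """Split an 'anatomy-cause-procedure' code and expand each branch."""
    anatomy, cause, procedure = full_code.split("-")
    return {
        "anatomy": expand(anatomy, ANATOMY),
        "cause": expand(cause, CAUSE),
        "procedure": expand(procedure, PROCEDURE),
    }


def falls_under(full_code, anatomy_prefix):
    """True if the anatomy part of the code lies beneath the given branch."""
    return full_code.split("-")[0].startswith(anatomy_prefix)
```

Because every prefix has a meaning, a question such as "all cases involving the large intestine" becomes a simple prefix match (falls_under(code, "66") in this made-up table), which is the property a flat list of numbers cannot offer.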

The rest of the profession gradually yielded to the pressure of hospital administration, who were pressured by the Association of Medical Record Librarians, responding to the views of outside statistical interests, particularly insurance. A simpler, shorter coding system was needed, they felt, concentrating on the thousand most common diseases. The International Classification of Diseases was produced, reducing the millions of SNODO diagnoses to 999 by heavy use of several varieties of "Miscellaneous" or "Not Otherwise Classifiable (NOC)". Since the goal was to count the incidence of common diseases, the coding system was stripped of any logical tree-branching, and became a short list of what was most common, starting with 1 and going to 999. In time, of course, the common-ness of conditions changed, and various complaints from various directions forced the ICD to go to 4 digits, then five. Unanticipated conditions or complications eventually required the patchwork of some alpha "modifiers", and the original short hodge-podge became a long and bewildering hodge-podge. Coding accuracy declined markedly, but ho-hum. The health insurance companies paid the bill, no matter what the code said. At another place, we will discuss the entertaining way that Ross Perot became a billionaire out of the computer chaos of Blue Cross and Medicare at this time, but right now the central theme to follow is DRG, Diagnosis Related Groups. Try to follow, please.

By 1980, Medicare was fifteen years old. It was clear that certain things just had to be changed, because the excuse that the system was new and untried was beginning to wear thin. The early designers of the system based their payments on auditing a hospital's yearly costs, auditing the proportion of patients who were Medicare beneficiaries, and paying a proportionate share. That was easy and reasonably accurate, but it had the rather significant flaw that it took no account of whether the patients needed to be in the hospital in the first place. Or whether they needed to stay so long. The response they adopted (in the Budget Reconciliation Act of 1983) is a measure of just how desperate they must have felt. Knowing full well how inaccurate the ICD coding system was in practice, it was all there was. Consultants, particularly at Yale, ran computer simulations of various subsets of ICD codes to find a formula that would produce approximately the same hospital payments as the system of cost reimbursement. If memory serves, the original formula was to divide the thousand ICD codes into 27 diagnosis-related groups (DRG). Eventually, the process was tweaked to seventy or eighty groups. Walter McNerny, then Past President of the American Hospital Association told Congress hospitals could live with this system, and promptly we had a system for paying out hundreds of millions of dollars. It was touted as a highly sophisticated advance in the arcane science of hospital reimbursement, so it must have included a lot of deliberate overpayment. I can remember trying to remonstrate with McNerny, who felt he didn't have time for the discussion. Physicians had very little to do with the DRG portion of the 1983 Medicare Amendments, because the AMA had long insisted that physicians and hospitals go their separate ways on reimbursement. 
Russell Roth, who was president of the AMA at the time, recounted many times the episode in the Oval Office, when it was announced to Lyndon Johnson that Dwight D. Eisenhower was in the next room waiting for him. LBJ excused himself to leave, and on the way out said to Wilbur Cohen, "Give him anything he wants." Things were destined to change, but at least for a very long time, physician and hospital reimbursements were strictly independent.
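
To make the two hospital payment methods described above concrete, here is a rough Python sketch with invented numbers. The group names, rates and case counts are hypothetical, and the single scaling factor in calibrate is a deliberate simplification of whatever the consultants' simulations actually did; it only illustrates the stated aim of making group-based payments add up to roughly what cost reimbursement had paid.

```python
def cost_reimbursement(audited_costs, medicare_share):
    """Original method: pay Medicare's proportionate share of audited costs."""
    return audited_costs * medicare_share


def drg_payment(cases, rates):
    """DRG method: one fixed rate per case, set by the case's diagnosis group."""
    return sum(rates[group] for group in cases)


def calibrate(rates, cases, target):
    """Scale all group rates by one factor so DRG payments total the target.

    Assumption for illustration only: a single uniform factor.
    """
    factor = target / drg_payment(cases, rates)
    return {group: rate * factor for group, rate in rates.items()}


# Invented example: a hospital with $10 million in audited costs,
# 40% of it attributable to Medicare patients.
target = cost_reimbursement(10_000_000, 0.4)
cases = ["group_a"] * 300 + ["group_b"] * 100
starting_rates = {"group_a": 5_000, "group_b": 10_000}
new_rates = calibrate(starting_rates, cases, target)
```

Note what the second method leaves out: it pays per admission by diagnosis group, so the incentive shifts from running up audited costs to shortening stays, whatever the accuracy of the underlying codes.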

Money Bags

This little morality tale was told to me by two unrelated sources, one of whom was a staff aide to Wilbur Cohen, the author of the Medicare law. The other was a high official of Pennsylvania Blue Shield, the appointed administrative agent for Medicare in Pennsylvania.

After Lyndon Johnson rammed the Medicare amendment to the Social Security Act through Congress, he wasn't shy about drawing attention to it. The press was present in great numbers, with staff officials who had a role in crafting the document, members of Congress, and anyone else who was standing around. The legislation was laid before him and signed with twenty different pens to be presented as mementos to the in-group. Each pen was only used to inscribe about half of one letter of his name, so it was a slow but joyful process. As intended, it got lots and lots of publicity.

{H. Ross Perot}
H. Ross Perot

So, thousands of thankful old folks saw the ceremony on television, thought they heard that the law was in effect immediately, and proceeded to dump their medical bills in a shoe box, sending them to Medicare to pay. Unfortunately, Medicare didn't have an office, a staff, or even a telephone number. These things take time. As fast as they could, the Medicare staff constructed a system of intermediaries and carriers, intermediaries for Part A and carriers for Part B, and almost without exception appointed the local Blue Cross and Blue Shield organizations to be the intermediaries and carriers, respectively, since the organization of Medicare was patterned closely after the organization of the two administrative corporations. Meanwhile, the bills from old folks just kept pouring in through the postal service. It was just about all the staff in Washington could do, just to direct the mail out to the local carriers and intermediaries and at least get it out of their hair.

Less than a year later, that's how the claims got to Camp Hill, PA, a little suburban town near Harrisburg. In desperation, Blue Shield had rented a local vacant supermarket, and piled the mailbags ten feet high. There were quite a few telephone calls of inquiry, and the old folks were politely told the matter was being looked into. It was beginning to look as though the supermarket wasn't big enough.

Computers were of course rented from IBM, who had a policy of renting, not selling, its valuable equipment. Keypunch operators and computer operators were hired, air conditioning was installed, and one team after another of computer programmers was hired -- and fired. Consultants were called, scratched their heads, sent big consultation bills, and turned sadly away. Sorry, but somehow it just don't work.

So that's how it happened that one Friday afternoon, a vice-president of Texas Blue Cross named H. Ross Perot came in, accompanied by a fellow with glasses so thick they looked like the bottoms of Coca-Cola bottles. So far as anyone can remember, the guy with coke-bottle glasses never said one word. The desperate, hopeless mess was explained to Perot, whose salary at that time was rumored to be twenty-five thousand dollars a year, about right for a Blue Cross executive. His background as a kindred Blue Cross person inspired confidence, and the conversation rambled on for an hour or so. Meanwhile, the guy with coke-bottle glasses went over to the Penn-Harris Hotel across the street, and got to work. By the end of the weekend, he had come back a couple of times, but eventually, would you believe, it really, well it really worked. Contracts were quickly signed, the wheels began to turn, the mail bags in the supermarket began to march through the processing cycle. Blue Shield, the Medicare program, the finances of the nation's elderly, and Lyndon Johnson's reputation -- were all rescued.

As everyone now knows, the Medicare processing contracts made Ross Perot into a billionaire, living on Bermuda in the lap of luxury, eventually upsetting the re-election hopes of George Bush, senior, by running for President himself on a third party ticket that had something or other to do with giant sucking sounds. A Congressional investigating committee looked into the outrageous profits Perot had extracted from his homeland's elderly, and volleyed and thundered. Whether Perot actually thumbed his nose at them is doubtful, but he certainly was in a position to do so.

Meanwhile, whatever happened to that guy with the coke bottle glasses, no one seems to know.

Geo Positioning

Geo Tagging refers to adding latitude and longitude information to websites and photographs. This has been around for a long time but it has taken the advent of Google Earth for it to really start to catch on.

This blog entry has geo meta tags that you can see if you look at the HTML source ("View > [Page] Source"). The input was as follows:

Address: 82 Devonshire St Boston MA

Lat: 42.3578 Lon: -71.0577

Descriptive Place Name: Fidelity Investments headquarters

Region: US-MA Country Code: US Country Name: United States


This creates meta tags in the HTML Header as follows:

<!-- geo tags for 82 Devonshire St Boston MA -->
<meta name="ICBM"          content="42.3578, -71.0577" />

<meta name="geo.country"   content="US" />
<meta name="geo.region"    content="US-MA" />
<meta name="geo.placename" content="Fidelity Investments headquarters" />
<meta name="geo.position"  content="42.3578; -71.0577" />

<meta name="tgn.name"      content="Fidelity Investments headquarters" />
<meta name="tgn.nation"    content="United States" />

The latitude and longitude of a location can be found in several ways:

  • To be precise you can go to the location with a GPS device and record the exact coordinates.
    (see below for a discussion of GPX, the GPS transfer protocol)
  • Google Earth is a very good way to find coordinates of any spot on earth
    and both GE itself and the KML Editor will allow you to pick up coordinates from GE.
  • You can find the lat and lon of a US address using GeoURL Header Generator
  • My Geo Position can also be helpful

  • But for most lookups, the easiest method is to use the Firefox Sidebar AddOn called Minimap; it's right in your browser so you don't have to switch back and forth. (The AddOn download is found here: Mini Map Sidebar).

The Region, Country Code and Country Name can be found here: ISO-3166-1 Country Names

geo.placename and tgn.name are often rendered as the city name but are intended to describe the geographical feature ("Pyramids of Giza" or something). This tag is optional.
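
The mapping from the input fields to the meta tags shown above can be sketched in a few lines of JavaScript. This is only an illustration, not the code this site runs: the function names and the field names of the input object (lat, lon, placeName, region, countryCode, countryName) are assumptions made up for the example.

```javascript
// Escape the characters that would otherwise break an HTML attribute value.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the geo meta tags for one location.
// `place` is a hypothetical input object; its field names are assumptions.
function buildGeoMetaTags(place) {
  const pairs = [
    ["ICBM", place.lat + ", " + place.lon],         // comma-separated
    ["geo.country", place.countryCode],
    ["geo.region", place.region],
    ["geo.placename", place.placeName],
    ["geo.position", place.lat + "; " + place.lon], // semicolon-separated
    ["tgn.name", place.placeName],
    ["tgn.nation", place.countryName],
  ];
  return pairs
    .map(function (pair) {
      return '<meta name="' + pair[0] + '" content="' + escapeAttr(pair[1]) + '" />';
    })
    .join("\n");
}

// The example input from this entry.
console.log(buildGeoMetaTags({
  lat: 42.3578,
  lon: -71.0577,
  placeName: "Fidelity Investments headquarters",
  region: "US-MA",
  countryCode: "US",
  countryName: "United States",
}));
```

Note that the ICBM tag separates the two numbers with a comma while geo.position uses a semicolon; the sketch keeps that distinction.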

HTML geo meta tags can be validated here: {geo tag validation}



There is a search engine of long standing that reads HTML geo meta tags and indexes the website based upon its location; for searching, it groups sites based on their geographic proximity: GeoURL.



Photographs can also contain geo meta data, so-called EXIF data (Firefox has an EXIF viewer AddOn).

JPEG is the most common image format and the easiest to deal with. The combination of Picasa2 and Google Earth allows you to add this information to your own photos easily.

The process of adding lat and lon to your photographs is this:

1. Select one or more photos in Picasa
2. Select Tools > GeoTag > GeoTag with Google Earth ...
3. This starts Google Earth and you can "fly" to the location of the picture

4. A small Picasa window will appear in Earth's lower-right corner displaying thumbnails of the pictures you selected; press the "Geotag" button.
5. When all of your pictures are tagged, press the "Done" button

Slowly, camera manufacturers are providing GPS capability. A few have GPS devices built in and some others allow an external GPS device to be attached, although both Canon and Nikon are way behind the curve ... if you own either, you can essentially forget it: the best - lousy - solution is to carry a GPS around with you and synchronize the times ... ugly.


The Google Maps API allows maps to be embedded in a website as is done here. Google Maps API

The JavaScript required to embed the map on this page can also be seen in the HTML source ("View > [Page] Source"). In addition to the JavaScript, you need a DIV whose ID is "map", or whatever ID is named in the JavaScript's document.getElementById call; the DIV also specifies the height and width of the map to be displayed.

To embed these maps you must register with Google
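
As a rough sketch of how the DIV and the JavaScript fit together, here is a version written against the current Google Maps JavaScript API (google.maps.Map), which is not the version of the API this page was originally written for. It assumes the page contains something like <div id="map" style="width: 500px; height: 300px"></div> and that Google's Maps script has already been loaded with your key. The function name showMap and the zoom level are assumptions made up for the example.

```javascript
// Create a map centred on the given coordinates inside the given element.
// `maps` is the google.maps namespace; it is passed in as a parameter so the
// function can also be tried out with a stand-in object.
function showMap(maps, container, lat, lon) {
  return new maps.Map(container, {
    center: { lat: lat, lng: lon },
    zoom: 15, // assumed zoom level, roughly street scale
  });
}

// Run only in a browser page where the Maps script is already loaded.
if (typeof google !== "undefined" && typeof document !== "undefined") {
  showMap(google.maps, document.getElementById("map"), 42.3578, -71.0577);
}
```

The size of the map comes from the DIV, not from the script, which is why the DIV has to carry a height and width.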


In addition, there are extensions to ATOM and RSS to include lat and lon in your syndication feeds; there are three standards that I have found: GeoRSS (ATOM example), W3C Geo (RSS example) and an "ICBM RSS Module". This website extends the namespaces of both its ATOM feed and its RSS feed to include all the tags.

Google, Yahoo and Microsoft all now support GeoRSS as a feed to their map programs. My sense of it is that KML is a richer protocol, allowing more features, but fundamentally all these XML variants do mostly the same thing.
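
To give a feel for what these feed extensions look like, here is a small sketch that produces the GeoRSS and W3C Geo elements for one entry, together with the namespace declarations that go on the feed's root element. The ICBM module is left out because I am less sure of its details. The function names are made up for the example and this is not the code that generates our feeds.

```javascript
// Namespace URIs for the two feed extensions shown here.
const FEED_NAMESPACES = {
  georss: "http://www.georss.org/georss",
  geo: "http://www.w3.org/2003/01/geo/wgs84_pos#",
};

// Attributes to add to the root <feed> or <rss> element.
function namespaceAttributes() {
  return Object.keys(FEED_NAMESPACES)
    .map(function (prefix) {
      return "xmlns:" + prefix + '="' + FEED_NAMESPACES[prefix] + '"';
    })
    .join(" ");
}

// Elements to add inside one <entry> or <item>.
// GeoRSS puts latitude and longitude in one element, separated by a space;
// W3C Geo uses one element for each.
function geoElements(lat, lon) {
  return [
    "<georss:point>" + lat + " " + lon + "</georss:point>",
    "<geo:lat>" + lat + "</geo:lat>",
    "<geo:long>" + lon + "</geo:long>",
  ].join("\n");
}

console.log(namespaceAttributes());
console.log(geoElements(42.3578, -71.0577));
```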

Google Sitemaps can include links to KML files (and ATOM, now, too). Part of the sitemap generation on this site is some code that picks up every *.kml and *.kmz file in the /kml/ folder and adds them to our sitemap.xml file.
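
The idea behind that sitemap step can be sketched as follows. Our actual code is not shown here; this is a JavaScript illustration that assumes the list of file names in the /kml/ folder has already been read, and the function name and the sample file names other than Franklin.kmz are made up.

```javascript
// Public URL of the folder that holds the KML and KMZ files.
const KML_BASE_URL = "http://www.philadelphia-reflections.com/kml/";

// Turn a list of file names into sitemap <url> entries,
// keeping only the *.kml and *.kmz files.
function kmlSitemapEntries(fileNames) {
  return fileNames
    .filter(function (name) {
      return /\.(kml|kmz)$/i.test(name);
    })
    .map(function (name) {
      // encodeURIComponent guards against spaces and other awkward characters.
      return "  <url><loc>" + KML_BASE_URL + encodeURIComponent(name) + "</loc></url>";
    });
}

// notes.txt is skipped; the other two become sitemap entries.
console.log(kmlSitemapEntries(["Franklin.kmz", "notes.txt", "tour.kml"]).join("\n"));
```

The entries would then be written inside the <urlset> element of sitemap.xml along with the entries for the ordinary pages.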


Google Earth is filled with delights, not the least of which is a Flight Simulator! Google Earth Flight Simulator Keyboard Controls


KML (Keyhole Markup Language, Keyhole being the predecessor to Google Earth) is an XML protocol that allows you to incorporate Google Earth into graphical presentations. Google KML Overview

Google Earth Outreach helps you get started: Google Earth Outreach

An extraordinary collection of KML files you can view is found here: Spectacular satellite images of the world

I found a KML editor here: NorthGates' KML Editor for Windows. It's rudimentary but very handy for what it does do.

Here's the Google Earth tools list where I found the KML editor: EarthPlot Software Tools For Google Earth


The way we serve the KML in the link that connects to Google Earth from individual blogs uses the following PHP script as its base:

<?php

// See Google Earth's KML 2.1 Reference
// http://code.google.com/apis/kml/documentation/kml_tags_21.html

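// Read a numeric query parameter; fall back to a default when it is missing or not a number.
function num_param($name, $default) {
	return (isset($_GET[$name]) && is_numeric($_GET[$name])) ? (float) $_GET[$name] : $default;
}

$lat          = num_param('lat', 0);
$lon          = num_param('lon', 0);

// Escape the place name so that characters such as & and < cannot break the XML.
$placename    = isset($_GET['placename']) ? htmlspecialchars($_GET['placename'], ENT_QUOTES, 'UTF-8') : '';

// Optional view settings and their defaults.
$altitude     = num_param('altitude', 0);
$range        = num_param('range', 1000);
$heading      = num_param('heading', 0);
$tilt         = num_param('tilt', 0);

// HTML shown in the placemark balloon (wrapped in CDATA below).
$description  = "<h3><a href=\"http://www.philadelphia-reflections.com/\"><font color=\"#ea9f20\">
		PHILADELPHIA REFLECTIONS</font></a></h3>
		<p>The musings of a Philadelphia physician who has served the community for six decades.</p>";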
	
	
header('Content-Type: application/vnd.google-earth.kml+xml');
header('Content-Disposition: inline; filename="philadelphia-reflections.kml"');

echo '<?xml version="1.0" encoding="UTF-8"?>'; 

?>

<kml xmlns="http://earth.google.com/kml/2.1">

  <Placemark>
    <name><?php echo $placename; ?></name>
    <description>
        <![CDATA[<?php echo $description; ?>]]>
    </description>
    
    <LookAt>
      <longitude><?php echo $lon; ?></longitude>
      <latitude><?php echo $lat; ?></latitude>
      
      <altitude><?php echo $altitude; ?></altitude>
      <range><?php echo $range; ?></range>
      <tilt><?php echo $tilt; ?></tilt>
      <heading><?php echo $heading; ?></heading>
      
      <altitudeMode>relativeToGround</altitudeMode>
    </LookAt>
    
    <Point>
      <coordinates><?php echo "$lon,$lat,$altitude"; ?></coordinates>
    </Point>
    
  </Placemark>

</kml>


Of course, GPS devices are an integral part of this process of Geo Positioning. GPS devices are supposed to support the open format GPX, which is an XML-based description of waypoints, tracks and routes. Wikipedia describes GPX here: GPS eXchange Format

The GPX protocol's official website is here: GPX The GPS Exchange Format

Google Earth supports raw GPX (File > Open ...): when you open a GPX file in Google Earth, it converts it to KML. If you want stand-alone programs to do the conversion, or if you have non-standard GPS data, have a look at GPS Babel for conversion of native GPS formats, as well as the GPS Utility and G7ToWin.

A nice blog on these things relative to Google Maps is here: Using XSL to Transform Google Earth (KML) and GPX to Google Maps API


At Philadelphia Reflections, we are creating tours by carrying a GPS and a camera around on our travels. The GPS track becomes a path and waypoints become placemarks. When you come home, download the GPS data in GPX format and open up the GPX file in Google Earth. Use Google Earth to edit the placemark balloons, including pictures and text.
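
Google Earth does this conversion for you, but to show what "the track becomes a path and waypoints become placemarks" means in KML terms, here is a hedged sketch. It assumes the GPX file has already been parsed into plain arrays of points; the function names, the input shape and the second sample coordinate are made up for the example, and the namespace used is the standard KML 2.2 one rather than the older one used elsewhere on this page.

```javascript
// Escape text for use inside an XML element.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a KML document from an already-parsed GPS track and waypoints.
// trackPoints: [{ lat, lon }], waypoints: [{ lat, lon, name }] (assumed shapes).
function trackToKml(tourName, trackPoints, waypoints) {
  // GPX stores lat and lon as separate attributes; KML lists longitude first.
  const path = trackPoints
    .map(function (p) { return p.lon + "," + p.lat; })
    .join(" ");
  const placemarks = waypoints.map(function (w) {
    return "  <Placemark>\n" +
      "    <name>" + escapeXml(w.name) + "</name>\n" +
      "    <Point><coordinates>" + w.lon + "," + w.lat + "</coordinates></Point>\n" +
      "  </Placemark>";
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<kml xmlns="http://www.opengis.net/kml/2.2">',
    "<Document>",
    "  <name>" + escapeXml(tourName) + "</name>",
    "  <Placemark>",
    "    <name>Path</name>",
    "    <LineString><coordinates>" + path + "</coordinates></LineString>",
    "  </Placemark>",
  ].concat(placemarks, ["</Document>", "</kml>"]).join("\n");
}

console.log(trackToKml(
  "Sample tour",
  [{ lat: 42.3578, lon: -71.0577 }, { lat: 42.36, lon: -71.06 }],
  [{ lat: 42.3578, lon: -71.0577, name: "Fidelity Investments headquarters" }]
));
```

The resulting file can be opened in Google Earth, where the placemark balloons can then be edited to add pictures and text.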



There are many, many sightseeing blogs around that take you to interesting places on Google Maps and Google Earth. A place to start looking is Sightseeing with Google Satellite Maps


Somehow, the concept of "mashup" is related to all of this but it sort of sounds like the term "multimedia" a few years ago ... fancy in concept but somewhat vague in reality.

Google has a Mashup Editor and Wikipedia has a definition but it's not clear what it all adds up to.



(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

Process .htm and .html as php

It is sometimes helpful to include PHP scripting in files that do not have the .php file extension.

There is quite a lot of discussion on the web about this but at least on this server the answer is not what most people think.

In the .htaccess file in the root folder include these two lines:

AddType application/x-httpd-php .html .php .htm
AddHandler application/x-httpd-php .html .php

Captcha

With the rise of spam entries in web forms, a security feature called "captcha" has been developed.

CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". The idea is that only a human could read the letters contained in the image and then enter them in the form. "Accessibility", i.e., designing websites to accommodate people with disabilities, is obviously hindered by captcha; but at least in our experience with this website, spamming is a huge problem, and the inability of some disabled people to leave comments is a price we are willing to pay to rid ourselves of spam. The W3C, the Godhead of web standards, does not agree with me and lectures at length on the futility of captcha: Inaccessibility of CAPTCHA. Whatever. I may get around to implementing some of their recommendations later, if we continue to be spammed.

Spammers have countered captcha in a number of ways. The first is OCR, which is why the images have fuzzy backgrounds and distorted letters: they are trying to defeat OCR programs. As OCR techniques have improved, captcha programs have moved from letters to "objects" such as kittens, boxes, etc., which are thought to be harder for computers to recognize; harder for people, too: cat vs kitten, for example. I was amazed to learn during my captcha research that there are spammers who offer micro-payments to people in India, etc. to enter hundreds of spam entries manually in captcha-ed websites that have defeated their automated spamming systems. Move, countermove; seemingly endlessly.

On this website captcha has been implemented using PHP: the comments form that appears at the end of every page shows an image, created with the PHP image-creation routines, that contains random characters. If the characters in the image are entered correctly in the form, the comments are entered into the database.
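
The logic behind this, leaving out the drawing of the image, is short enough to sketch. The site's own code is PHP and is not reproduced here; this JavaScript version is only an illustration, and the function names, the choice of alphabet and the case-insensitive comparison are all assumptions of the example.

```javascript
// Characters used in the image; look-alikes such as I, O, 0 and 1 are left out.
const CAPTCHA_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

// Pick random characters for the image. The server keeps this value
// (for example in the visitor's session) and never sends it as text.
// Math.random is adequate for a sketch but is not cryptographically strong.
function makeCaptchaCode(length) {
  let code = "";
  for (let i = 0; i < length; i++) {
    const index = Math.floor(Math.random() * CAPTCHA_ALPHABET.length);
    code += CAPTCHA_ALPHABET.charAt(index);
  }
  return code;
}

// Compare what the visitor typed with the stored code.
// A missing or empty stored code never matches.
function captchaMatches(expected, entered) {
  if (typeof expected !== "string" || expected.length === 0) return false;
  if (typeof entered !== "string") return false;
  return entered.trim().toUpperCase() === expected.toUpperCase();
}

const code = makeCaptchaCode(5);
console.log(code, captchaMatches(code, " " + code.toLowerCase() + " "));
```

Only when captchaMatches returns true would the comment be written to the database; the stored code should be discarded after one attempt so that it cannot be replayed.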

I cribbed the PHP captcha code from http://www.white-hat-web-design.co.uk/articles/php-captcha.php and it worked right out of the box with the minor exception that the form HTML didn't quite pass XHTML muster; easily fixed. (I have subsequently discovered that PHP security and sessions don't play well together; this problem remains unresolved and I've had to turn off captcha processing for my secure pages.)

I implemented a number of other spam countermeasures before I got around to captcha, which involved noticing what the spammers did and writing code to frustrate it. I am constantly on the lookout for new security techniques to implement.

Javascript: document.write and XHTML

For reasons that make no sense to me, the Javascript command document.write does not work when your page is rendered properly in XHTML (as described elsewhere in this Topic).

I have searched the web in vain to find a Javascript solution. Many are offered but none work worth a damn.

So don't bother. Use PHP's echo function. It works perfectly.

Open a new window with XHTML

Once upon a time you could say

<a href="link" target="_blank">Click to open a new window</a>

and a new window would open. Highly annoying if used very often, but sometimes it's the right thing to do.

And then XHTML came along and this is no longer legal. target="_blank" is "deprecated" without a single word as to what a poor developer is to substitute.

Here's what you do:

<a href="link" onclick="window.open(this.href); return false;">Click to open a new window</a>

Web Standards Validation

There are two primary aspects of a website that need validation:



1. (X)HTML

You can use the W3C's QA Markup Validation Service.
The URL to test the main page of Philadelphia Reflections is http://validator.w3.org/

Firefox has several useful add-ons for (X)HTML validation; one that uses Tidy is here: Html Validator

2. CSS

The W3C has a validation service for CSS, too.
For Philadelphia Reflections, the following URL checks all the CSS definitions in the main page: http://jigsaw.w3.org/css-validator/ (note: this validator is a little flaky: it produces different answers for the same file; you have to refresh a couple of times to get the whole story)

Firefox has several useful web developer add-on tools; try this one: Web Developer


Once you've gotten the HTML and CSS basics under control, there are other aspects of your site that you will want to validate:

Broken Links

The W3C will check all your links for both response time and validity.
http://validator.w3.org/checklink/checklink

Tidy

There is an absolutely lovely program called HTML Tidy, originally written by Dave Raggett and described by the W3C here: http://www.w3.org/People/Raggett/tidy/

Calls to Tidy are available in some newer versions of PHP (sadly, not the one we are using). However, on Windows (only) versions of Firefox and Mozilla, you can download an extension that provides all the Tidy functions in your browser: https://addons.mozilla.org/firefox/249/. This is a fantastic feature that I use all the time.

Syndication XML Validation

Validating RSS and Atom files is greatly facilitated by http://feedvalidator.org/. It has a number of quirks, the worst of which is a length limitation that our feeds exceed. Since all the feed aggregators use this facility and reject any feed it does not validate, we have to provide "short" syndication files.

Google Sitemap Validation

If you submit a sitemap to Google through their Webmaster Tools facility they will validate your sitemap when they load it. An external validation tool is available here: Validome Google Sitemap(s) Validator

Yahoo and Microsoft have agreed to support Google's Sitemap protocol and to support the inclusion of the line "Sitemap: http://www.philadelphia-reflections.com/sitemap.xml" in robots.txt. If other search engines adopt this facility it will make it much easier to get into the world's many search engines ... they'll pick up this line instead of us having to hunt them down.

Meta Tag Validator

As you puzzle over the mysteries of search-engine indexing, you'll want to check your meta tags: http://www.widexl.com/remote/search-engines/metatag-analyzer.html

gzip Compression & Headers

When you start getting really fancy and want to include automatic gzip compression, you'll want to see it in action and you'll want to check out all of your HTTP headers: http://www.gidnetwork.com/tools/gzip-test.php

Response Time

Of course, the reason you're experimenting with gzip is that you're concerned about response time.
(1) Try this site for a detailed analysis: http://www.websitepulse.com/help/tools.php
(2) Firefox to the rescue again: FasterFox is another lovely add-on: https://addons.mozilla.org/firefox/1269/

Geo Tags

Check the validity of your geo tags here: Geo-Tag Validator

Big List

http://uitest.com/en/analysis/ is the mother of all lists of validation routines.

Call KML files from within a blog or topic

Here's how to create a button in your blogs or topics that calls KML or KMZ files you create. In the Modify A Blog utility you can include a KML or KMZ file, but to call it explicitly from within the blog, you can create a button as shown here.

  1. Create and save your KML file.
  2. FTP the file to the kml folder in Philadelphia Reflections
  3. Use the following code in a blog or topic to call the kml file


<button onclick="location.href='http://www.philadelphia-reflections.com/kml-read.php?file=Franklin.kmz'">Button Label</button>

This code creates a button labeled "Button Label".



Instead of Franklin.kmz, put any .kml or .kmz file that is in the kml folder:
http://www.philadelphia-reflections.com/kml/
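
If you make many of these buttons, the markup can be generated rather than typed. The following is a small sketch, not something built into the site: the function names are made up, and the rule that file names may contain only letters, digits, dots, underscores and hyphens is an assumption added so that a stray quote cannot break the onclick attribute.

```javascript
// Address of the script that serves a file from the kml folder.
const KML_READER_URL = "http://www.philadelphia-reflections.com/kml-read.php?file=";

// Escape text that will appear between the <button> tags.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Return the HTML for a button that opens the given .kml or .kmz file.
function kmlButton(fileName, label) {
  // Assumed rule: plain file names only, ending in .kml or .kmz.
  if (!/^[A-Za-z0-9_.-]+\.(kml|kmz)$/i.test(fileName)) {
    throw new Error("Expected a plain .kml or .kmz file name, got: " + fileName);
  }
  return '<button onclick="location.href=\'' + KML_READER_URL + fileName + '\'">' +
    escapeHtml(label) + "</button>";
}

console.log(kmlButton("Franklin.kmz", "Button Label"));
```

Called with Franklin.kmz and "Button Label", it produces the same line of HTML shown above.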

Google Earth Tour of Franklin Locations

On the front page of Philadelphia Reflections there is a button which will download Google Earth and, if you follow the instructions in the left column, will give you a satellite tour of every bloglet on the site. At least, it will when we get it finished; it's only about half complete at present. If you are unfamiliar with this approach, we suggest you download the Earth program from the Google site and get acquainted by locating your own house, or Independence Hall, or the Vatican.

In addition, every Topic (listed in the left-hand column of the front page of Philadelphia Reflections) will contain a button which generates a tour of the geoTags of that particular Topic, provided there are three or more such tags. You will generally get the best results from tours developed by unknown authors if you turn off ALL of the layers provided in the lower section of the left-hand panel of Google Earth, although you might turn them on, one at a time, if you want to enhance the effects. Generally speaking, the route of Interstate 95 seems a little out of place in the wanderings of Ben Franklin.

"Take a satellite tour of nearly every place Ben Franklin ever visited."
Bob Florig

You should also become familiar with KMZ files and KML files. Keyhole ma