PHILADELPHIA REFLECTIONS
The musings of a Philadelphia Physician who has served the community for nearly six decades

Related Topics

Website Development
The website technology supporting Philadelphia Reflections is PHP, MySQL and DHTML. The web hosting service is Internet Planners.

Computers and Websites
Much of the early development of the electronic computer took place in Philadelphia. We lost the lead, but it might return.

XHTML vs. HTML

The markup language used by web browsers continues to evolve. The most current version (as of August 2006) is XHTML 1.1, an XML version of HTML.

Many browsers, most particularly IE, do not support XHTML. Technically speaking, they support only the "text/html" MIME type, not "application/xhtml+xml". Lots of web developers have gone to the trouble of adding self-closing slashes ( />) to their BR, HR, META and INPUT tags and putting an XHTML DOCTYPE at the top, but then serve the result as "text/html", so the browser parses it as HTML anyway.

This produces a syntactic mishmash which is almost certainly worse than using strict HTML 4.01.

Why "worse"? Because of the possibility of unintended results from providing incorrect instructions to the browser. If you care about the output produced by the browser, which most developers and content providers emphatically do, then you have to be careful about what instructions you give the browser. You simply cannot count on getting what you want if what you're telling the browser to do is syntactically incorrect.
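
To make the mismatch concrete, here is a minimal sketch of turning XHTML-style empty tags back into their HTML 4.01 form before a page goes out as "text/html". The function name xhtml_void_to_html is hypothetical and is not part of this site's code, and the sketch only handles the four tags named above.

```php
<?php
// Hypothetical helper (an illustration, not this site's code): rewrite
// XHTML-style empty-element tags such as <br /> into the HTML 4.01 form <br>.
function xhtml_void_to_html(string $markup): string
{
    // Only the four tags named in the text are touched, so a literal " />"
    // anywhere else in the page is left alone.
    return preg_replace('/<(br|hr|meta|input)\b([^>]*?)\s*\/>/i', '<$1$2>', $markup);
}

echo xhtml_void_to_html('<p>line one<br />line two</p><hr />'), "\n";
// prints: <p>line one<br>line two</p><hr>
```

Restricting the rewrite to known tag names is a design choice: a blanket search-and-replace on " />" is shorter, but it would also alter that character sequence if it ever appeared inside ordinary text or a script.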

However, it's a little difficult to see just what good XHTML is:

  • There are rumors that it renders the non-image portion of a page as much as 50% faster than HTML, but what with gzip and broadband being pretty common these days, it's hard to see that as an especially compelling reason to be bothered.
  • Furthermore, those browsers that do render XHTML (Mozilla, Firefox) are very picky about syntax: a single well-formedness error makes them show an XML parsing error instead of the page.
  • And the claim that XHTML is the way to get your web pages onto cell phones and toaster ovens leaves me cold. It's just not believable that the format required for these special devices will be the same as for a computer monitor.

Internet cognoscenti speak disparagingly of "tag soup" but the Internet is a lot more about content than it is about syntax, so who really cares?

Well, somehow, I do. A little. Since we use PHP on this site, we have the opportunity to inspect each request, figure out what features the browser supports, and send the matching tags, MIME type, etc.

Check out the headers and the page source in Mozilla to see it in action:

  1. It serves XHTML 1.1 with the correct MIME type whenever it encounters a browser that says it can accept it
  2. It uses output buffering (which demonstrably if illogically improves rendering response time)
  3. It sends the whole thing using gzip compression if the browser will support it
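
Points 2 and 3 can be tried in isolation from the command line. What follows is a rough sketch under a few assumptions: client_accepts_gzip and encode_page are hypothetical names that do not appear in the site's code, the zlib extension is assumed to be loaded, and q-values in the Accept-Encoding header are ignored for brevity.

```php
<?php
// Hypothetical helper (not part of the site's listing): does the client's
// Accept-Encoding header mention gzip? q-values are ignored for brevity.
function client_accepts_gzip(?string $acceptEncoding): bool
{
    if ($acceptEncoding === null) {
        return false; // the header was not sent at all
    }
    foreach (explode(',', $acceptEncoding) as $token) {
        // Drop parameters such as ";q=0.5" and compare only the coding name
        $name = strtolower(trim(explode(';', $token)[0]));
        if ($name === 'gzip' || $name === 'x-gzip') {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: returns [Content-Encoding value or null, body bytes].
function encode_page(string $page, ?string $acceptEncoding): array
{
    if (client_accepts_gzip($acceptEncoding)) {
        return ['gzip', gzencode($page, 6)];
    }
    return [null, $page];
}

// Collect the page in an output buffer instead of sending it piecemeal
ob_start();
echo "<p>Hello, Philadelphia</p>\n";
$page = ob_get_clean();

[$encoding, $body] = encode_page($page, 'gzip, deflate');

// In a real request one would now send header("Content-Encoding: gzip")
// when $encoding is 'gzip', and then echo $body.
var_dump($encoding, gzdecode($body) === $page);
```

Keeping the decision in a function that takes the header value as an argument, rather than reading $_SERVER directly, makes it possible to check the behavior without a web server.
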
<?php

#
# Both http://www.workingwith.me.uk/articles/scripting/mimetypes
# and http://keystonewebsites.com/articles/mime_type.php
#
# ... show how to serve the correct mime type and HTML prologue
#
# ... I prefer http://www.workingwith.me.uk/articles/scripting/mimetypes
#     because it serves XHTML 1.1 instead of XHTML 1.0 Transitional
#     and HTML 4.01 Loose
#
# http://www.hixie.ch/advocacy/xhtml Discusses using the wrong mime type.
#
# http://www.goer.org/ is a very interesting site on this subject
#
#
# I also include the privacy header created for me by http://www.p3pwiz.com/
# and I modified the fix_code function to include gzip.
#
# It is possible that we can get the server to do all our compression for us automatically.
# At this point I have not tested this but here are two references:
#
# http://elliottback.com/wp/archives/2006/01/12/http-gzip-compression-in-php/
# http://lists.evolt.org/archive/Week-of-Mon-20050228/170256.html
#

$charset = "iso-8859-1";
$mime = "text/html";

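function fix_code($buffer)
	{
	#
	# I modified this for gzip
	#
	# convert XHTML-style " />" tag endings to the HTML 4.01 form ">"
	$html = str_replace(" />", ">", $buffer);

	# HTTP_ACCEPT_ENCODING is not set when the client sends no Accept-Encoding header
	$accept_encoding = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';

	if (substr_count($accept_encoding, 'gzip'))
		{
		header("Content-Encoding: gzip"); // tells the browser to un-gzip the output
		return (gzencode($html,6,FORCE_GZIP));
		}
	else
		{
		return ($html);
		}
	}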

# HTTP_ACCEPT is not set when the client sends no Accept header
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";

if(stristr($accept,"application/xhtml+xml")) {
   # if there's a Q value for "application/xhtml+xml" then also 
   # retrieve the Q value for "text/html"
   # (the pattern accepts any value from 0 to 1, e.g. 0, 0.05, 0.9, 1.0)
   if(preg_match("/application\/xhtml\+xml\s*;\s*q=([01](?:\.\d+)?)/i",
                 $accept, $matches)) {
      $xhtml_q = (float) $matches[1];
      if(preg_match("/text\/html\s*;\s*q=([01](?:\.\d+)?)/i",
                    $accept, $matches)) {
         $html_q = (float) $matches[1];
         # if the Q value for XHTML is non-zero and greater than or equal
         # to that for HTML then use the "application/xhtml+xml" mimetype
         if($xhtml_q > 0 && $xhtml_q >= $html_q) {
            $mime = "application/xhtml+xml";
         }
      }
   # if there was no Q value, then just use the 
   # "application/xhtml+xml" mimetype
   } else {
      $mime = "application/xhtml+xml";
   }
}

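# special check for the W3C_Validator
# (HTTP_USER_AGENT is not set when the client sends no User-Agent header)
if (isset($_SERVER["HTTP_USER_AGENT"]) &&
    stristr($_SERVER["HTTP_USER_AGENT"],"W3C_Validator")) {
   $mime = "application/xhtml+xml";
}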

# set the prolog_type according to the mime type which was determined
if($mime == "application/xhtml+xml")
	{
	#
	# I added this, G4
	#
	ob_start("ob_gzhandler");
	#
	#
	$prolog_type = "<?xml version='1.0' encoding='$charset' ?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>
<html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'>\n\n";
	}
	else
		{
		ob_start("fix_code");
		$prolog_type = "<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 
		'http://www.w3.org/TR/html4/strict.dtd'>
		<html lang='en'>\n\n";
		}

# finally, output the mime type and prolog type
header("Content-Type: $mime;charset=$charset");
header("Vary: Accept");

// privacy header created at http://www.p3pwiz.com/
header("P3P: policyref=\"http://www.philadelphia-reflections.com/w3c/p3p.xml\", " .
       "CP=\"NID DSP NOI COR\"");

print $prolog_type;
?>
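
The negotiation step in the listing above can also be written as a pair of small functions that are easier to test on their own. This is only a sketch: accept_q and choose_mime are hypothetical names, not part of the site's code; wildcard ranges such as */* are ignored; and, unlike the listing, a browser that lists "application/xhtml+xml" but never mentions "text/html" is sent XHTML.

```php
<?php
// Hypothetical helper: the q-value a client assigns to one media type in an
// Accept header, or null when that exact type is not listed.
function accept_q(string $accept, string $type): ?float
{
    foreach (explode(',', $accept) as $range) {
        // "text/html ; q=0.9" becomes ["text/html", "q=0.9"]
        $params = array_map('trim', explode(';', $range));
        if (strcasecmp(array_shift($params), $type) !== 0) {
            continue;
        }
        foreach ($params as $param) {
            if (preg_match('/^q\s*=\s*([01](?:\.\d{0,3})?)$/i', $param, $m)) {
                return (float) $m[1];
            }
        }
        return 1.0; // a listed type with no q parameter defaults to 1
    }
    return null;
}

// Hypothetical helper: pick the MIME type to serve for a given Accept header.
function choose_mime(string $accept): string
{
    $xhtml = accept_q($accept, 'application/xhtml+xml');
    if ($xhtml === null || $xhtml <= 0.0) {
        return 'text/html'; // not listed, or explicitly refused with q=0
    }
    $html = accept_q($accept, 'text/html') ?? 0.0;
    return $xhtml >= $html ? 'application/xhtml+xml' : 'text/html';
}

// An Accept header without application/xhtml+xml falls back to text/html
echo choose_mime('image/gif, image/jpeg, */*'), "\n";
// XHTML listed without a q-value (so 1.0) beats text/html;q=0.9
echo choose_mime('application/xhtml+xml,text/html;q=0.9,*/*;q=0.5'), "\n";
```

Splitting the header on commas and semicolons, rather than matching it with one regular expression, means that a q-value is always read from the same media range it belongs to.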

Here's an interesting article on Doctype Switching: http://gutfeldt.ch/matthias/articles/doctypeswitch.html

The Philadelphia Reflections webmaster: George IV

(my thanks to http://centricle.com/tools/html-entities/ for HTML encoding)

2007 PS:
It turns out that Firefox has a number of intolerable quirks in the way it displays pages presented to it using XHTML 1.1 and the application/xhtml+xml mime type. I was unable to figure out a satisfactory way of circumventing these bugs and so I have reverted to XHTML 1.0 Strict and the text/html mime type, which solves all the problems but annoys me quite a lot.
