PHILADELPHIA REFLECTIONS
Musings of a Philadelphia Physician who has served the community for six decades


Related Topics

Website Development
The website technology supporting Philadelphia Reflections is PHP, MySQL and DHTML. The web hosting service is Internet Planners. The development of this website has provided an opportunity to learn new technology, to try out different techniques for getting noticed by the search engines, and to endure the trials and tribulations of dealing with malicious hackers and spammers, who range from the annoying to the abusive. This collection of articles documents some of our experiences, and we hope that people surfing the web for solutions to problems we've encountered will benefit.

Ampersand Madness: Convert & to &amp; to prevent XHTML errors

The whole subject of "encoding" gives me a headache.



Encoding In General

The first thing you have to know is: what is HTML encoding ... so look here:
http://htmlhelp.com/reference/html40/entities/
or here:
http://www.cookwood.com/html/extras/entities.html

(These are HTML encodings; URL encoding is something else again ... look here:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm)
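
To make the difference between the two kinds of encoding concrete, here is a minimal sketch using PHP's built-in htmlspecialchars() and rawurlencode(). The sample text is made up for illustration and does not come from this site.

```php
<?php
// Hypothetical sample text, not taken from the site.
$text = 'Fish & Chips <cheap>';

// HTML encoding: characters that are special in markup become entities.
$html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// URL encoding: characters that are unsafe in a URL become %XX escapes.
$url = rawurlencode($text);

echo $html, "\n"; // Fish &amp; Chips &lt;cheap&gt;
echo $url, "\n";  // Fish%20%26%20Chips%20%3Ccheap%3E
```

The same ampersand comes out as &amp; in one case and %26 in the other, so the two encodings are not interchangeable.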

Ampersand Encoding and Conversion

Later on, you'll find out that the ampersand is a huge source of XHTML errors because it has to be written

  • &amp;
    or
  • &#38;
    or
  • &#x26;

but you will struggle endlessly with how to get the darn thing to stay converted. First of all, content providers feel justifiably justified in including bare naked "&"s wherever they please; second of all, you will find that encoded ampersands get stripped back to their bare naked selves by browsers and other well-meaning sorts.
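
Part of the struggle is that the obvious fix makes things worse. The sketch below, on a made-up input string, shows what seems to happen when every "&" is encoded without checking whether it is already part of an entity.

```php
<?php
// Hypothetical input: one bare ampersand and one that is already encoded.
$input = 'Tom & Jerry &amp; friends';

// Naive approach: encode every "&" regardless of context.
$naive = str_replace('&', '&amp;', $input);

echo $naive, "\n"; // Tom &amp; Jerry &amp;amp; friends
```

The already-encoded ampersand is encoded a second time, so a browser would display a literal "&amp;" on the page. Any workable conversion has to leave existing entities alone.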

So, my undying thanks to Michael Ash's Regex Blog for providing the regex pattern in the following bit of PHP code:


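// Match any "&" that does not already begin an entity or character reference
$pattern = '/&(?!(?i:\#((x([\dA-F]){1,5})|(104857[0-5]|10485[0-6]\d|1048[0-4]\d\d|104[0-7]\d{3}|10[0-3]\d{4}|0?\d{1,6}))|([A-Za-z\d.]{2,31}));)/i';

// Replace each bare ampersand with its encoded form
$replacement = '&amp;';

$string = preg_replace($pattern, $replacement, $string);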

I don't know how it can possibly work, and I may yet eat my words, but for the moment it seems to do the trick.
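
As far as I can tell, the pattern relies on a negative lookahead: it matches an "&" only when what follows is not "#" plus decimal digits, "#x" plus up to five hex digits, or a name of 2 to 31 characters, ending in a semicolon. The sketch below wraps the pattern in a function and runs it on a made-up string. The function name encode_bare_ampersands and the sample input are hypothetical, and '&amp;' as the replacement is an assumption.

```php
<?php
// Hypothetical wrapper name; the pattern is the one quoted above.
function encode_bare_ampersands(string $text): string
{
    $pattern = '/&(?!(?i:\#((x([\dA-F]){1,5})|(104857[0-5]|10485[0-6]\d|1048[0-4]\d\d|104[0-7]\d{3}|10[0-3]\d{4}|0?\d{1,6}))|([A-Za-z\d.]{2,31}));)/i';
    // Assumed replacement: the named entity for an ampersand.
    return preg_replace($pattern, '&amp;', $text);
}

// Hypothetical sample: two bare ampersands plus four already-encoded forms.
$input = 'AT&T &amp; R&D &#38; &#x26; &copy;';

$once  = encode_bare_ampersands($input);
$twice = encode_bare_ampersands($once);

echo $once, "\n";           // AT&amp;T &amp; R&amp;D &#38; &#x26; &copy;
var_dump($once === $twice); // bool(true)
```

Running the function twice gives the same result as running it once, which is the property that plain search-and-replace lacks. One caveat: the pattern only checks the shape of what follows the "&", so a bare ampersand that happens to be followed by letters and a semicolon (say "AT&T;") would be left unencoded.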

Ampersand Encoding In RSS

Another thing: & is the only ampersand encoding form acceptable to both RSS and Atom. So, look at the source of this page and you will find that I use this encoding in the title ... that's because the title goes into the Title field of my RSS and Atom feeds.


