Parsing name-value pair attributes in an HTML tag
Attributes in an HTML tag can appear in any order, and many of them are optional, so a pattern that expects a fixed sequence will miss valid tags.
Here's a regex solution:
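<?php
// Callback for preg_replace_callback(): prints the captured groups.
function tagAttr($matches) {
    print_r($matches);
    return $matches[0]; // hand the tag back unchanged; returning nothing would delete it
}
$string = '<img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />';
// Each listed attribute has its own capture group; anything else in the tag
// is consumed by [^\s>]+ or \s+ without being captured.
$foo = preg_replace_callback(
'/<img\b(?>\s+(?:alt="([^"]*)"|class="([^"]*)"|style="([^"]*)"|src="([^"]*)"|height="([^"]*)"|width="([^"]*)")|[^\s>]+|\s+)*>/i',
"tagAttr",
$string);
?>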
This prints the following (groups 3 and 5 are empty because the tag has no style or height attribute):
Array
(
    [0] => <img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />
    [1] => alt keywords
    [2] => left
    [3] =>
    [4] => /images/picture.jpg
    [5] =>
    [6] => 300
)
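The numbered groups are easier to use once they are labelled. The following is a minimal sketch, assuming a hypothetical helper labelAttrs() that is not part of the original snippet, and the same pattern with the same order of alternatives. It calls preg_match() rather than preg_replace_callback() because nothing is being replaced. Note that an attribute that is present but empty, such as alt="", is treated the same as a missing one.

```php
<?php
// Hypothetical helper: label the numbered captures from the <img> pattern.
// The order of $names must mirror the order of the alternatives in the regex.
function labelAttrs(array $matches): array {
    $names = ['alt', 'class', 'style', 'src', 'height', 'width'];
    $attrs = [];
    foreach ($names as $i => $name) {
        // Group numbers start at 1; trailing unmatched groups may be absent.
        $value = $matches[$i + 1] ?? '';
        if ($value !== '') {
            $attrs[$name] = $value;
        }
    }
    return $attrs;
}

$pattern = '/<img\b(?>\s+(?:alt="([^"]*)"|class="([^"]*)"|style="([^"]*)"|src="([^"]*)"|height="([^"]*)"|width="([^"]*)")|[^\s>]+|\s+)*>/i';
$string = '<img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />';

preg_match($pattern, $string, $matches);
$attrs = labelAttrs($matches);
print_r($attrs); // associative array keyed by attribute name
```

The result contains only the attributes found in the tag, so a caller can test for one with isset($attrs['style']) instead of remembering that style is group 3.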
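The regex is a series of alternatives, one per attribute, so to select an additional attribute, add an alternative such as href="([^"]*)"| in front of alt="([^"]*)". Capture groups are numbered left to right, so an alternative added at the front becomes group 1 and shifts every existing group up by one; to keep the existing numbers, append |href="([^"]*)" after width="([^"]*)" instead. Note that only double-quoted attribute values are matched.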
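Named groups are one way to add attributes without tracking group numbers, which change whenever an alternative is inserted ahead of existing ones. The following is a sketch under assumptions: imgAttrPattern() is a hypothetical helper, not part of the original snippet, and it only suits attribute names made of letters, digits, and underscores, since those are the characters PCRE allows in a group name.

```php
<?php
// Hypothetical helper: build the <img> pattern from a list of attribute names.
// Each capture is a named group, so adding a name does not renumber the others.
function imgAttrPattern(array $names): string {
    $alternatives = [];
    foreach ($names as $name) {
        // (?<name>...) is PCRE named-group syntax.
        $alternatives[] = preg_quote($name, '/') . '="(?<' . $name . '>[^"]*)"';
    }
    return '/<img\b(?>\s+(?:' . implode('|', $alternatives) . ')|[^\s>]+|\s+)*>/i';
}

// href is placed first here, as in the text; its position no longer matters.
$names = ['href', 'alt', 'class', 'style', 'src', 'height', 'width'];
$string = '<img src="/images/picture.jpg" width="300" class="left" alt="alt keywords" />';

preg_match(imgAttrPattern($names), $string, $matches);

// Keep only the attributes that were actually present and non-empty.
$attrs = [];
foreach ($names as $name) {
    if (($matches[$name] ?? '') !== '') {
        $attrs[$name] = $matches[$name];
    }
}
print_r($attrs);
```

With this approach, selecting another attribute means adding one entry to $names, and existing code that reads $matches['src'] or $matches['alt'] keeps working.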
My thanks (a) to Flagrant Badassery for putting me onto the idea and (b) to http://centricle.com/tools/html-entities/ for the HTML encoding.