[html] Regular expression for extracting tag attributes

splattne,

@VonC's solution partly works, but it has a problem when the tag has a mix of quoted and unquoted attributes.

This one works with mixed attributes:

$pat_attributes = "(\S+)=(\"|'| |)(.*)(\"|'| |>)";
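As a rough illustration of what each of the four capture groups picks up, here is a minimal sketch that runs the pattern on a single tag. The sample tag and the variable names `$tag` and `$sets` are made up for this example; `PREG_SET_ORDER` is used only so that each match arrives as one array.

```php
<?php
// Same pattern as above. The U modifier used below makes the quantifiers lazy.
$pat_attributes = "(\S+)=(\"|'| |)(.*)(\"|'| |>)";

// Hypothetical sample tag, for illustration only.
$tag = '<a href=test.html class="xyz">';

preg_match_all("@$pat_attributes@isU", $tag, $sets, PREG_SET_ORDER);

foreach ($sets as $m) {
    // $m[1] = attribute name, $m[2] = opening quote (may be empty),
    // $m[3] = attribute value, $m[4] = closing delimiter (quote, space or >)
    printf("name=%s open=[%s] value=%s close=[%s]\n", $m[1], $m[2], $m[3], $m[4]);
}
```

One caveat: because a space is one of the closing alternatives, a quoted value that itself contains a space (for example alt="a b") is cut off at the first space.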

To test it out:

<?php
$pat_attributes = "(\S+)=(\"|'| |)(.*)(\"|'| |>)";

$code = '    <IMG title=09.jpg alt=09.jpg src="http://example.com.jpg?v=185579" border=0 mce_src="example.com.jpg?v=185579"
    ';

preg_match_all( "@$pat_attributes@isU", $code, $ms);
var_dump( $ms );

$code = '
<a href=test.html class=xyz>
<a href="test.html" class="xyz">
<a href=\'test.html\' class="xyz">
<img src="http://"/>      ';

preg_match_all( "@$pat_attributes@isU", $code, $ms);

var_dump( $ms );

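$ms then contains the attribute names in its 2nd element ($ms[1]) and the values in its 4th element ($ms[3]); the 3rd element ($ms[2]) holds only the opening quote, if there is one.

$keys = $ms[1];
$values = $ms[3];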
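If a name => value map is more convenient than two parallel arrays, the names and values can be paired up with array_combine. The following is only a sketch: the helper name `attributes_to_map` is hypothetical, it assumes the input is a single tag (across several tags, repeated names would overwrite each other), and it takes the value from capture group 3, which is where the pattern's group order puts it.

```php
<?php
// Hypothetical helper: returns the attributes of ONE tag as name => value.
function attributes_to_map($tag)
{
    $pat_attributes = "(\S+)=(\"|'| |)(.*)(\"|'| |>)";
    preg_match_all("@$pat_attributes@isU", $tag, $ms);

    // Group 1 holds the names, group 3 the values (group 2 is the opening quote).
    return array_combine($ms[1], $ms[3]);
}

var_dump(attributes_to_map('<a href=test.html class="xyz">'));
```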