[php] Best XML Parser for PHP

I have used the XML Parser before, and even though it worked OK, I wasn't happy with it in general, it felt like I was using workarounds for things that should be basic functionality.

I recently saw SimpleXML but I haven't tried it yet. Is it any simpler? What advantages and disadvantages do both have? Any other parsers you've used?

This question is related to php xml parsing xml-parsing

The answer is


It depends on what you are trying to do with the XML files. If you are just trying to read the XML file (like a configuration file), The Wicked Flea is correct in suggesting SimpleXML since it creates what amounts to nested ArrayObjects. e.g. value will be accessible by $xml->root->child.

If you are looking to manipulate the XML files you're probably best off using DOM XML


This is a useful function for quick and easy xml parsing when an extension is not available:

<?php
/**
 * Convert XML to an Array
 *
 * @param string  $XML
 * @return array
 */
function XMLtoArray($XML)
{
    $xml_parser = xml_parser_create();
    xml_parse_into_struct($xml_parser, $XML, $vals);
    xml_parser_free($xml_parser);
    // wyznaczamy tablice z powtarzajacymi sie tagami na tym samym poziomie
    $_tmp='';
    foreach ($vals as $xml_elem) {
        $x_tag=$xml_elem['tag'];
        $x_level=$xml_elem['level'];
        $x_type=$xml_elem['type'];
        if ($x_level!=1 && $x_type == 'close') {
            if (isset($multi_key[$x_tag][$x_level]))
                $multi_key[$x_tag][$x_level]=1;
            else
                $multi_key[$x_tag][$x_level]=0;
        }
        if ($x_level!=1 && $x_type == 'complete') {
            if ($_tmp==$x_tag)
                $multi_key[$x_tag][$x_level]=1;
            $_tmp=$x_tag;
        }
    }
    // jedziemy po tablicy
    foreach ($vals as $xml_elem) {
        $x_tag=$xml_elem['tag'];
        $x_level=$xml_elem['level'];
        $x_type=$xml_elem['type'];
        if ($x_type == 'open')
            $level[$x_level] = $x_tag;
        $start_level = 1;
        $php_stmt = '$xml_array';
        if ($x_type=='close' && $x_level!=1)
            $multi_key[$x_tag][$x_level]++;
        while ($start_level < $x_level) {
            $php_stmt .= '[$level['.$start_level.']]';
            if (isset($multi_key[$level[$start_level]][$start_level]) && $multi_key[$level[$start_level]][$start_level])
                $php_stmt .= '['.($multi_key[$level[$start_level]][$start_level]-1).']';
            $start_level++;
        }
        $add='';
        if (isset($multi_key[$x_tag][$x_level]) && $multi_key[$x_tag][$x_level] && ($x_type=='open' || $x_type=='complete')) {
            if (!isset($multi_key2[$x_tag][$x_level]))
                $multi_key2[$x_tag][$x_level]=0;
            else
                $multi_key2[$x_tag][$x_level]++;
            $add='['.$multi_key2[$x_tag][$x_level].']';
        }
        if (isset($xml_elem['value']) && trim($xml_elem['value'])!='' && !array_key_exists('attributes', $xml_elem)) {
            if ($x_type == 'open')
                $php_stmt_main=$php_stmt.'[$x_type]'.$add.'[\'content\'] = $xml_elem[\'value\'];';
            else
                $php_stmt_main=$php_stmt.'[$x_tag]'.$add.' = $xml_elem[\'value\'];';
            eval($php_stmt_main);
        }
        if (array_key_exists('attributes', $xml_elem)) {
            if (isset($xml_elem['value'])) {
                $php_stmt_main=$php_stmt.'[$x_tag]'.$add.'[\'content\'] = $xml_elem[\'value\'];';
                eval($php_stmt_main);
            }
            foreach ($xml_elem['attributes'] as $key=>$value) {
                $php_stmt_att=$php_stmt.'[$x_tag]'.$add.'[$key] = $value;';
                eval($php_stmt_att);
            }
        }
    }
    return $xml_array;
}
?>

Hi I think the SimpleXml is very useful . And with it I am using xpath;

$xml = simplexml_load_file("som_xml.xml");

$blocks  = $xml->xpath('//block'); //gets all <block/> tags
$blocks2 = $xml->xpath('//layout/block'); //gets all <block/> which parent are   <layout/>  tags

I use many xml configs and this helps me to parse them really fast. SimpleXml is written on C so it's very fast.


It depends on what you are trying to do with the XML files. If you are just trying to read the XML file (like a configuration file), The Wicked Flea is correct in suggesting SimpleXML since it creates what amounts to nested ArrayObjects. e.g. value will be accessible by $xml->root->child.

If you are looking to manipulate the XML files you're probably best off using DOM XML


Hi I think the SimpleXml is very useful . And with it I am using xpath;

$xml = simplexml_load_file("som_xml.xml");

$blocks  = $xml->xpath('//block'); //gets all <block/> tags
$blocks2 = $xml->xpath('//layout/block'); //gets all <block/> which parent are   <layout/>  tags

I use many xml configs and this helps me to parse them really fast. SimpleXml is written on C so it's very fast.


the crxml parser is a real easy to parser.

This class has got a search function, which takes a node name with any namespace as an argument. It searches the xml for the node and prints out the access statement to access that node using this class. This class also makes xml generation very easy.

you can download this class at

http://freshmeat.net/projects/crxml

or from phpclasses.org

http://www.phpclasses.org/package/6769-PHP-Manipulate-XML-documents-as-array.html


It depends on what you are trying to do with the XML files. If you are just trying to read the XML file (like a configuration file), The Wicked Flea is correct in suggesting SimpleXML since it creates what amounts to nested ArrayObjects. e.g. value will be accessible by $xml->root->child.

If you are looking to manipulate the XML files you're probably best off using DOM XML


Have a look at PHP's available XML extensions.

The main difference between XML Parser and SimpleXML is that the latter is not a pull parser. SimpleXML is built on top of the DOM extensions and will load the entire XML file into memory. XML Parser like XMLReader will only load the current node into memory. You define handlers for specific nodes which will get triggered when the Parser encounters it. That is faster and saves on memory. You pay for that with not being able to use XPath.

Personally, I find SimpleXml quite limiting (hence simple) in what it offers over DOM. You can switch between DOM and SimpleXml easily though, but I usually dont bother and go the DOM route directly. DOM is an implementation of the W3C DOM API, so you might be familiar with it from other languages, for instance JavaScript.


Have a look at PHP's available XML extensions.

The main difference between XML Parser and SimpleXML is that the latter is not a pull parser. SimpleXML is built on top of the DOM extensions and will load the entire XML file into memory. XML Parser like XMLReader will only load the current node into memory. You define handlers for specific nodes which will get triggered when the Parser encounters it. That is faster and saves on memory. You pay for that with not being able to use XPath.

Personally, I find SimpleXml quite limiting (hence simple) in what it offers over DOM. You can switch between DOM and SimpleXml easily though, but I usually dont bother and go the DOM route directly. DOM is an implementation of the W3C DOM API, so you might be familiar with it from other languages, for instance JavaScript.


This is a useful function for quick and easy xml parsing when an extension is not available:

<?php
/**
 * Convert XML to an Array
 *
 * @param string  $XML
 * @return array
 */
function XMLtoArray($XML)
{
    $xml_parser = xml_parser_create();
    xml_parse_into_struct($xml_parser, $XML, $vals);
    xml_parser_free($xml_parser);
    // wyznaczamy tablice z powtarzajacymi sie tagami na tym samym poziomie
    $_tmp='';
    foreach ($vals as $xml_elem) {
        $x_tag=$xml_elem['tag'];
        $x_level=$xml_elem['level'];
        $x_type=$xml_elem['type'];
        if ($x_level!=1 && $x_type == 'close') {
            if (isset($multi_key[$x_tag][$x_level]))
                $multi_key[$x_tag][$x_level]=1;
            else
                $multi_key[$x_tag][$x_level]=0;
        }
        if ($x_level!=1 && $x_type == 'complete') {
            if ($_tmp==$x_tag)
                $multi_key[$x_tag][$x_level]=1;
            $_tmp=$x_tag;
        }
    }
    // jedziemy po tablicy
    foreach ($vals as $xml_elem) {
        $x_tag=$xml_elem['tag'];
        $x_level=$xml_elem['level'];
        $x_type=$xml_elem['type'];
        if ($x_type == 'open')
            $level[$x_level] = $x_tag;
        $start_level = 1;
        $php_stmt = '$xml_array';
        if ($x_type=='close' && $x_level!=1)
            $multi_key[$x_tag][$x_level]++;
        while ($start_level < $x_level) {
            $php_stmt .= '[$level['.$start_level.']]';
            if (isset($multi_key[$level[$start_level]][$start_level]) && $multi_key[$level[$start_level]][$start_level])
                $php_stmt .= '['.($multi_key[$level[$start_level]][$start_level]-1).']';
            $start_level++;
        }
        $add='';
        if (isset($multi_key[$x_tag][$x_level]) && $multi_key[$x_tag][$x_level] && ($x_type=='open' || $x_type=='complete')) {
            if (!isset($multi_key2[$x_tag][$x_level]))
                $multi_key2[$x_tag][$x_level]=0;
            else
                $multi_key2[$x_tag][$x_level]++;
            $add='['.$multi_key2[$x_tag][$x_level].']';
        }
        if (isset($xml_elem['value']) && trim($xml_elem['value'])!='' && !array_key_exists('attributes', $xml_elem)) {
            if ($x_type == 'open')
                $php_stmt_main=$php_stmt.'[$x_type]'.$add.'[\'content\'] = $xml_elem[\'value\'];';
            else
                $php_stmt_main=$php_stmt.'[$x_tag]'.$add.' = $xml_elem[\'value\'];';
            eval($php_stmt_main);
        }
        if (array_key_exists('attributes', $xml_elem)) {
            if (isset($xml_elem['value'])) {
                $php_stmt_main=$php_stmt.'[$x_tag]'.$add.'[\'content\'] = $xml_elem[\'value\'];';
                eval($php_stmt_main);
            }
            foreach ($xml_elem['attributes'] as $key=>$value) {
                $php_stmt_att=$php_stmt.'[$x_tag]'.$add.'[$key] = $value;';
                eval($php_stmt_att);
            }
        }
    }
    return $xml_array;
}
?>

Examples related to php

I am receiving warning in Facebook Application using PHP SDK Pass PDO prepared statement to variables Parse error: syntax error, unexpected [ Preg_match backtrack error Removing "http://" from a string How do I hide the PHP explode delimiter from submitted form results? Problems with installation of Google App Engine SDK for php in OS X Laravel 4 with Sentry 2 add user to a group on Registration php & mysql query not echoing in html with tags? How do I show a message in the foreach loop?

Examples related to xml

strange error in my Animation Drawable How do I POST XML data to a webservice with Postman? PHP XML Extension: Not installed How to add a Hint in spinner in XML Generating Request/Response XML from a WSDL Manifest Merger failed with multiple errors in Android Studio How to set menu to Toolbar in Android How to add colored border on cardview? Android: ScrollView vs NestedScrollView WARNING: Exception encountered during context initialization - cancelling refresh attempt

Examples related to parsing

Got a NumberFormatException while trying to parse a text file for objects Uncaught SyntaxError: Unexpected end of JSON input at JSON.parse (<anonymous>) Python/Json:Expecting property name enclosed in double quotes Correctly Parsing JSON in Swift 3 How to get response as String using retrofit without using GSON or any other library in android UIButton action in table view cell "Expected BEGIN_OBJECT but was STRING at line 1 column 1" How to convert an XML file to nice pandas dataframe? How to extract multiple JSON objects from one file? How to sum digits of an integer in java?

Examples related to xml-parsing

How to create XML file with specific structure in Java jQuery xml error ' No 'Access-Control-Allow-Origin' header is present on the requested resource.' SyntaxError of Non-ASCII character xml.LoadData - Data at the root level is invalid. Line 1, position 1 XML Error: Extra content at the end of the document How to use sed to extract substring How to fix Invalid byte 1 of 1-byte UTF-8 sequence The best node module for XML parsing Parsing XML with namespace in Python via 'ElementTree' oracle plsql: how to parse XML and insert into table