I have the following XML file, the file is rather large and i haven't been able to get simplexml to open and read the file so i'm trying XMLReader with no success in php
<?xml version="1.0" encoding="ISO-8859-1"?>
<products>
<last_updated>2009-11-30 13:52:40</last_updated>
<product>
<element_1>foo</element_1>
<element_2>foo</element_2>
<element_3>foo</element_3>
<element_4>foo</element_4>
</product>
<product>
<element_1>bar</element_1>
<element_2>bar</element_2>
<element_3>bar</element_3>
<element_4>bar</element_4>
</product>
</products>
I've unfortunately not found a good tutorial on this for PHP and would love to see how I can get each element content to store in a database.
The accepted answer gave me a good start, but brought in more classes and more processing than I would have liked; so this is my interpretation:
$xml_reader = new XMLReader;
$xml_reader->open($feed_url);
// move the pointer to the first product
while ($xml_reader->read() && $xml_reader->name != 'product');
// loop through the products
while ($xml_reader->name == 'product')
{
// load the current xml element into simplexml and we’re off and running!
$xml = simplexml_load_string($xml_reader->readOuterXML());
// now you can use your simpleXML object ($xml).
echo $xml->element_1;
// move the pointer to the next product
$xml_reader->next('product');
}
// don’t forget to close the file
$xml_reader->close();
Most of my XML parsing life is spent extracting nuggets of useful information out of truckloads of XML (Amazon MWS). As such, my answer assumes you want only specific information and you know where it is located.
I find the easiest way to use XMLReader is to know which tags I want the information out of and use them. If you know the structure of the XML and it has lots of unique tags, I find that using the first case is the easy. Cases 2 and 3 are just to show you how it can be done for more complex tags. This is extremely fast; I have a discussion of speed over on What is the fastest XML parser in PHP?
The most important thing to remember when doing tag-based parsing like this is to use if ($myXML->nodeType == XMLReader::ELEMENT) {...
- which checks to be sure we're only dealing with opening nodes and not whitespace or closing nodes or whatever.
function parseMyXML ($xml) { //pass in an XML string
$myXML = new XMLReader();
$myXML->xml($xml);
while ($myXML->read()) { //start reading.
if ($myXML->nodeType == XMLReader::ELEMENT) { //only opening tags.
$tag = $myXML->name; //make $tag contain the name of the tag
switch ($tag) {
case 'Tag1': //this tag contains no child elements, only the content we need. And it's unique.
$variable = $myXML->readInnerXML(); //now variable contains the contents of tag1
break;
case 'Tag2': //this tag contains child elements, of which we only want one.
while($myXML->read()) { //so we tell it to keep reading
if ($myXML->nodeType == XMLReader::ELEMENT && $myXML->name === 'Amount') { // and when it finds the amount tag...
$variable2 = $myXML->readInnerXML(); //...put it in $variable2.
break;
}
}
break;
case 'Tag3': //tag3 also has children, which are not unique, but we need two of the children this time.
while($myXML->read()) {
if ($myXML->nodeType == XMLReader::ELEMENT && $myXML->name === 'Amount') {
$variable3 = $myXML->readInnerXML();
break;
} else if ($myXML->nodeType == XMLReader::ELEMENT && $myXML->name === 'Currency') {
$variable4 = $myXML->readInnerXML();
break;
}
}
break;
}
}
}
$myXML->close();
}
This Works Better and Faster For Me
<html>
<head>
<script>
function showRSS(str) {
if (str.length==0) {
document.getElementById("rssOutput").innerHTML="";
return;
}
if (window.XMLHttpRequest) {
// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp=new XMLHttpRequest();
} else { // code for IE6, IE5
xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.onreadystatechange=function() {
if (this.readyState==4 && this.status==200) {
document.getElementById("rssOutput").innerHTML=this.responseText;
}
}
xmlhttp.open("GET","getrss.php?q="+str,true);
xmlhttp.send();
}
</script>
</head>
<body>
<form>
<select onchange="showRSS(this.value)">
<option value="">Select an RSS-feed:</option>
<option value="Google">Google News</option>
<option value="ZDN">ZDNet News</option>
<option value="job">Job</option>
</select>
</form>
<br>
<div id="rssOutput">RSS-feed will be listed here...</div>
</body>
</html>
**The Backend File **
<?php
//get the q parameter from URL
$q=$_GET["q"];
//find out which feed was selected
if($q=="Google") {
$xml=("http://news.google.com/news?ned=us&topic=h&output=rss");
} elseif($q=="ZDN") {
$xml=("https://www.zdnet.com/news/rss.xml");
}elseif($q == "job"){
$xml=("https://ngcareers.com/feed");
}
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);
//get elements from "<channel>"
$channel=$xmlDoc->getElementsByTagName('channel')->item(0);
$channel_title = $channel->getElementsByTagName('title')
->item(0)->childNodes->item(0)->nodeValue;
$channel_link = $channel->getElementsByTagName('link')
->item(0)->childNodes->item(0)->nodeValue;
$channel_desc = $channel->getElementsByTagName('description')
->item(0)->childNodes->item(0)->nodeValue;
//output elements from "<channel>"
echo("<p><a href='" . $channel_link
. "'>" . $channel_title . "</a>");
echo("<br>");
echo($channel_desc . "</p>");
//get and output "<item>" elements
$x=$xmlDoc->getElementsByTagName('item');
$count = $x->length;
// print_r( $x->item(0)->getElementsByTagName('title')->item(0)->nodeValue);
// print_r( $x->item(0)->getElementsByTagName('link')->item(0)->nodeValue);
// print_r( $x->item(0)->getElementsByTagName('description')->item(0)->nodeValue);
// return;
for ($i=0; $i <= $count; $i++) {
//Title
$item_title = $x->item(0)->getElementsByTagName('title')->item(0)->nodeValue;
//Link
$item_link = $x->item(0)->getElementsByTagName('link')->item(0)->nodeValue;
//Description
$item_desc = $x->item(0)->getElementsByTagName('description')->item(0)->nodeValue;
//Category
$item_cat = $x->item(0)->getElementsByTagName('category')->item(0)->nodeValue;
echo ("<p>Title: <a href='" . $item_link
. "'>" . $item_title . "</a>");
echo ("<br>");
echo ("Desc: ".$item_desc);
echo ("<br>");
echo ("Category: ".$item_cat . "</p>");
}
?>
Simple example:
public function productsAction()
{
$saveFileName = 'ceneo.xml';
$filename = $this->path . $saveFileName;
if(file_exists($filename)) {
$reader = new XMLReader();
$reader->open($filename);
$countElements = 0;
while($reader->read()) {
if($reader->nodeType == XMLReader::ELEMENT) {
$nodeName = $reader->name;
}
if($reader->nodeType == XMLReader::TEXT && !empty($nodeName)) {
switch ($nodeName) {
case 'id':
var_dump($reader->value);
break;
}
}
if($reader->nodeType == XMLReader::END_ELEMENT && $reader->name == 'offer') {
$countElements++;
}
}
$reader->close();
exit(print('<pre>') . var_dump($countElements));
}
}
For xml formatted with attributes...
data.xml:
<building_data>
<building address="some address" lat="28.902914" lng="-71.007235" />
<building address="some address" lat="48.892342" lng="-75.0423423" />
<building address="some address" lat="58.929753" lng="-79.1236987" />
</building_data>
php code:
$reader = new XMLReader();
if (!$reader->open("data.xml")) {
die("Failed to open 'data.xml'");
}
while($reader->read()) {
if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'building') {
$address = $reader->getAttribute('address');
$latitude = $reader->getAttribute('lat');
$longitude = $reader->getAttribute('lng');
}
$reader->close();
XMLReader is well documented on PHP site. This is a XML Pull Parser, which means it's used to iterate through nodes (or DOM Nodes) of given XML document. For example, you could go through the entire document you gave like this:
<?php
$reader = new XMLReader();
if (!$reader->open("data.xml"))
{
die("Failed to open 'data.xml'");
}
while($reader->read())
{
$node = $reader->expand();
// process $node...
}
$reader->close();
?>
It is then up to you to decide how to deal with the node returned by XMLReader::expand().
Source: Stackoverflow.com