How to read XML using XPath in Java

Question

I want to read XML data using XPath in Java, but with the information I have gathered so far I am not able to parse the XML according to my requirement.

Here is what I want to do: get an XML file from online via its URL, then use XPath to parse it. I want to create two methods. In the first, I enter a specific node attribute id and get all the child nodes as the result. In the second, I want to get a specific child node value only.

    <?xml version="1.0"?>
    <howto>
      <topic name="Java">
          <url>http://www.rgagnonjavahowto.htm</url>
          <car>taxi</car>
      </topic>
      <topic name="PowerBuilder">
          <url>http://www.rgagnon/pbhowto.htm</url>
          <url>http://www.rgagnon/pbhowtonew.htm</url>
      </topic>
      <topic name="Javascript">
          <url>http://www.rgagnon/jshowto.htm</url>
      </topic>
      <topic name="VBScript">
          <url>http://www.rgagnon/vbshowto.htm</url>
      </topic>
    </howto>

In the above example I want to read all the elements if I search via @name, and I also want one function that returns just the url for @name 'Javascript', i.e. only one node element.

User · Answer

You can try this.

XML Document

Save as employees.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <Employees>
        <Employee id="1">
            <age>29</age>
            <name>Pankaj</name>
            <gender>Male</gender>
            <role>Java Developer</role>
        </Employee>
        <Employee id="2">
            <age>35</age>
            <name>Lisa</name>
            <gender>Female</gender>
            <role>CEO</role>
        </Employee>
        <Employee id="3">
            <age>40</age>
            <name>Tom</name>
            <gender>Male</gender>
            <role>Manager</role>
        </Employee>
        <Employee id="4">
            <age>25</age>
            <name>Meghan</name>
            <gender>Female</gender>
            <role>Manager</role>
        </Employee>
    </Employees>

Parser class

The class has the following methods:

- A method that returns the employee name for an input ID
- A method that returns the list of names of employees older than the input age
- A method that returns the list of names of female employees

Source Code

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    public class Parser {

        public static void main(String[] args) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder;
            Document doc = null;
            try {
                builder = factory.newDocumentBuilder();
                doc = builder.parse("employees.xml");

                // Create XPathFactory object
                XPathFactory xpathFactory = XPathFactory.newInstance();

                // Create XPath object
                XPath xpath = xpathFactory.newXPath();

                String name = getEmployeeNameById(doc, xpath, 4);
                System.out.println("Employee Name with ID 4: " + name);

                List<String> names = getEmployeeNameWithAge(doc, xpath, 30);
                System.out.println("Employees with 'age>30' are:" + Arrays.toString(names.toArray()));

                List<String> femaleEmps = getFemaleEmployeesName(doc, xpath);
                System.out.println("Female Employees names are:" +
                        Arrays.toString(femaleEmps.toArray()));

            } catch (ParserConfigurationException | SAXException | IOException e) {
                e.printStackTrace();
            }
        }

        private static List<String> getFemaleEmployeesName(Document doc, XPath xpath) {
            List<String> list = new ArrayList<>();
            try {
                // create XPathExpression object
                XPathExpression expr =
                    xpath.compile("/Employees/Employee[gender='Female']/name/text()");
                // evaluate expression result on XML document
                NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++)
                    list.add(nodes.item(i).getNodeValue());
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
            return list;
        }

        private static List<String> getEmployeeNameWithAge(Document doc, XPath xpath, int age) {
            List<String> list = new ArrayList<>();
            try {
                XPathExpression expr =
                    xpath.compile("/Employees/Employee[age>" + age + "]/name/text()");
                NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++)
                    list.add(nodes.item(i).getNodeValue());
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
            return list;
        }

        private static String getEmployeeNameById(Document doc, XPath xpath, int id) {
            String name = null;
            try {
                XPathExpression expr =
                    xpath.compile("/Employees/Employee[@id='" + id + "']/name/text()");
                name = (String) expr.evaluate(doc, XPathConstants.STRING);
            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }
            return name;
        }
    }
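The lookups above build each expression by string concatenation, which is fine for int arguments. When the value is a string, one alternative is to bind it as an XPath variable through `XPath.setXPathVariableResolver`, so the value never becomes part of the expression text. The following is only a sketch of that idea: the class name `EmployeeLookup`, the method `nameById`, and the shortened sample document are assumptions made for illustration, not part of the answer.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the answer above.
final class EmployeeLookup {

    // Shortened copy of employees.xml so the sketch runs without a file.
    static final String SAMPLE_XML =
        "<Employees>"
      + "<Employee id=\"1\"><age>29</age><name>Pankaj</name></Employee>"
      + "<Employee id=\"2\"><age>35</age><name>Lisa</name></Employee>"
      + "</Employees>";

    // Returns the name of the employee with the given id, or "" if nothing matches.
    static String nameById(String xml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // $id is looked up through the resolver when the expression is evaluated,
            // instead of being pasted into the expression string.
            xpath.setXPathVariableResolver(
                    qname -> "id".equals(qname.getLocalPart()) ? id : null);
            return (String) xpath.evaluate(
                    "/Employees/Employee[@id=$id]/name/text()", doc, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameById(SAMPLE_XML, "2"));
    }
}
```

With `XPathConstants.STRING`, an expression that matches nothing evaluates to an empty string rather than null, so callers can test for `isEmpty()`.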

User · Answer

Getting started example.

XML file:

    <inventory>
        <book year="2000">
            <title>Snow Crash</title>
            <author>Neal Stephenson</author>
            <publisher>Spectra</publisher>
            <isbn>0553380958</isbn>
            <price>14.95</price>
        </book>

        <book year="2005">
            <title>Burning Tower</title>
            <author>Larry Niven</author>
            <author>Jerry Pournelle</author>
            <publisher>Pocket</publisher>
            <isbn>0743416910</isbn>
            <price>5.99</price>
        </book>

        <book year="1995">
            <title>Zodiac</title>
            <author>Neal Stephenson</author>
            <publisher>Spectra</publisher>
            <isbn>0553573862</isbn>
            <price>7.50</price>
        </book>

        <!-- more books... -->

    </inventory>

Java code:

    import java.io.File;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

    public class ReadInventory {

        public static void main(String[] args) {
            try {
                DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
                Document doc = docBuilder.parse(new File("c:\\tmp\\my.xml"));

                // normalize text representation
                doc.getDocumentElement().normalize();
                System.out.println("Root element of the doc is " + doc.getDocumentElement().getNodeName());

                NodeList listOfBooks = doc.getElementsByTagName("book");
                int totalBooks = listOfBooks.getLength();
                System.out.println("Total no of books : " + totalBooks);

                for (int i = 0; i < listOfBooks.getLength(); i++) {
                    Node bookNode = listOfBooks.item(i);
                    if (bookNode.getNodeType() == Node.ELEMENT_NODE) {
                        Element bookElement = (Element) bookNode;
                        System.out.println("Year :" + bookElement.getAttribute("year"));

                        // the text of the first <title> child
                        NodeList titleList = bookElement.getElementsByTagName("title");
                        Element titleElement = (Element) titleList.item(0);
                        NodeList titleText = titleElement.getChildNodes();
                        System.out.println("title : " + titleText.item(0).getNodeValue().trim());
                    }
                } // end of for loop
            } catch (SAXParseException err) {
                System.out.println("** Parsing error" + ", line " + err.getLineNumber() + ", uri " + err.getSystemId());
                System.out.println(" " + err.getMessage());
            } catch (SAXException e) {
                Exception x = e.getException();
                ((x == null) ? e : x).printStackTrace();
            } catch (Throwable t) {
                t.printStackTrace();
            }
        }
    }

Note that this example walks the DOM with getElementsByTagName rather than using XPath.
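Since the question asks about XPath specifically, the same inventory document can also be queried with an expression instead of a manual DOM walk. The sketch below is one possible way to do that; the class name `InventoryTitles`, the method `select`, and the shortened inline document are assumptions introduced for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the answer above.
final class InventoryTitles {

    // Shortened copy of the inventory document so the sketch runs without a file.
    static final String SAMPLE_XML =
        "<inventory>"
      + "<book year=\"2000\"><title>Snow Crash</title></book>"
      + "<book year=\"2005\"><title>Burning Tower</title></book>"
      + "<book year=\"1995\"><title>Zodiac</title></book>"
      + "</inventory>";

    // Evaluates the expression and returns the trimmed text of every matching node.
    static List<String> select(String xml, String expression) {
        List<String> result = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // All titles, then only the titles of books with a year attribute above 2000.
        System.out.println(select(SAMPLE_XML, "/inventory/book/title"));
        System.out.println(select(SAMPLE_XML, "/inventory/book[@year > 2000]/title"));
    }
}
```

The predicate `[@year > 2000]` does the filtering that would otherwise need an `if` inside the loop, which is the main practical difference from the DOM traversal.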

User · Answer

If you have an XML document like the one below:

    <e:Envelope
        xmlns:d = "http://www.w3.org/2001/XMLSchema"
        xmlns:e = "http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:wn0 = "http://systinet.com/xsd/SchemaTypes/"
        xmlns:i = "http://www.w3.org/2001/XMLSchema-instance">
        <e:Header>
            <Friends>
                <friend>
                    <Name>Testabc</Name>
                    <Age>12121</Age>
                    <Phone>Testpqr</Phone>
                </friend>
            </Friends>
        </e:Header>
        <e:Body>
            <n0:ForAnsiHeaderOperResponse xmlns:n0 = "http://systinet.com/wsdl/com/magicsoftware/ibolt/localhost/ForAnsiHeader/ForAnsiHeaderImpl#ForAnsiHeaderOper?KExqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzs=">
                <response i:type = "d:string">12--abc--pqr</response>
            </n0:ForAnsiHeaderOperResponse>
        </e:Body>
    </e:Envelope>

and want to extract the XML below:

    <e:Header>
       <Friends>
          <friend>
             <Name>Testabc</Name>
             <Age>12121</Age>
             <Phone>Testpqr</Phone>
          </friend>
       </Friends>
    </e:Header>

the code below helps to achieve that:

    public static void main(String[] args) {

        File fXmlFile = new File("C:\\Users\\abhijitb\\Desktop\\Test.xml");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document document;
        Node result = null;
        try {
            document = dbf.newDocumentBuilder().parse(fXmlFile);
            XPath xPath = XPathFactory.newInstance().newXPath();
            String xpathStr = "//Envelope//Header";
            result = (Node) xPath.evaluate(xpathStr, document, XPathConstants.NODE);
            System.out.println(nodeToString(result));
        } catch (SAXException | IOException | ParserConfigurationException | XPathExpressionException
                | TransformerException e) {
            e.printStackTrace();
        }
    }

    private static String nodeToString(Node node) throws TransformerException {
        StringWriter buf = new StringWriter();
        Transformer xform = TransformerFactory.newInstance().newTransformer();
        xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        xform.transform(new DOMSource(node), new StreamResult(buf));
        return buf.toString();
    }

Now if you want only the XML below:

    <Friends>
       <friend>
          <Name>Testabc</Name>
          <Age>12121</Age>
          <Phone>Testpqr</Phone>
       </friend>
    </Friends>

you need to change

    String xpathStr = "//Envelope//Header";

to

    String xpathStr = "//Envelope//Header/*";

User · Answer

Here is an example of processing XPath with vtd-xml. For heavy-duty XML processing it is second to none. Here is a recent paper on this subject: Processing XML with Java - A Performance Benchmark.

    import com.ximpleware.*;

    public class changeAttrVal {
        public static void main(String s[]) throws VTDException, java.io.UnsupportedEncodingException, java.io.IOException {
            VTDGen vg = new VTDGen();
            if (!vg.parseFile("input.xml", false))
                return;
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            XMLModifier xm = new XMLModifier(vn);
            ap.selectXPath("/*/place[@id=\"p14\" and   @initialMarking=\"2\"]/@initialMarking");
            int i = 0;
            while ((i = ap.evalXPath()) != -1) {
                xm.updateToken(i + 1, "499"); // change initial marking from 2 to 499
            }
            xm.output("new.xml");
        }
    }

User · Answer

This shows you how to:

1. Read in an XML file to a DOM
2. Filter out a set of Nodes with XPath
3. Perform a certain action on each of the extracted Nodes

We will call the code with the following statement:

    processFilteredXml(xmlIn, xpathExpr, (node) -> {/*Do something...*/});

In our case we want to print some creatorNames from a book.xml, using "//book/creators/creator/creatorName" as the XPath and performing a printNode action on each Node that matches it.

Full code:

    @Test
    public void printXml() {
        try (InputStream in = readFile("book.xml")) {
            processFilteredXml(in, "//book/creators/creator/creatorName", (node) -> {
                printNode(node, System.out);
            });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private InputStream readFile(String yourSampleFile) {
        return Thread.currentThread().getContextClassLoader().getResourceAsStream(yourSampleFile);
    }

    private void processFilteredXml(InputStream in, String xpath, Consumer<Node> process) {
        Document doc = readXml(in);
        NodeList list = filterNodesByXPath(doc, xpath);
        for (int i = 0; i < list.getLength(); i++) {
            Node node = list.item(i);
            process.accept(node);
        }
    }

    public Document readXml(InputStream xmlin) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            return db.parse(xmlin);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private NodeList filterNodesByXPath(Document doc, String xpathExpr) {
        try {
            XPathFactory xPathFactory = XPathFactory.newInstance();
            XPath xpath = xPathFactory.newXPath();
            XPathExpression expr = xpath.compile(xpathExpr);
            Object eval = expr.evaluate(doc, XPathConstants.NODESET);
            return (NodeList) eval;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private void printNode(Node node, PrintStream out) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            StreamResult result = new StreamResult(new StringWriter());
            DOMSource source = new DOMSource(node);
            transformer.transform(source, result);
            String xmlString = result.getWriter().toString();
            out.println(xmlString);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

Prints:

    <creatorName>Fosmire, Michael</creatorName>

    <creatorName>Wertz, Ruth</creatorName>

    <creatorName>Purzer, Senay</creatorName>

For book.xml:

    <book>
      <creators>
        <creator>
          <creatorName>Fosmire, Michael</creatorName>
          <givenName>Michael</givenName>
          <familyName>Fosmire</familyName>
        </creator>
        <creator>
          <creatorName>Wertz, Ruth</creatorName>
          <givenName>Ruth</givenName>
          <familyName>Wertz</familyName>
        </creator>
        <creator>
          <creatorName>Purzer, Senay</creatorName>
          <givenName>Senay</givenName>
          <familyName>Purzer</familyName>
        </creator>
      </creators>
      <titles>
        <title>Critical Engineering Literacy Test (CELT)</title>
      </titles>
    </book>

User · Answer

Expanding on the excellent answers by @bluish and @Yishai, here is how you make NodeLists and node attributes support iterators, i.e. the for (Node n : nodelist) interface.

Use it like:

    NodeList nl = ...
    for (Node n : XmlUtil.asList(nl))
    {...}

and

    Node n = ...
    for (Node attr : XmlUtil.asList(n.getAttributes()))
    {...}

The code:

    import java.util.AbstractList;
    import java.util.Collections;
    import java.util.List;
    import java.util.RandomAccess;

    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    /**
     * Converts NodeList to an iterable construct.
     * From: https://stackoverflow.com/a/19591302/779521
     */
    public final class XmlUtil {
        private XmlUtil() {}

        public static List<Node> asList(NodeList n) {
            return n.getLength() == 0 ? Collections.<Node>emptyList() : new NodeListWrapper(n);
        }

        static final class NodeListWrapper extends AbstractList<Node> implements RandomAccess {
            private final NodeList list;

            NodeListWrapper(NodeList l) {
                this.list = l;
            }

            public Node get(int index) {
                return this.list.item(index);
            }

            public int size() {
                return this.list.getLength();
            }
        }

        public static List<Node> asList(NamedNodeMap n) {
            return n.getLength() == 0 ? Collections.<Node>emptyList() : new NodeMapWrapper(n);
        }

        static final class NodeMapWrapper extends AbstractList<Node> implements RandomAccess {
            private final NamedNodeMap list;

            NodeMapWrapper(NamedNodeMap l) {
                this.list = l;
            }

            public Node get(int index) {
                return this.list.item(index);
            }

            public int size() {
                return this.list.getLength();
            }
        }
    }

User · Answer

You need something along the lines of this:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(<uri_as_string>);
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    XPathExpression expr = xpath.compile(<xpath_expression>);

Then you call expr.evaluate(), passing in the document defined in that code and the return type you are expecting, and cast the result to the object type of the result.

If you need help with specific XPath expressions, you should probably ask about them as separate questions (unless that was your question in the first place here; I understood your question to be how to use the API in Java).

Edit (response to comment): this XPath expression will get you the text of the first URL element under PowerBuilder:

    /howto/topic[@name='PowerBuilder']/url/text()

This will get you the second:

    /howto/topic[@name='PowerBuilder']/url[2]/text()

You get that with this code:

    expr.evaluate(doc, XPathConstants.STRING);

If you don't know how many URLs are in a given node, then you should rather do something like this:

    XPathExpression expr = xpath.compile("/howto/topic[@name='PowerBuilder']/url");
    NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);

And then loop over the NodeList.
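Putting these pieces together for the two methods the question asks for might look roughly like the sketch below. It is not the answerer's code: the class `HowtoReader`, its method names, and the inlined copy of the question's document are assumptions for illustration, and the topic name is pasted into the expression on the assumption that it contains no single quote.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class and method names, chosen for this sketch only.
final class HowtoReader {

    // Part of the question's document, inlined so the sketch needs no network access.
    static final String SAMPLE_XML =
        "<howto>"
      + "<topic name=\"Java\"><url>http://www.rgagnonjavahowto.htm</url><car>taxi</car></topic>"
      + "<topic name=\"PowerBuilder\"><url>http://www.rgagnon/pbhowto.htm</url>"
      + "<url>http://www.rgagnon/pbhowtonew.htm</url></topic>"
      + "<topic name=\"Javascript\"><url>http://www.rgagnon/jshowto.htm</url></topic>"
      + "</howto>";

    static Document parse(String xml) {
        try {
            // For a remote file, DocumentBuilder.parse(String uri) accepts the URI directly.
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Method 1: every child element of the named topic, as "tag=text" strings.
    static List<String> childrenOfTopic(Document doc, String topicName) {
        // Assumption: topicName contains no single quote.
        String expression = "/howto/topic[@name='" + topicName + "']/*";
        List<String> result = new ArrayList<>();
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                Node child = nodes.item(i);
                result.add(child.getNodeName() + "=" + child.getTextContent());
            }
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return result;
    }

    // Method 2: the text of the first url element of the named topic only.
    static String firstUrlOfTopic(Document doc, String topicName) {
        String expression = "/howto/topic[@name='" + topicName + "']/url[1]/text()";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (String) xpath.evaluate(expression, doc, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_XML);
        System.out.println(childrenOfTopic(doc, "Java"));
        System.out.println(firstUrlOfTopic(doc, "Javascript"));
    }
}
```

The first method asks for `XPathConstants.NODESET` because the number of children is unknown, while the second asks for `XPathConstants.STRING` because exactly one value is wanted.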

User · Answer

Read XML file using XPathFactory, SAXParserFactory and StAX (JSR-173).

Using XPath, get a node and its child data.

    public static void main(String[] args) {
        String xml = "<soapenv:Body xmlns:soapenv='http://schemas.xmlsoap.org/soap/envelope/'>"
                + "<Yash:Data xmlns:Yash='http://Yash.stackoverflow.com/Services/Yash'>"
                + "<Yash:Tags>Java</Yash:Tags><Yash:Tags>Javascript</Yash:Tags><Yash:Tags>Selenium</Yash:Tags>"
                + "<Yash:Top>javascript</Yash:Top><Yash:User>Yash-777</Yash:User>"
                + "</Yash:Data></soapenv:Body>";
        String jsonNameSpaces = "{'soapenv':'http://schemas.xmlsoap.org/soap/envelope/',"
                + "'Yash':'http://Yash.stackoverflow.com/Services/Yash'}";
        String xpathExpression = "//Yash:Data";

        Document doc1 = getDocument(false, "fileName", xml);
        getNodesFromXpath(doc1, xpathExpression, jsonNameSpaces);
        System.out.println("\n===== ***** =====");
        Document doc2 = getDocument(true, "./books.xml", xml);
        getNodesFromXpath(doc2, "//person", "{}");
    }

    static Document getDocument(boolean isFileName, String fileName, String xml) {
        Document doc = null;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setNamespaceAware(true);
            factory.setIgnoringComments(true);
            factory.setIgnoringElementContentWhitespace(true);

            DocumentBuilder builder = factory.newDocumentBuilder();
            if (isFileName) {
                File file = new File(fileName);
                FileInputStream stream = new FileInputStream(file);
                doc = builder.parse(stream);
            } else {
                doc = builder.parse(string2Source(xml));
            }
        } catch (SAXException | IOException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        return doc;
    }

    /**
     * ELEMENT_NODE[1], ATTRIBUTE_NODE[2], TEXT_NODE[3], CDATA_SECTION_NODE[4],
     * ENTITY_REFERENCE_NODE[5], ENTITY_NODE[6], PROCESSING_INSTRUCTION_NODE[7],
     * COMMENT_NODE[8], DOCUMENT_NODE[9], DOCUMENT_TYPE_NODE[10], DOCUMENT_FRAGMENT_NODE[11], NOTATION_NODE[12]
     */
    public static void getNodesFromXpath(Document doc, String xpathExpression, String jsonNameSpaces) {
        try {
            XPathFactory xpf = XPathFactory.newInstance();
            XPath xpath = xpf.newXPath();

            JSONObject namespaces = getJSONObjectNameSpaces(jsonNameSpaces);
            if (namespaces.size() > 0) {
                NamespaceContextImpl nsContext = new NamespaceContextImpl();

                Iterator<?> key = namespaces.keySet().iterator();
                while (key.hasNext()) { // Apache WebServices Common Utilities
                    String pPrefix = key.next().toString();
                    String pURI = namespaces.get(pPrefix).toString();
                    nsContext.startPrefixMapping(pPrefix, pURI);
                }
                xpath.setNamespaceContext(nsContext);
            }

            XPathExpression compile = xpath.compile(xpathExpression);
            NodeList nodeList = (NodeList) compile.evaluate(doc, XPathConstants.NODESET);
            displayNodeList(nodeList);
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
    }

    static void displayNodeList(NodeList nodeList) {
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node node = nodeList.item(i);
            String NodeName = node.getNodeName();

            NodeList childNodes = node.getChildNodes();
            if (childNodes.getLength() > 1) {
                for (int j = 0; j < childNodes.getLength(); j++) {
                    Node child = childNodes.item(j);
                    short nodeType = child.getNodeType();
                    if (nodeType == 1) {
                        System.out.format("\n\t Node Name:[%s], Text[%s] ", child.getNodeName(), child.getTextContent());
                    }
                }
            } else {
                System.out.format("\n Node Name:[%s], Text[%s] ", NodeName, node.getTextContent());
            }
        }
    }

    static InputSource string2Source(String str) {
        InputSource inputSource = new InputSource(new StringReader(str));
        return inputSource;
    }

    static JSONObject getJSONObjectNameSpaces(String jsonNameSpaces) {
        if (jsonNameSpaces.indexOf("'") > -1) jsonNameSpaces = jsonNameSpaces.replace("'", "\"");
        JSONParser parser = new JSONParser();
        JSONObject namespaces = null;
        try {
            namespaces = (JSONObject) parser.parse(jsonNameSpaces);
        } catch (ParseException e) {
            e.printStackTrace();
        }
        return namespaces;
    }

XML Document:

    <?xml version="1.0" encoding="UTF-8"?>
    <book>
    <person>
        <first>Yash</first>
        <last>M</last>
        <age>22</age>
    </person>
    <person>
        <first>Bill</first>
        <last>Gates</last>
        <age>46</age>
    </person>
    <person>
        <first>Steve</first>
        <last>Jobs</last>
        <age>40</age>
    </person>
    </book>

Output for the given XPathExpression:

    String xpathExpression = "//person/first";
    /* Output:
     Node Name:[first], Text[Yash]
     Node Name:[first], Text[Bill]
     Node Name:[first], Text[Steve]  */

    String xpathExpression = "//person";
    /* Output:
         Node Name:[first], Text[Yash]
         Node Name:[last], Text[M]
         Node Name:[age], Text[22]
         Node Name:[first], Text[Bill]
         Node Name:[last], Text[Gates]
         Node Name:[age], Text[46]
         Node Name:[first], Text[Steve]
         Node Name:[last], Text[Jobs]
         Node Name:[age], Text[40]  */

    String xpathExpression = "//Yash:Data";
    /* Output:
         Node Name:[Yash:Tags], Text[Java]
         Node Name:[Yash:Tags], Text[Javascript]
         Node Name:[Yash:Tags], Text[Selenium]
         Node Name:[Yash:Top], Text[javascript]
         Node Name:[Yash:User], Text[Yash-777]  */

See this link for our own implementation of NamespaceContext.
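For readers who would rather not depend on the Apache WebServices utilities or a JSON parser, a small map-backed implementation of `javax.xml.namespace.NamespaceContext` is enough for prefixed XPath expressions. The sketch below is one possible version and is not the implementation the answer links to; the names `NamespaceXPathDemo`, `MapNamespaceContext` and `texts` are hypothetical. It also shows that the prefix used in the expression (here `y`) only has to map to the same URI as the document's prefix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical names throughout; a sketch, not the linked implementation.
final class NamespaceXPathDemo {

    static final String YASH_NS = "http://Yash.stackoverflow.com/Services/Yash";

    static final String SAMPLE_XML =
        "<soapenv:Body xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\">"
      + "<Yash:Data xmlns:Yash=\"" + YASH_NS + "\">"
      + "<Yash:Tags>Java</Yash:Tags><Yash:Tags>Javascript</Yash:Tags>"
      + "<Yash:User>Yash-777</Yash:User>"
      + "</Yash:Data></soapenv:Body>";

    // Resolves XPath prefixes from a plain prefix-to-URI map.
    static final class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> prefixToUri;

        MapNamespaceContext(Map<String, String> prefixToUri) {
            this.prefixToUri = prefixToUri;
        }

        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
        }

        @Override
        public String getPrefix(String namespaceURI) {
            Iterator<String> it = getPrefixes(namespaceURI);
            return it.hasNext() ? it.next() : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            List<String> found = new ArrayList<>();
            for (Map.Entry<String, String> e : prefixToUri.entrySet()) {
                if (e.getValue().equals(namespaceURI)) {
                    found.add(e.getKey());
                }
            }
            return found.iterator();
        }
    }

    // Returns the text content of every node matched by the prefixed expression.
    static List<String> texts(String xml, String expression, Map<String, String> prefixes) {
        List<String> result = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed names to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new MapNamespaceContext(prefixes));
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(texts(SAMPLE_XML, "//y:Tags",
                java.util.Collections.singletonMap("y", YASH_NS)));
    }
}
```

Keeping the mapping in an ordinary `Map` avoids the JSON round trip used above, at the cost of building the map in code.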
