Getting XML Node text value with Java DOM

Question

I can't fetch the text value with Node.getNodeValue(), Node.getFirstChild().getNodeValue(), or Node.getTextContent().

My XML looks like this:

    <add job="351">
        <tag>foobar</tag>
        <tag>foobar2</tag>
    </add>

I'm trying to get the tag value (fetching non-text elements works fine). My Java code looks like this:

    Document doc = db.parse(new File(args[0]));
    Node n = doc.getFirstChild();
    NodeList nl = n.getChildNodes();
    Node an, an2;

    for (int i = 0; i < nl.getLength(); i++) {
        an = nl.item(i);

        if (an.getNodeType() == Node.ELEMENT_NODE) {
            NodeList nl2 = an.getChildNodes();

            for (int i2 = 0; i2 < nl2.getLength(); i2++) {
                an2 = nl2.item(i2);

                // DEBUG PRINTS
                System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");

                if (an2.hasChildNodes())
                    System.out.println(an2.getFirstChild().getTextContent());

                if (an2.hasChildNodes())
                    System.out.println(an2.getFirstChild().getNodeValue());

                System.out.println(an2.getTextContent());
                System.out.println(an2.getNodeValue());
            }
        }
    }

It prints out:

    tag type (1):
    tag1
    tag1
    tag1
    null
    #text type (3):
    (blank line)
    (blank line)

Thanks for the help.

User · Answer

If your XML goes quite deep, you might want to consider using XPath, which comes with your JRE, so you can access the contents far more easily using:

    String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
            document.getDocumentElement());

Full example:

    import static org.junit.Assert.assertEquals;
    import java.io.StringReader;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;

    import org.junit.Before;
    import org.junit.Test;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class XPathTest {

        private Document document;

        @Before
        public void setup() throws Exception {
            String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            document = db.parse(new InputSource(new StringReader(xml)));
        }

        @Test
        public void testXPath() throws Exception {
            XPathFactory xpf = XPathFactory.newInstance();
            XPath xp = xpf.newXPath();
            String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
                    document.getDocumentElement());
            assertEquals("foobar", text);
        }
    }
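The XPath approach can also return every matching node instead of a single string. The following is a minimal sketch, assuming the same sample XML as in the question. The class name XPathAllTags and the helper tagTexts are made up for illustration and are not part of the answer above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathAllTags {

    // Hypothetical helper: collects the text of every <tag> child of <add>.
    static List<String> tagTexts(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xp = XPathFactory.newInstance().newXPath();
        // Ask for a node set rather than the default string result.
        NodeList nodes = (NodeList) xp.evaluate("/add/tag", doc, XPathConstants.NODESET);

        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getTextContent() on an element returns its concatenated text.
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
        System.out.println(tagTexts(xml)); // prints [foobar, foobar2]
    }
}
```

Requesting XPathConstants.NODESET keeps the loop free of node-type checks, because the expression only selects tag elements.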

User · Answer

I'd print out the result of an2.getNodeName() as well for debugging purposes. My guess is that your tree-crawling code isn't crawling to the nodes that you think it is. That suspicion is enhanced by the lack of checking for node names in your code.

Other than that, the javadoc for Node defines getNodeValue() to return null for nodes of type Element. Therefore, you really should be using getTextContent(). I'm not sure why that wouldn't give you the text that you want.

Perhaps iterate the children of your tag node and see what types are there?

I tried this code and it works for me:

    String xml = "<add job=\"351\">\n" +
                 "    <tag>foobar</tag>\n" +
                 "    <tag>foobar2</tag>\n" +
                 "</add>";
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes());
    Document doc = db.parse(bis);
    Node n = doc.getFirstChild();
    NodeList nl = n.getChildNodes();
    Node an, an2;

    for (int i = 0; i < nl.getLength(); i++) {
        an = nl.item(i);
        if (an.getNodeType() == Node.ELEMENT_NODE) {
            NodeList nl2 = an.getChildNodes();

            for (int i2 = 0; i2 < nl2.getLength(); i2++) {
                an2 = nl2.item(i2);
                // DEBUG PRINTS
                System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
                if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getTextContent());
                if (an2.hasChildNodes()) System.out.println(an2.getFirstChild().getNodeValue());
                System.out.println(an2.getTextContent());
                System.out.println(an2.getNodeValue());
            }
        }
    }

Output was:

    #text: type (3):
    foobar
    foobar
    #text: type (3):
    foobar2
    foobar2
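The advice to check node names and to call getTextContent() on the element itself can be sketched as follows. This is a minimal illustration, assuming the sample XML from the question. The class name TagTextByName and the helper tagTexts are hypothetical. It also swaps doc.getFirstChild() for doc.getDocumentElement(), on the assumption that the root element is what is wanted even if a comment or processing instruction precedes it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class TagTextByName {

    // Hypothetical helper: returns the text of each direct <tag> child of the root.
    static List<String> tagTexts(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();

        List<String> texts = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace-only text nodes between elements and
            // keep only elements named "tag".
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "tag".equals(child.getNodeName())) {
                // getNodeValue() would be null here; getTextContent() is not.
                texts.add(child.getTextContent());
            }
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add job=\"351\">\n"
                + "    <tag>foobar</tag>\n"
                + "    <tag>foobar2</tag>\n"
                + "</add>";
        System.out.println(tagTexts(xml)); // prints [foobar, foobar2]
    }
}
```

Only one loop is needed here, because the text is read from the tag element rather than from its child nodes.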

User · Answer

I use a very old Java, JDK 1.4.08, and I had the same issue. The Node class there does not have the getTextContent() method. I had to use Node.getFirstChild().getNodeValue() instead of Node.getNodeValue() to get the value of the node. This fixed it for me.
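On a DOM implementation without getTextContent(), the getFirstChild().getNodeValue() workaround can be generalized a little. The following is a rough sketch, not the poster's code. The class name LegacyText and the helpers textOf and firstTagText are made up for illustration. It uses StringBuilder so that it runs on current JDKs; on JDK 1.4 itself, StringBuffer would be the assumed substitute.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class LegacyText {

    // Hypothetical helper: concatenates the direct text and CDATA children
    // of an element using only DOM Level 2 methods (no getTextContent()).
    static String textOf(Node element) {
        StringBuilder sb = new StringBuilder();
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parses the XML and reads the first <tag> element.
    static String firstTagText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node firstTag = doc.getElementsByTagName("tag").item(0);
        return textOf(firstTag);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
        System.out.println(firstTagText(xml)); // prints foobar
    }
}
```

Unlike a bare getFirstChild().getNodeValue(), the loop returns an empty string for an empty element instead of throwing a NullPointerException, and it handles text split across several child nodes.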

User · Answer

If you are open to vtd-xml, which excels at both performance and memory efficiency, below is the code to do what you are looking for, using both XPath and manual navigation. The overall code is much more concise and easier to understand.

    import com.ximpleware.*;

    public class queryText {
        public static void main(String[] s) throws VTDException {
            VTDGen vg = new VTDGen();
            if (!vg.parseFile("input.xml", true))
                return;
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);

            // first, manually navigate
            if (vn.toElement(VTDNav.FC, "tag")) {
                int i = vn.getText();
                if (i != -1) {
                    System.out.println("text ===>" + vn.toString(i));
                }
                if (vn.toElement(VTDNav.NS, "tag")) {
                    i = vn.getText();
                    System.out.println("text ===>" + vn.toString(i));
                }
            }

            // second version, using XPath
            ap.selectXPath("/add/tag/text()");
            int i = 0;
            while ((i = ap.evalXPath()) != -1) {
                System.out.println("text node ====>" + vn.toString(i));
            }
        }
    }
