[c#] Reading specific XML elements from XML file

I have the following XML file:

<lexicon>
<word>
  <base>a</base>
  <category>determiner</category>
  <id>E0006419</id>
</word>
<word>
  <base>abandon</base>
  <category>verb</category>
  <id>E0006429</id>
  <ditransitive/>
  <transitive/>
</word>
<word>
  <base>abbey</base>
  <category>noun</category>
  <id>E0203496</id>
</word>
<word>
  <base>ability</base>
  <category>noun</category>
  <id>E0006490</id>
</word>
<word>
  <base>able</base>
  <category>adjective</category>
  <id>E0006510</id>
  <predicative/>
  <qualitative/>
</word>
<word>
  <base>abnormal</base>
  <category>adjective</category>
  <id>E0006517</id>
  <predicative/>
  <qualitative/>
</word>
<word>
  <base>abolish</base>
  <category>verb</category>
  <id>E0006524</id>
  <transitive/>
</word>
</lexicon>

I need to read this file in a C# application and, for every word whose category is verb, print the entire word element.
How can I do that?

This question is related to c# xml

The answer is


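// requires: using System.Linq; using System.Xml.Linq;
XDocument xdoc = XDocument.Load(path_to_xml);
var word = xdoc.Root.Elements("word")
               .SingleOrDefault(w => (string)w.Element("category") == "verb");

This query returns the whole word XElement. The word elements are children of the root lexicon element, so they have to be reached through xdoc.Root (calling xdoc.Elements("word") directly on the document finds nothing). If there is more than one word element with category verb, you will get an InvalidOperationException; the sample file above contains two verbs, so for it you should use Where instead of SingleOrDefault to get all matches. If there are no elements with category verb, the result will be null.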
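
Since the question asks for every verb rather than a single one, here is a minimal sketch of the same LINQ to XML query using Where. The class name VerbFinder, the helper FindVerbs, and the shortened inline sample are assumptions made for illustration; with a real file you would call XDocument.Load(path_to_xml) instead of XDocument.Parse.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class VerbFinder
{
    // Hypothetical helper: returns every <word> element whose <category> is "verb".
    public static List<XElement> FindVerbs(XDocument xdoc)
    {
        // The root is <lexicon>; its direct children are the <word> elements.
        return xdoc.Root
                   .Elements("word")
                   .Where(w => (string)w.Element("category") == "verb")
                   .ToList();
    }

    static void Main()
    {
        // Shortened sample data; the real file has more entries.
        const string xml = @"<lexicon>
  <word><base>a</base><category>determiner</category><id>E0006419</id></word>
  <word><base>abandon</base><category>verb</category><id>E0006429</id><transitive/></word>
  <word><base>abolish</base><category>verb</category><id>E0006524</id><transitive/></word>
</lexicon>";

        foreach (XElement word in FindVerbs(XDocument.Parse(xml)))
        {
            // Writing an XElement prints its full XML markup, children included.
            Console.WriteLine(word);
        }
    }
}
```

Where returns an empty sequence when nothing matches, so there is no null result or exception to handle.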


You could use XPath, too. It is a bit old-fashioned but still effective:

using System.Xml;

...

// xml is a string holding the XML text; to read from a file, call xmlDocument.Load(path) instead
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xml);

foreach (XmlElement xmlElement in 
    xmlDocument.DocumentElement.SelectNodes("word[category='verb']"))
{
    Console.Out.WriteLine(xmlElement.OuterXml);
}

This is how I would do it (the code below has been tested; the full source is provided at the end). Begin by creating a class with the common properties:

    class Word
    {
        public string Base { get; set; }
        public string Category { get; set; }
        public string Id { get; set; }
    }

Load the XML using XDocument (INPUT_DATA is used here for demonstration purposes) and find the lexicon element . . .

    XDocument doc = XDocument.Parse(INPUT_DATA);
    XElement lex = doc.Element("lexicon");

Make sure the element was found, then use LINQ to extract the word elements from it . . .

    Word[] catWords = null;
    if (lex != null)
    {
        IEnumerable<XElement> words = lex.Elements("word");
        catWords = (from itm in words
                    where itm.Element("category") != null
                        && itm.Element("category").Value == "verb"
                        && itm.Element("id") != null
                        && itm.Element("base") != null
                    select new Word() 
                    {
                        Base = itm.Element("base").Value,
                        Category = itm.Element("category").Value,
                        Id = itm.Element("id").Value,
                    }).ToArray<Word>();
    }

The where clause first checks that the category element exists and then that its value is verb. It also checks that the id and base elements exist, so reading their Value in the select cannot throw a NullReferenceException . . .

The LINQ query returns an IEnumerable<Word>, so we can call ToArray<Word>() to run the query and copy the results into a Word[] array (this materializes the sequence; it is not a cast).
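
As a side note on the null checks: LINQ to XML defines an explicit conversion from XElement to string that yields null when the element is missing, so the existence check and the value comparison can be folded into one expression. The sketch below illustrates this; the class name CastDemo, the helper VerbBases, and the tiny sample are assumptions made for the example, not part of the tested source.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class CastDemo
{
    // Hypothetical helper: base forms of all verbs, skipping words that have no <base>.
    public static string[] VerbBases(XElement lexicon)
    {
        return lexicon.Elements("word")
                      // (string) on a missing element gives null, so no separate null check is needed.
                      .Where(w => (string)w.Element("category") == "verb")
                      .Select(w => (string)w.Element("base"))
                      .Where(b => b != null)
                      .ToArray();
    }

    static void Main()
    {
        XElement lexicon = XElement.Parse(
            @"<lexicon>
                <word><base>abandon</base><category>verb</category></word>
                <word><base>abbey</base></word>
                <word><category>verb</category></word>
              </lexicon>");

        // Only the first word has both a verb category and a base form.
        Console.WriteLine(string.Join(", ", VerbBases(lexicon))); // prints: abandon
    }
}
```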

Then print the results to get . . .

[Found]
 Id: E0006429
 Base: abandon
 Category: verb

[Found]
 Id: E0006524
 Base: abolish
 Category: verb

Full Source:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace test
{
    class Program
    {

        class Word
        {
            public string Base { get; set; }
            public string Category { get; set; }
            public string Id { get; set; }
        }

        static void Main(string[] args)
        {
            XDocument doc = XDocument.Parse(INPUT_DATA);
            XElement lex = doc.Element("lexicon");
            Word[] catWords = null;
            if (lex != null)
            {
                IEnumerable<XElement> words = lex.Elements("word");
                catWords = (from itm in words
                            where itm.Element("category") != null
                                && itm.Element("category").Value == "verb"
                                && itm.Element("id") != null
                                && itm.Element("base") != null
                            select new Word() 
                            {
                                Base = itm.Element("base").Value,
                                Category = itm.Element("category").Value,
                                Id = itm.Element("id").Value,
                            }).ToArray<Word>();
            }

            //print it
            if (catWords != null)
            {
                Console.WriteLine("Words with <category> and value verb:\n");
                foreach (Word itm in catWords)
                    Console.WriteLine("[Found]\n Id: {0}\n Base: {1}\n Category: {2}\n", 
                        itm.Id, itm.Base, itm.Category);
            }
        }

        const string INPUT_DATA =
        @"<?xml version=""1.0""?>
        <lexicon>
        <word>
          <base>a</base>
          <category>determiner</category>
          <id>E0006419</id>
        </word>
        <word>
          <base>abandon</base>
          <category>verb</category>
          <id>E0006429</id>
          <ditransitive/>
          <transitive/>
        </word>
        <word>
          <base>abbey</base>
          <category>noun</category>
          <id>E0203496</id>
        </word>
        <word>
          <base>ability</base>
          <category>noun</category>
          <id>E0006490</id>
        </word>
        <word>
          <base>able</base>
          <category>adjective</category>
          <id>E0006510</id>
          <predicative/>
          <qualitative/>
        </word>
        <word>
          <base>abnormal</base>
          <category>adjective</category>
          <id>E0006517</id>
          <predicative/>
          <qualitative/>
        </word>
        <word>
          <base>abolish</base>
          <category>verb</category>
          <id>E0006524</id>
          <transitive/>
        </word>
        </lexicon>";

    }
}

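Alternatively, you can run an XPath query via the XPathSelectElements extension method, which lives in the System.Xml.XPath namespace:

// requires: using System.Xml.Linq; using System.Xml.XPath;
var document = XDocument.Parse(yourXmlAsString);
var words = document.XPathSelectElements("//word[./category[text() = 'verb']]");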
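
For completeness, here is a small self-contained sketch of the XPathSelectElements approach that also prints the matches. The class name XPathDemo, the helper FindVerbs, and the shortened sample are assumptions for illustration; the XPath expression is a slightly simpler equivalent for this file layout, selecting word elements directly under lexicon.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPathSelectElements extension method into scope

static class XPathDemo
{
    // Hypothetical helper: all <word> elements under <lexicon> whose <category> is "verb".
    public static List<XElement> FindVerbs(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        return document.XPathSelectElements("/lexicon/word[category='verb']").ToList();
    }

    static void Main()
    {
        // Shortened sample data; the real file has more entries.
        const string xml = @"<lexicon>
  <word><base>abbey</base><category>noun</category><id>E0203496</id></word>
  <word><base>abolish</base><category>verb</category><id>E0006524</id><transitive/></word>
</lexicon>";

        foreach (XElement word in FindVerbs(xml))
        {
            // Prints the whole <word> element, including its child elements.
            Console.WriteLine(word);
        }
    }
}
```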