[xml] How to parse XML using vba

I work in VBA, and want to parse a string eg

<PointN xsi:type='typens:PointN' 
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' 
xmlns:xs='http://www.w3.org/2001/XMLSchema'>
    <X>24.365</X>
    <Y>78.63</Y>
</PointN>

and get the X & Y values into two separate integer variables.

I'm a newbie when it comes to XML, since I'm stuck in VB6 and VBA, because of the field I work in.

How do I do this?

This question is related to xml vba parsing xml-parsing

The answer is


Thanks for the pointers.

I don't know, whether this is the best approach to the problem or not, but here is how I got it to work. I referenced the Microsoft XML, v2.6 dll in my VBA, and then the following code snippet, gives me the required values

Dim objXML As MSXML2.DOMDocument

Set objXML = New MSXML2.DOMDocument

If Not objXML.loadXML(strXML) Then  'strXML is the string with XML'
    Err.Raise objXML.parseError.ErrorCode, , objXML.parseError.reason
End If
 
Dim point As IXMLDOMNode
Set point = objXML.firstChild

Debug.Print point.selectSingleNode("X").Text
Debug.Print point.selectSingleNode("Y").Text

You can use a XPath Query:

Dim objDom As Object        '// DOMDocument
Dim xmlStr As String, _
    xPath As String

xmlStr = _
    "<PointN xsi:type='typens:PointN' " & _
    "xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' " & _
    "xmlns:xs='http://www.w3.org/2001/XMLSchema'> " & _
    "    <X>24.365</X> " & _
    "    <Y>78.63</Y> " & _
    "</PointN>"

Set objDom = CreateObject("Msxml2.DOMDocument.3.0")     '// Using MSXML 3.0

'/* Load XML */
objDom.LoadXML xmlStr

'/*
' * XPath Query
' */        

'/* Get X */
xPath = "/PointN/X"
Debug.Print objDom.SelectSingleNode(xPath).text

'/* Get Y */
xPath = "/PointN/Y"
Debug.Print objDom.SelectSingleNode(xPath).text

Add reference Project->References Microsoft XML, 6.0 and you can use example code:

    Dim xml As String

    xml = "<root><person><name>Me </name> </person> <person> <name>No Name </name></person></root> "
    Dim oXml As MSXML2.DOMDocument60
    Set oXml = New MSXML2.DOMDocument60
    oXml.loadXML xml
    Dim oSeqNodes, oSeqNode As IXMLDOMNode

    Set oSeqNodes = oXml.selectNodes("//root/person")
    If oSeqNodes.length = 0 Then
       'show some message
    Else
        For Each oSeqNode In oSeqNodes
             Debug.Print oSeqNode.selectSingleNode("name").Text
        Next
    End If 

be careful with xml node //Root/Person is not same with //root/person, also selectSingleNode("Name").text is not same with selectSingleNode("name").text


Often it is easier to parse without VBA, when you don't want to enable macros. This can be done with the replace function. Enter your start and end nodes into cells B1 and C1.

Cell A1: {your XML here}
Cell B1: <X>
Cell C1: </X>
Cell D1: =REPLACE(A1,1,FIND(A2,A1)+LEN(A2)-1,"")
Cell E1: =REPLACE(A4,FIND(A3,A4),LEN(A4)-FIND(A3,A4)+1,"")

And the result line E1 will have your parsed value:

Cell A1: {your XML here}
Cell B1: <X>
Cell C1: </X>
Cell D1: 24.365<X><Y>78.68</Y></PointN>
Cell E1: 24.365

This is an example OPML parser working with FeedDemon opml files:

Sub debugPrintOPML()

' http://msdn.microsoft.com/en-us/library/ms763720(v=VS.85).aspx
' http://msdn.microsoft.com/en-us/library/system.xml.xmlnode.selectnodes.aspx
' http://msdn.microsoft.com/en-us/library/ms256086(v=VS.85).aspx ' expressions
' References: Microsoft XML

Dim xmldoc As New DOMDocument60
Dim oNodeList As IXMLDOMSelection
Dim oNodeList2 As IXMLDOMSelection
Dim curNode As IXMLDOMNode
Dim n As Long, n2 As Long, x As Long

Dim strXPathQuery As String
Dim attrLength As Byte
Dim FilePath As String

FilePath = "rss.opml"

xmldoc.Load CurrentProject.Path & "\" & FilePath

strXPathQuery = "opml/body/outline"
Set oNodeList = xmldoc.selectNodes(strXPathQuery)

For n = 0 To (oNodeList.length - 1)
    Set curNode = oNodeList.Item(n)
    attrLength = curNode.Attributes.length
    If attrLength > 1 Then ' or 2 or 3
        Call processNode(curNode)
    Else
        Call processNode(curNode)
        strXPathQuery = "opml/body/outline[position() = " & n + 1 & "]/outline"
        Set oNodeList2 = xmldoc.selectNodes(strXPathQuery)
        For n2 = 0 To (oNodeList2.length - 1)
            Set curNode = oNodeList2.Item(n2)
            Call processNode(curNode)
        Next
    End If
        Debug.Print "----------------------"
Next

Set xmldoc = Nothing

End Sub

Sub processNode(curNode As IXMLDOMNode)

Dim sAttrName As String
Dim sAttrValue As String
Dim attrLength As Byte
Dim x As Long

attrLength = curNode.Attributes.length

For x = 0 To (attrLength - 1)
    sAttrName = curNode.Attributes.Item(x).nodeName
    sAttrValue = curNode.Attributes.Item(x).nodeValue
    Debug.Print sAttrName & " = " & sAttrValue
Next
    Debug.Print "-----------"

End Sub

This one takes multilevel trees of folders (Awasu, NewzCrawler):

...
Call xmldocOpen4
Call debugPrintOPML4(Null)
...

Dim sText4 As String

Sub debugPrintOPML4(strXPathQuery As Variant)

Dim xmldoc4 As New DOMDocument60
'Dim xmldoc4 As New MSXML2.DOMDocument60 ' ?
Dim oNodeList As IXMLDOMSelection
Dim curNode As IXMLDOMNode
Dim n4 As Long

If IsNull(strXPathQuery) Then strXPathQuery = "opml/body/outline"

' http://msdn.microsoft.com/en-us/library/ms754585(v=VS.85).aspx
xmldoc4.async = False
xmldoc4.loadXML sText4
If (xmldoc4.parseError.errorCode <> 0) Then
   Dim myErr
   Set myErr = xmldoc4.parseError
   MsgBox ("You have error " & myErr.reason)
Else
'   MsgBox xmldoc4.xml
End If

Set oNodeList = xmldoc4.selectNodes(strXPathQuery)

For n4 = 0 To (oNodeList.length - 1)
    Set curNode = oNodeList.Item(n4)
    Call processNode4(strXPathQuery, curNode, n4)
Next

Set xmldoc4 = Nothing

End Sub

Sub processNode4(strXPathQuery As Variant, curNode As IXMLDOMNode, n4 As Long)

Dim sAttrName As String
Dim sAttrValue As String
Dim x As Long

For x = 0 To (curNode.Attributes.length - 1)
    sAttrName = curNode.Attributes.Item(x).nodeName
    sAttrValue = curNode.Attributes.Item(x).nodeValue
    'If sAttrName = "text"
    Debug.Print strXPathQuery & " :: " & sAttrName & " = " & sAttrValue
    'End If
Next
    Debug.Print ""

If curNode.childNodes.length > 0 Then
    Call debugPrintOPML4(strXPathQuery & "[position() = " & n4 + 1 & "]/" & curNode.nodeName)
End If

End Sub

Sub xmldocOpen4()

Dim oFSO As New FileSystemObject ' Microsoft Scripting Runtime Reference
Dim oFS
Dim FilePath As String

FilePath = "rss_awasu.opml"
Set oFS = oFSO.OpenTextFile(CurrentProject.Path & "\" & FilePath)
sText4 = oFS.ReadAll
oFS.Close

End Sub

or better:

Sub xmldocOpen4()

Dim FilePath As String

FilePath = "rss.opml"

' function ConvertUTF8File(sUTF8File):
' http://www.vbmonster.com/Uwe/Forum.aspx/vb/24947/How-to-read-UTF-8-chars-using-VBA
' loading and conversion from Utf-8 to UTF
sText8 = ConvertUTF8File(CurrentProject.Path & "\" & FilePath)

End Sub

but I don't understand, why xmldoc4 should be loaded each time.


Update

The procedure presented below gives an example of parsing XML with VBA using the XML DOM objects. Code is based on a beginners guide of the XML DOM.

Public Sub LoadDocument()
    Dim xDoc As MSXML.DOMDocument
    Set xDoc = New MSXML.DOMDocument
    xDoc.validateOnParse = False
    If xDoc.Load("C:\My Documents\sample.xml") Then
        ' The document loaded successfully.
        ' Now do something intersting.
        DisplayNode xDoc.childNodes, 0
    Else
        ' The document failed to load.
        ' See the previous listing for error information.
    End If
End Sub

Public Sub DisplayNode(ByRef Nodes As MSXML.IXMLDOMNodeList, _
   ByVal Indent As Integer)

   Dim xNode As MSXML.IXMLDOMNode
   Indent = Indent + 2

   For Each xNode In Nodes
      If xNode.nodeType = NODE_TEXT Then
         Debug.Print Space$(Indent) & xNode.parentNode.nodeName & _
            ":" & xNode.nodeValue
      End If

      If xNode.hasChildNodes Then
         DisplayNode xNode.childNodes, Indent
      End If
   Next xNode
End Sub

Nota Bene - This initial answer shows the simplest possible thing I could imagine (at the time I was working on a very specific issue) . Naturally using the XML facilities built into the VBA XML Dom would be much better. See the updates above.

Original Response

I know this is a very old post but I wanted to share my simple solution to this complicated question. Primarily I've used basic string functions to access the xml data.

This assumes you have some xml data (in the temp variable) that has been returned within a VBA function. Interestingly enough one can also see how I am linking to an xml web service to retrieve the value. The function shown in the image also takes a lookup value because this Excel VBA function can be accessed from within a cell using = FunctionName(value1, value2) to return values via the web service into a spreadsheet.

sample function


openTag = ""
closeTag = "" 

' Locate the position of the enclosing tags startPos = InStr(1, temp, openTag) endPos = InStr(1, temp, closeTag) startTagPos = InStr(startPos, temp, ">") + 1 ' Parse xml for returned value Data = Mid(temp, startTagPos, endPos - startTagPos)

Here is a short sub to parse a MicroStation Triforma XML file that contains data for structural steel shapes.

'location of triforma structural files
'c:\programdata\bentley\workspace\triforma\tf_imperial\data\us.xml

Sub ReadTriformaImperialData()
Dim txtFileName As String
Dim txtFileLine As String
Dim txtFileNumber As Long

Dim Shape As String
Shape = "w12x40"

txtFileNumber = FreeFile
txtFileName = "c:\programdata\bentley\workspace\triforma\tf_imperial\data\us.xml"

Open txtFileName For Input As #txtFileNumber

Do While Not EOF(txtFileNumber)
Line Input #txtFileNumber, txtFileLine
    If InStr(1, UCase(txtFileLine), UCase(Shape)) Then
        P1 = InStr(1, UCase(txtFileLine), "D=")
        D = Val(Mid(txtFileLine, P1 + 3))

        P2 = InStr(1, UCase(txtFileLine), "TW=")
        TW = Val(Mid(txtFileLine, P2 + 4))

        P3 = InStr(1, UCase(txtFileLine), "WIDTH=")
        W = Val(Mid(txtFileLine, P3 + 7))

        P4 = InStr(1, UCase(txtFileLine), "TF=")
        TF = Val(Mid(txtFileLine, P4 + 4))

        Close txtFileNumber
        Exit Do
    End If
Loop
End Sub

From here you can use the values to draw the shape in MicroStation 2d or do it in 3d and extrude it to a solid.


Examples related to xml

strange error in my Animation Drawable How do I POST XML data to a webservice with Postman? PHP XML Extension: Not installed How to add a Hint in spinner in XML Generating Request/Response XML from a WSDL Manifest Merger failed with multiple errors in Android Studio How to set menu to Toolbar in Android How to add colored border on cardview? Android: ScrollView vs NestedScrollView WARNING: Exception encountered during context initialization - cancelling refresh attempt

Examples related to vba

Copy filtered data to another sheet using VBA Better way to find last used row Check if a value is in an array or not with Excel VBA Creating an Array from a Range in VBA Excel: macro to export worksheet as CSV file without leaving my current Excel sheet VBA: Convert Text to Number What's the difference between "end" and "exit sub" in VBA? Rename Excel Sheet with VBA Macro Extract Data from PDF and Add to Worksheet Quicker way to get all unique values of a column in VBA?

Examples related to parsing

Got a NumberFormatException while trying to parse a text file for objects Uncaught SyntaxError: Unexpected end of JSON input at JSON.parse (<anonymous>) Python/Json:Expecting property name enclosed in double quotes Correctly Parsing JSON in Swift 3 How to get response as String using retrofit without using GSON or any other library in android UIButton action in table view cell "Expected BEGIN_OBJECT but was STRING at line 1 column 1" How to convert an XML file to nice pandas dataframe? How to extract multiple JSON objects from one file? How to sum digits of an integer in java?

Examples related to xml-parsing

How to create XML file with specific structure in Java jQuery xml error ' No 'Access-Control-Allow-Origin' header is present on the requested resource.' SyntaxError of Non-ASCII character xml.LoadData - Data at the root level is invalid. Line 1, position 1 XML Error: Extra content at the end of the document How to use sed to extract substring How to fix Invalid byte 1 of 1-byte UTF-8 sequence The best node module for XML parsing Parsing XML with namespace in Python via 'ElementTree' oracle plsql: how to parse XML and insert into table