[xml] What is the difference between XML and XSD?

What is the difference between Extensible Markup Language (XML) and XML Schema (XSD)?

This question is related to xml xsd

The answer is


XML has a much wider application than f.ex. HTML. It doesn't have an intrinsic, or default "application". So, while you might not really care that web pages are also governed by what's allowed, from the author's side, you'll probably want to precisely define what an XML document may and may not contain.

It's like designing a database.

The thing about XML technologies is that they are textual in nature. With XSD, it means you have a data structure definition framework that can be "plugged in" to text processing tools like PHP. So not only can you manipulate the data itself, but also very easily change and document the structure, and even auto-generate front-ends.

Viewed like this, XSD is the "glue" or "middleware" between data (XML) and data-processing tools.



Take an example

<root>
  <parent>
     <child_one>Y</child_one>
     <child_two>12</child_two>
  </parent>
</root>

and design an xsd for that:

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" 
xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="parent">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="child_one" type="xs:string" />
              <xs:element name="child_two" type="xs:int" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>


What isn't possible with XSD: would like to write it first as the list is very small
1) You can't validate a node/attribute using the value of another node/attribute.
2) This is a restriction : An element defined in XSD file must be defined with only one datatype. [in the above example, for <child_two> appearing in another <parent> node, datatype cannot be defined other than int.
3) You can't ignore the validation of elements and attributes, ie, if an element/attribute appears in XML, it must be well-defined in the corresponding XSD. Though usage of <xsd:any> allows it, but it has got its own rules. Abiding which leads to the validation error. I had tried for a similar approach, and certainly wasn't successful, here is the Q&A


what are possible with XSD:
1) You can test the proper hierarchy of the XML nodes. [xsd defines which child should come under which parent, etc, abiding which will be counted as error, in above example, child_two cannot be the immediate child of root, but it is the child of "parent" tag which is in-turn a child of "root" node, there is a hierarchy..]
2) You can define Data type of the values of the nodes. [in above example child_two cannot have any-other data than number]
3) You can also define custom data_types, [example, for node <month>, the possible data can be one of the 12 months.. so you need to define all the 12 months in a new data type writing all the 12 month names as enumeration values .. validation shows error if the input XML contains any-other value than these 12 values .. ]
4) You can put the restriction on the occurrence of the elements, using minOccurs and maxOccurs, the default values are 1 and 1.

.. and many more ...


XML versus XSD

XML defines the syntax of elements and attributes for structuring data in a well-formed document.

XSD (aka XML Schema), like DTD before, powers the eXtensibility in XML by enabling the user to define the vocabulary and grammar of the elements and attributes in a valid XML document.


SIMPLE XML EXAMPLE:

<school>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</school>

XSD OF ABOVE XML(Explained):

<xs:element name="school">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Here:

xs:element : Defines an element.

xs:sequence : Denotes child elements only appear in the order mentioned.

xs:complexType : Denotes it contains other elements.

xs:simpleType : Denotes they do not contain other elements.

type: string, decimal, integer, boolean, date, time,

  • In simple words, xsd is another way to represent and validate XML data with the specific type.
  • With the help of extra attributes, we can perform multiple operations.

  • Performing any task on xsd is simpler than xml.


XSD:
XSD (XML Schema Definition) specifies how to formally describe the elements in an Extensible Markup Language (XML) document.
Xml:
XML was designed to describe data.It is independent from software as well as hardware.
It enhances the following things.
-Data sharing.
-Platform independent.
-Increasing the availability of Data.

Differences:

  1. XSD is based and written on XML.

  2. XSD defines elements and structures that can appear in the document, while XML does not.

  3. XSD ensures that the data is properly interpreted, while XML does not.

  4. An XSD document is validated as XML, but the opposite may not always be true.

  5. XSD is better at catching errors than XML.

An XSD defines elements that can be used in the documents, relating to the actual data with which it is to be encoded.
for eg:
A date that is expressed as 1/12/2010 can either mean January 12 or December 1st. Declaring a date data type in an XSD document, ensures that it follows the format dictated by XSD.


Basically an XSD file defines how the XML file is going to look like. It's a Schema file which defines the structure of the XML file. So it specifies what the possible fields are and what size they are going to be.

An XML file is an instance of XSD as it uses the rules defined in the XSD.