What does the 'standalone' directive mean in an XML document?
The intent of the standalone=yes declaration is to guarantee that the information inside the document can be faithfully retrieved based only on the internal DTD, i.e. the document can "stand alone" with no external references. Validating a standalone document ensures that non-validating processors will have all of the information available to correctly parse the document.
The standalone declaration serves no purpose if a document has no external DTD, and the internal DTD has no parameter entity references, as these documents are already implicitly standalone.
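To make the declaration concrete, here is a minimal sketch of reading it from code. It assumes Python's standard xml.parsers.expat module, which is a choice made for illustration and not something the answer depends on. The helper name standalone_flag and the sample documents are hypothetical.

```python
import xml.parsers.expat


def standalone_flag(xml_text):
    """Return the value expat reports for the standalone declaration.

    expat passes 1 for standalone="yes", 0 for standalone="no",
    and -1 when the declaration is omitted. Returns None if the
    document has no XML declaration at all.
    """
    seen = []

    def on_xml_decl(version, encoding, standalone):
        seen.append(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(xml_text, True)
    return seen[0] if seen else None


print(standalone_flag('<?xml version="1.0" standalone="yes"?><note/>'))
print(standalone_flag('<?xml version="1.0" standalone="no"?><note/>'))
print(standalone_flag('<?xml version="1.0"?><note/>'))
```

An omitted declaration is reported separately from an explicit "no", even though both mean the document is not declared standalone.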
The following are the actual effects of using standalone=yes:
Forces processors to throw an error when parsing documents with an external DTD or parameter entity references, if the document contains references to entities not declared in the internal DTD (with the exception of replacement text of parameter entities, as non-validating processors are not required to parse this); amp, lt, gt, apos, and quot are the only exceptions.
When parsing a document not declared as standalone, a non-validating processor is free to stop parsing the internal DTD as soon as it encounters a parameter entity reference. Declaring a document as standalone forces non-validating processors to parse markup declarations in the internal DTD even after they ignore one or more parameter entity references.
Forces validating processors to throw an error if any of the following are found in the document, and their respective declarations are in the external DTD or in parameter entity replacement text:
entity references (other than amp, lt, gt, apos, and quot)
A non-validating processor might consider retrieving the external DTD and expanding all parameter entity references for documents that are not standalone, even though it is under no obligation to do so, i.e. setting standalone=yes could theoretically improve performance for non-validating processors (spoiler alert: it probably won't make a difference).
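The effect on undeclared entity references can be observed with a non-validating parser. This is a rough sketch, assuming Python's bundled expat parser with its default settings, under which external DTDs are not loaded. The file name note.dtd and the author entity are hypothetical, and the external DTD is never actually fetched.

```python
import xml.parsers.expat

# Hypothetical document body: an external DTD is named but never loaded,
# and the "author" entity is not declared in the internal subset.
BODY = '<!DOCTYPE note SYSTEM "note.dtd">\n<note>&author;</note>'

NOT_STANDALONE = '<?xml version="1.0"?>\n' + BODY
STANDALONE = '<?xml version="1.0" standalone="yes"?>\n' + BODY


def try_parse(xml_text):
    """Return None on success, or expat's error message on failure."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None


print("not standalone:", try_parse(NOT_STANDALONE))
print("standalone=yes:", try_parse(STANDALONE))
```

With expat, the first document is expected to parse, because the unknown reference might have been declared in the DTD that was not read. The second is expected to fail, because a standalone document promises that no such external declaration exists. Other parsers may behave differently.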
The other answers here are either incomplete or incorrect. The main misconception is expressed in statements such as:
The standalone declaration is a way of telling the parser to ignore any markup declarations in the DTD. The DTD is thereafter used for validation only.
standalone="yes" means that the XML processor must use the DTD for validation only.
Quite the opposite: declaring a document as standalone will actually force a non-validating processor to parse internal declarations it must normally ignore (i.e. those after an ignored parameter entity reference). Non-validating processors must still use the information in the internal DTD to provide default attribute values and normalize tokenized attributes, as this is independent of validation.
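The point about the internal DTD can be checked with a non-validating parser. The sketch below assumes Python's bundled expat. The note element, its lang and tags attributes, and the helper parse_attributes are hypothetical names chosen for the example.

```python
import xml.parsers.expat

# standalone="yes" document whose internal DTD declares a default value
# (lang) and a tokenized attribute type (tags).
DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE note [
  <!ATTLIST note lang CDATA "en"
                 tags NMTOKENS #IMPLIED>
]>
<note tags="  draft   internal "/>"""


def parse_attributes(xml_text):
    """Collect the attributes expat reports for each start tag."""
    result = {}

    def on_start(name, attrs):
        result[name] = attrs

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return result


attrs = parse_attributes(DOC)["note"]
print(attrs["lang"])  # default supplied from the internal DTD
print(attrs["tags"])  # whitespace collapsed because the type is tokenized
```

Although no validation takes place, the application sees lang="en" and a normalized tags value. In this run the internal DTD is clearly used for more than validation.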
standalone describes whether the current XML document depends on external markup declarations.
W3C describes its purpose in "Extensible Markup Language (XML) 1.0 (Fifth Edition)":
Markup declarations can affect the content of the document, as passed from an XML processor to an application; examples are attribute defaults and entity declarations. The standalone document declaration, which may appear as a component of the XML declaration, signals whether or not there are such declarations which appear external to the document entity or in parameter entities. [Definition: An external markup declaration is defined as a markup declaration occurring in the external subset or in a parameter entity (external or internal, the latter being included because non-validating processors are not required to read them).]
Valid values are yes and no, where no is the default value.
standalone="yes" means that the XML processor must use the DTD for validation only. In that case it will not be used for:
standalone="yes" may add validity constraints if the document uses an external DTD. When the document contains things that would require modification of the XML, such as default values for attributes, and standalone="yes" is used, then the document is invalid.
standalone="yes" may help to optimize performance of document processing.
Source: The standalone pseudo-attribute is only relevant if a DTD is used
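Some parsers pass this signal on to the application. As one hedged illustration, Python's xml.parsers.expat offers a NotStandaloneHandler callback, documented as being called when a document has an external subset or a parameter entity reference but is not declared standalone. The helper name and the note.dtd file name below are hypothetical, and the sketch assumes expat's default of not loading external DTDs.

```python
import xml.parsers.expat


def not_standalone_reported(xml_text):
    """Return True if expat invoked NotStandaloneHandler while parsing."""
    called = False

    def on_not_standalone():
        nonlocal called
        called = True
        return 1  # non-zero: keep parsing instead of raising an error

    parser = xml.parsers.expat.ParserCreate()
    parser.NotStandaloneHandler = on_not_standalone
    parser.Parse(xml_text, True)
    return called


EXTERNAL = '<!DOCTYPE note SYSTEM "note.dtd"><note/>'

print(not_standalone_reported('<?xml version="1.0"?>' + EXTERNAL))
print(not_standalone_reported('<?xml version="1.0" standalone="yes"?>' + EXTERNAL))
print(not_standalone_reported('<?xml version="1.0"?><note/>'))
```

Returning 0 from the handler would instead make expat stop with a "not standalone" error, which is one way an application could refuse documents that depend on declarations it does not intend to read.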
Source: Stackoverflow.com