June 21, 2011, 12:21 p.m.
posted by pythonics
Documents and DTDs
To be perfectly correct, we must explain that "XML" has come to mean many subtly different things. An XML document is a document containing content that conforms to a markup language defined from the XML standard. An XML Document Type Definition (XML DTD) is a set of rules, more formally known as entity and element declarations, that define an XML markup language; i.e., how the tags are arranged in a correct (valid) XML document. To make things even more confusing, entity and element declarations may appear in an XML document itself, as well as within an XML DTD.
An XML document contains character data, which consists of plain content and markup in the form of tags and XML declarations. Thus:
<blah>harrumph</blah>
is a line in a well-formed XML document. Well-formed XML documents follow certain rules, such as the requirement for every opening tag to have a matching closing tag. These rules are presented in the context of XHTML in Chapter 16.
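As a rough illustration of the closing-tag rule, the following Python sketch hands two strings to the standard library's xml.etree.ElementTree parser, which checks well-formedness only and does not validate against a DTD. The helper name is_well_formed is our own invention for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser rejects anything that breaks the well-formedness
        # rules, such as a tag that is never closed.
        return False
    return True


print(is_well_formed("<blah>harrumph</blah>"))  # closing tag present
print(is_well_formed("<blah>harrumph"))         # closing tag missing
```

The first call should report True and the second False. Passing this check says nothing about validity, which additionally requires a DTD.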
To be considered valid (a valid XML document conforms to a DTD), every XML document must have a corresponding set of XML declarations that define how the tags and content should be arranged within it. These declarations may be included directly in the XML document, or they may be stored separately in an XML DTD. If an XML DTD exists that defines the <blah> tag, our well-formed XML document is valid, provided you preface it with a <!DOCTYPE> tag that explains where to find the appropriate DTD:
<?xml version="1.0"?>
<!DOCTYPE blah SYSTEM "blah.dtd">
<blah>harrumph</blah>
The example document begins with the optional <?xml> directive declaring the version of XML it uses. It then uses the <!DOCTYPE> directive to identify the DTD that some automated system, such as a browser, uses to process and perhaps display the contents of the document. In this case, a DTD named blah.dtd should be accessible to the browser so that the browser can determine whether the <blah> tag is valid within the document.
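To see what a processing program learns from the <!DOCTYPE> directive, here is a small Python sketch using the standard library's xml.dom.minidom. It is only an illustration: minidom is a non-validating parser, so it records where the DTD is said to live but never opens blah.dtd or checks the document against it.

```python
from xml.dom import minidom

# The example document, with its DOCTYPE pointing at blah.dtd.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE blah SYSTEM "blah.dtd">
<blah>harrumph</blah>"""

doc = minidom.parseString(DOCUMENT)
doctype = doc.doctype

print(doctype.name)                 # root element named in the DOCTYPE
print(doctype.systemId)             # location given after SYSTEM
print(doc.documentElement.tagName)  # actual root element of the document
```

A validating processor would go one step further and fetch the file named by the system identifier to check the <blah> tag against it.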
XML DTDs contain only XML entity and element declarations. XML documents, on the other hand, may contain both XML element declarations and conventional content that uses those elements to create a document. This intermingling of content and declarations is perfectly acceptable to a computer processing an XML document, but it can get confusing for humans trying to learn about XML. For this reason, we focus our attention in this chapter on the XML entity and element declaration features that you can use to define new tags and document types. In other words, we are addressing only the DTD features of XML; the content features mirror the rules and requirements you already know and use in order to create HTML documents.
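The intermingling of declarations and content can be sketched in Python with a document that carries its own declarations inside the <!DOCTYPE> directive. The element declaration and the entity named noise below are assumptions made up for this example; the text does not say what blah.dtd actually contains. The standard library parser used here does not validate, but it does expand entities declared within the document.

```python
import xml.etree.ElementTree as ET

# Declarations (hypothetical) and content together in one document.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE blah [
  <!ELEMENT blah (#PCDATA)>
  <!ENTITY noise "harrumph">
]>
<blah>&noise;</blah>"""

root = ET.fromstring(DOCUMENT)

# The entity reference is replaced by its declared text during parsing.
print(root.tag, root.text)
```

Moving the two declarations into a separate file and pointing the <!DOCTYPE> at it would give the external DTD arrangement instead.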