Other Application Areas of XML

XML is powerful and flexible enough that many groups are considering it for a variety of purposes. Let us look at three of them.

  • Metadata—Using XML to describe meta information about other documents or online resources

  • Configuration files—Using XML to describe the configuration parameters of software

  • Rich documents—Using XML to customize and enrich document description

1 Metadata

In the first half of 1997, XML was mainly thought of as the language for metadata. Metadata refers to information about data, such as title, author, file size, creation date, revision history, and keywords. Metadata can be used for searches, information filtering, document management, and so on.

To show the usefulness of explicit metadata, let us assume that we want to search for documents that were written by Bill Clinton. If we enter "Bill Clinton" as the search keyword in current search engines, we will get thousands of hits that contain the phrase. Most of the hits will be noise: articles that merely mention Bill Clinton in the body rather than articles written by him. The search would be much more productive if we could express the query as "find documents whose Author element contains the words 'Bill Clinton'."

Unfortunately, no such element, or tag, is defined in HTML, and it is unlikely that HTML will be extended with a new Author tag in the near future. The first reason is historical. HTML was extended too rapidly as a result of the "browser war" between Netscape and Microsoft. W3C seems to have been more cautious about further extensions to HTML since HTML 4.0 was released in December 1997 (and revised in April 1998).

The second reason is that an HTML extension would not solve all the problems with metadata. Other resources, such as image files, audio and video files, and other content types, may require metadata extensions as well.

The third reason concerns performance. HTML has the TITLE and META tags, which can accommodate some metadata. Because these tags are inside an HTML document, search engines cannot refer to the information without downloading the entire HTML file. It is not efficient to download a 100KB HTML file just to check whether the TITLE tag contains a certain character string, particularly when hundreds of such files are available from a Web site. If we could put the metadata of all the documents available on the site into a single metafile, the search performance would greatly improve.
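The idea of a single site-wide metafile, combined with the Author query described earlier, can be sketched as follows. This is only an illustration: the element names (Metadata, Document, Title, Author), the href values, and the document entries are all hypothetical and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical metafile: one <Document> entry per page on the site.
# The vocabulary and the entries are invented for illustration.
METAFILE = """\
<Metadata>
  <Document href="/docs/a.html">
    <Title>Remarks on Education</Title>
    <Author>Bill Clinton</Author>
  </Document>
  <Document href="/docs/b.html">
    <Title>Commentary on Bill Clinton's Remarks</Title>
    <Author>Jane Doe</Author>
  </Document>
</Metadata>
"""

def find_by_author(metafile_xml, name):
    """Return the href of every document whose Author element contains name."""
    root = ET.fromstring(metafile_xml)
    return [doc.get("href") for doc in root.iter("Document")
            if name in (doc.findtext("Author") or "")]

def find_by_keyword(metafile_xml, phrase):
    """Naive match against all text in each entry, for comparison."""
    root = ET.fromstring(metafile_xml)
    return [doc.get("href") for doc in root.iter("Document")
            if phrase in "".join(doc.itertext())]

if __name__ == "__main__":
    print(find_by_author(METAFILE, "Bill Clinton"))
    print(find_by_keyword(METAFILE, "Bill Clinton"))
```

The keyword search matches both entries, while the search restricted to the Author element matches only the first. Neither search has to download the documents themselves.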

For these reasons, external metadata descriptions have received a lot of attention. Because of its extensibility, flexibility, and readability, XML is considered to be the best method for defining metadata syntax. The Resource Description Framework (RDF) is one example of a metadata format defined in XML.
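To give a rough feel for what RDF metadata looks like in XML, the sketch below parses a small RDF description. The choice of the Dublin Core vocabulary (dc:title, dc:creator) is an assumption made for this example, and the described URL and its values are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core, assumed for this example

# Hypothetical RDF description of a single document.
RDF_XML = f"""\
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://www.example.com/docs/a.html">
    <dc:title>Remarks on Education</dc:title>
    <dc:creator>Bill Clinton</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

def creators(rdf_xml):
    """Map each described resource URL to the text of its dc:creator."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        result[about] = desc.findtext(f"{{{DC_NS}}}creator")
    return result

if __name__ == "__main__":
    print(creators(RDF_XML))
```

Because the metadata names the resource it describes through rdf:about, the same approach can describe images, audio, or video files that have no place to embed tags of their own.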

2 Configuration Files

Another application area of XML is as a language for software configuration files. As software grows more complex, so do its configuration files. If a configuration file has multiple sections, if some fields require complex data, or if support for international character sets is mandatory, it makes a lot of sense to use XML as the language for the configuration file. For example, Tomcat, an application server that we will use in Chapter 10, uses XML extensively in its configuration files.
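As a minimal sketch of these three properties (multiple sections, nested data, and international characters), consider the configuration file below. Its format is entirely hypothetical and is not Tomcat's; the element and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration file with several sections, a nested field,
# and a value written in Japanese.
CONFIG = """\
<?xml version="1.0" encoding="UTF-8"?>
<config>
  <server host="localhost" port="8080"/>
  <logging level="info">
    <file>app.log</file>
  </logging>
  <locale lang="ja">
    <siteName>東京オフィス</siteName>
  </locale>
</config>
"""

def load_config(xml_text):
    """Parse the hypothetical configuration into a flat dictionary."""
    # Encode to bytes so the parser honors the declared UTF-8 encoding.
    root = ET.fromstring(xml_text.encode("utf-8"))
    server = root.find("server")
    return {
        "host": server.get("host"),
        "port": int(server.get("port")),
        "log_level": root.find("logging").get("level"),
        "log_file": root.findtext("logging/file"),
        "site_name": root.findtext("locale/siteName"),
    }

if __name__ == "__main__":
    print(load_config(CONFIG))
```

The encoding declaration in the first line of the file is what lets any conforming XML parser read the non-ASCII site name without application-specific handling.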

3 Rich Documents

XML was originally developed as a simple subset of the Standard Generalized Markup Language, or SGML (ISO 8879), which was defined in 1986 as an international standard for document markup. It is therefore only natural that XML is also used for document markup. Extensible Hypertext Markup Language (XHTML), an XML-compliant version of HTML, is one such example. The entire manuscript of this book was also written in XML.[5]

[5] It is processed by a tool called SmartDoc (see http://www.asahi-net.or.jp/~dp8t-asm/java/tools/SmartDoc/index.html for details).

One popular use of XML as a document markup language is Web publishing, in which a content provider prepares original content marked up in XML and publishes it through Web sites. Before publication, the content provider uses a tool such as an XSLT processor to transform the content into a set of HTML files whose design is customized to its customers' requirements. We briefly touch on a tool that automates this process in Chapter 10.
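The transformation step can be sketched as follows. Note the substitution: the Python standard library has no XSLT processor, so a plain Python function stands in for the stylesheet here. The source vocabulary (article, title, para) and the css_class parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document in a simple, invented XML vocabulary.
ARTICLE = """\
<article>
  <title>Quarterly Update</title>
  <para>Sales rose in the third quarter.</para>
  <para>A new office opened in Osaka.</para>
</article>
"""

def to_html(article_xml, css_class):
    """Turn the source XML into an HTML page styled for one customer.

    This function plays the role an XSLT stylesheet would play in a
    real publishing pipeline.
    """
    src = ET.fromstring(article_xml)
    html = ET.Element("html")
    body = ET.SubElement(html, "body", {"class": css_class})
    ET.SubElement(body, "h1").text = src.findtext("title")
    for para in src.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return ET.tostring(html, encoding="unicode", method="html")

if __name__ == "__main__":
    # The same source can be rendered with a different design per customer.
    print(to_html(ARTICLE, "plain"))
    print(to_html(ARTICLE, "corporate"))
```

The point of the design is that the content is written once, while the presentation is chosen at transformation time.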

Although metadata and document markup are significant application areas of XML, in this book we concentrate mostly on the use of XML for B2B messaging.
