Validating XML Documents







Problem

You want to make sure your XML document conforms to a schema, such as an XML Schema, a RelaxNG schema, or a DTD.

Solution

Use the DOM extension.

With existing DOM objects, call DOMDocument::schemaValidate( ) or DOMDocument::relaxNGValidate( ):

$file = 'address-book.xml';
$schema = 'address-book.xsd';
$ab = new DOMDocument;
$ab->load($file);

if ($ab->schemaValidate($schema)) {
    print "$file is valid.\n";
} else {
    print "$file is invalid.\n";
}

If your XML document specifies a DTD at the top, call DOMDocument::validate( ) to validate it against the DTD.

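With the schema in a string, call DOMDocument::schemaValidateSource( ) or DOMDocument::relaxNGValidateSource( ). If the XML is also in a string, load it with DOMDocument::loadXML( ):

$xml = '<person><firstname>Adam</firstname></person>';
// schemaValidateSource() expects the schema text itself, not a filename
$schema = file_get_contents('address-book.xsd');
$ab = new DOMDocument;
$ab->loadXML($xml);

if ($ab->schemaValidateSource($schema)) {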
    print "XML is valid.\n";
} else {
    print "XML is invalid.\n";
}

Discussion

Schemas are a way of defining a specification for your XML documents. While the goal is the same, there are multiple ways to encode a schema, each with a different syntax.

Some popular formats are DTDs (Document Type Definitions), XML Schema, and RelaxNG. DTDs have been around the longest, but they are not written in XML and have other limitations, so they can be difficult to work with. XML Schema and RelaxNG are more recent schema languages that attempt to solve some of the issues surrounding DTDs.

PHP 5 uses the libxml2 library to provide its validation support. Therefore, it lets you validate files against all three types. It is most flexible when you're using XML Schema and RelaxNG, but its XML Schema support is incomplete. You shouldn't run into issues in most XML Schema documents; however, you may find that libxml2 cannot handle some complex schemas or schemas that use more esoteric features.

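Within PHP, the DOM extension supports DTD, XML Schema, and RelaxNG validation. SimpleXML has no validation methods of its own, so a SimpleXML document must be imported into DOM (for example, with dom_import_simplexml( )) before it can be validated.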

Validating any file using DOM is a similar process, regardless of the underlying schema format. To validate, call a validation method on a DOM object (see the example that follows). It returns true if the file passes. If there's an error, it returns false and emits a PHP warning. To capture the error messages instead, call libxml_use_internal_errors(true) before validating and read them with libxml_get_errors( ).

Validating an XML document

$file = 'address-book.xml';
$schema = 'address-book.xsd';
$ab = new DOMDocument;
$ab->load($file);

if ($ab->schemaValidate($schema)) {
    print "$file is valid.\n";
} else {
    print "$file is invalid.\n";
}

If the schema is stored in a string, use DOMDocument::schemaValidateSource( ) instead of schemaValidate( ).
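Validation failures can also be collected in code rather than left as warnings. The following is a minimal sketch, assuming PHP 5.1 or later, where libxml_use_internal_errors( ) and libxml_get_errors( ) are available. The inline schema and the deliberately invalid document are made up for illustration and are not the address-book files used above.

```php
<?php
// Hypothetical schema: a <person> must contain exactly one <firstname>.
$schema = <<<XSD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

// Hypothetical document that breaks the schema (wrong child element).
$xml = '<person><lastname>Smith</lastname></person>';

// Buffer libxml errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadXML($xml);

$valid  = $doc->schemaValidateSource($schema);
$errors = libxml_get_errors();   // array of LibXMLError objects

// Empty the buffer and restore the earlier error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

if ($valid) {
    print "XML is valid.\n";
} else {
    foreach ($errors as $error) {
        printf("Line %d: %s\n", $error->line, trim($error->message));
    }
}
```

Restoring the previous setting at the end keeps the change local, so other code that relies on the default warning behavior is not affected.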

The following table lists all the validation methods.

DOM schema validation methods

Method name              Schema type   Data location
schemaValidate           XML Schema    File
schemaValidateSource     XML Schema    String
relaxNGValidate          RelaxNG       File
relaxNGValidateSource    RelaxNG       String
validate                 DTD           N/A


All of the validation methods behave in a similar manner, so you only need to switch the method name in the previous example to switch to a different validation scheme.
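As an illustration of swapping the method name, here is a rough sketch that validates against a RelaxNG schema held in a string. The tiny schema and document are invented for this example, and it assumes the underlying libxml2 build includes RelaxNG support.

```php
<?php
// Hypothetical RelaxNG schema (XML syntax): <person> holds one <firstname>.
$rng = <<<RNG
<element name="person" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="firstname">
    <text/>
  </element>
</element>
RNG;

$ab = new DOMDocument;
$ab->loadXML('<person><firstname>Adam</firstname></person>');

// Same pattern as schemaValidateSource(); only the method name changes.
$ok = $ab->relaxNGValidateSource($rng);

print $ok ? "XML is valid.\n" : "XML is invalid.\n";
```

With the schema in a file instead, the call would be relaxNGValidate( ) with the filename as its argument.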

Both XML Schema and RelaxNG support validation against files and strings. DTD validation, by contrast, works only against the DTD declared at the top of the XML document itself, so validate( ) takes no arguments.
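DTD validation can be sketched the same way. The example below assumes a document that carries its DTD in an internal subset; the element names are illustrative only.

```php
<?php
// Hypothetical document with its DTD declared at the top.
$xml = <<<XML
<!DOCTYPE person [
  <!ELEMENT person (firstname)>
  <!ELEMENT firstname (#PCDATA)>
]>
<person><firstname>Adam</firstname></person>
XML;

$ab = new DOMDocument;
$ab->loadXML($xml);

// validate() checks the document against the DTD it declares.
$ok = $ab->validate();

print $ok ? "XML is valid.\n" : "XML is invalid.\n";
```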

See Also

The XML Schema specification at http://www.w3.org/XML/Schema; the Relax NG specification at http://www.relaxng.org/.


