Feb. 14, 2011, 5:28 a.m.
posted by un
Selecting the XML Format
Use a Standard or Roll Your Own?
There are always benefits to not reinventing the wheel, and that rule holds when you are considering schemas for common business documents. Sadly, though, the wheel you're looking for may not have been invented yet. This is one area where I will offer fairly unqualified guidance: if you can find a schema that fits your purposes and is accepted as a standard in your community, then by all means use it. If you can't find one that fits, or if you find more than one and there is no consensus in your trading community or user base about which one is "standard," then you have a few different ways you can go.
If one or more schemas might work but aren't universally accepted, there are a few criteria you can use to choose among them. One is the number and quality of the programs, stylesheets, sample documents, and documentation available for each schema. Another is whether you can use the schema on a royalty-free basis. If an existing schema is attractive on these counts yet doesn't fully accommodate your data, it may still be a good fit if you can easily customize or extend it. If all else fails, don't hesitate to develop your own schema.
Choosing or creating an XML format is an important decision, but your particular choice is rarely critical. If people don't like the formats you have selected, they can always transform them (particularly if you make it easy for them, as we'll discuss later in the chapter). Go back and review the first part of Chapter 10 on XSLT if you like; transformations are something we are going to live with for several years to come, if not forever. The rest of this section deals with general issues in designing your formats.
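To make the point about transformations concrete, here is a minimal sketch of mapping one order format onto another. It uses Python's standard `xml.etree.ElementTree` module as a stand-in for an XSLT stylesheet, and every element and attribute name in it (`Order`, `PurchaseOrder`, `sku`, and so on) is a hypothetical chosen for illustration, not taken from any real standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-house format; element names are illustrative only.
IN_HOUSE = """<Order>
  <OrderNo>1001</OrderNo>
  <Cust>ACME</Cust>
  <Item sku="A-1" qty="2"/>
  <Item sku="B-7" qty="5"/>
</Order>"""


def to_partner_format(source_xml: str) -> str:
    """Map the in-house order onto a (hypothetical) partner layout."""
    src = ET.fromstring(source_xml)

    # The order number moves from a child element to an attribute.
    out = ET.Element("PurchaseOrder", {"number": src.findtext("OrderNo", default="")})
    ET.SubElement(out, "Buyer").text = src.findtext("Cust", default="")

    # Attribute-style line items become element-style line items.
    lines = ET.SubElement(out, "Lines")
    for item in src.findall("Item"):
        line = ET.SubElement(lines, "Line")
        ET.SubElement(line, "PartNumber").text = item.get("sku", "")
        ET.SubElement(line, "Quantity").text = item.get("qty", "")

    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(to_partner_format(IN_HOUSE))
```

The same mapping could be written as a stylesheet. Either way, a recipient who prefers a different layout can bridge the gap without asking you to change your format.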
General Document Design Decisions
In Chapter 4 we discussed various issues regarding the design of XML documents that are independent of their schema representation. Here are a few of them again, from the perspective of designing your own documents.
Again, this list is not exhaustive. These are some of the major decisions you'll have to make, but I'm sure there will be others. The next issue has more to do with the data that you include in a document than with a particular set of design choices.
Providing Identifying Information
This is probably more of a concern for electronic commerce applications than for application integration, but it may be an issue there too depending on the situation. In the EDI world, most EDI management systems rely on one or a few specific fields in an outbound document to look up the EDI-related details about the trading partner. This is usually a customer or vendor number and is used as a key to the trading partner setup in the EDI system.
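As a rough illustration of that lookup, the sketch below pulls a customer number out of an outbound document and uses it as the key into a trading partner table. The table contents, the `Header/CustomerNumber` path, and the function name are all assumptions made for the example; a real EDI management system keeps this setup in its own store.

```python
import xml.etree.ElementTree as ET

# Hypothetical trading partner setup, keyed by customer number.
PARTNER_SETUP = {
    "C-0042": {"name": "Acme Supply", "transport": "ftp", "format": "partner-a"},
    "C-0077": {"name": "Bolt Hardware", "transport": "http", "format": "partner-b"},
}


def lookup_partner(document_xml: str, key_path: str = "Header/CustomerNumber") -> dict:
    """Find the partner setup using one identifying field in the document."""
    root = ET.fromstring(document_xml)
    key = root.findtext(key_path)
    if key is None or not key.strip():
        raise ValueError(f"document has no identifying field at {key_path!r}")
    key = key.strip()
    if key not in PARTNER_SETUP:
        raise LookupError(f"no trading partner setup for {key!r}")
    return PARTNER_SETUP[key]


if __name__ == "__main__":
    invoice = (
        "<Invoice><Header><CustomerNumber>C-0042</CustomerNumber></Header></Invoice>"
    )
    print(lookup_partner(invoice)["name"])
```

If the identifying field is missing, the document cannot be routed at all, which is why it pays to decide early which field plays this role.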
Although e-commerce using XML is still evolving, things may work somewhat differently there than they have in the EDI world. Utilities that move XML around tend to treat documents as payload and not look inside them, unlike EDI management systems, which must examine the application data in order to transform it. The strategy you follow will depend on the particular methods or systems used for data transport. The bottom line is that you may be required to provide identifying information either within a document or by some external means, such as specific file names, locations, or key values passed in method calls.
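For the external case, here is a small sketch in which the transport layer never parses the XML and takes its routing details from the file name instead. The `<partner>_<doctype>_<anything>.xml` naming convention and the names `Routing`, `routing_from_filename`, and `dispatch` are hypothetical; your transport system will dictate its own convention.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Tuple


@dataclass(frozen=True)
class Routing:
    partner_id: str
    doc_type: str


def routing_from_filename(path: str) -> Routing:
    """Derive routing from a hypothetical '<partner>_<doctype>_<anything>.xml' name."""
    stem = PurePath(path).stem
    parts = stem.split("_", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"file name {path!r} does not follow the naming convention")
    return Routing(partner_id=parts[0], doc_type=parts[1])


def dispatch(path: str, payload: bytes) -> Tuple[Routing, bytes]:
    """Pair the routing details with the untouched payload; the XML is never parsed."""
    return routing_from_filename(path), payload


if __name__ == "__main__":
    routing, body = dispatch("outbound/C-0042_invoice_20110214.xml", b"<Invoice/>")
    print(routing.partner_id, routing.doc_type, len(body))
```

The trade-off is that the identifying information now lives outside the document, so it is lost if the file is renamed or moved through a system that does not preserve names.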
Schema design is a very broad and complex topic. As much as I would like to offer you my knowledge and opinions about it, I'm afraid it would qualify for another complete chapter, if not another book. I have to set an appropriate scope somewhere for a book that has already turned out longer than planned, so I'm going to point you elsewhere for details about schema design. There are several good resources listed at the end of the chapter. Beyond that I'll offer only a few general observations.
Despite what some authorities may tell you, no single technique is right for all circumstances. If you're going to design only one or a few schemas, probably any approach that provides the required validation will be adequate. This includes letting an IDE like XMLSPY or TurboXML generate a schema from a representative instance document as I discussed in Chapter 6. On the other end of the spectrum, if you are designing several documents for a fairly large system, you would be well served to take a more disciplined approach. I favor creating type library schemas containing simple and complex types that are reused in other schemas, similar to the approach discussed in Chapter 4 that was considered by X12 and OASIS. However, there are certainly other techniques.
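To show roughly what a type library looks like, the sketch below holds two small schemas as strings: a library that defines only reusable types, and a document schema that includes the library and builds its elements from those types. All type, element, and file names here are hypothetical. Python's standard library has no schema validator, so the helper functions only confirm that every type the document schema refers to is actually defined in the library; they do not validate anything.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical type library: reusable simple and complex types, no elements.
TYPE_LIBRARY = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="PartNumberType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="20"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="Street" type="xs:string"/>
      <xs:element name="City" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

# Hypothetical document schema that includes the library and reuses its types.
INVOICE_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="TypeLibrary.xsd"/>
  <xs:element name="Invoice">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="BillTo" type="AddressType"/>
        <xs:element name="ShipTo" type="AddressType"/>
        <xs:element name="Part" type="PartNumberType" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def library_types(library_xsd):
    """Names of the top-level simple and complex types the library defines."""
    root = ET.fromstring(library_xsd)
    wanted = (XS + "simpleType", XS + "complexType")
    return {child.get("name") for child in root if child.tag in wanted}


def unresolved_types(schema_xsd, library_xsd):
    """Unprefixed type names the schema uses that the library does not define.

    Prefixed names such as xs:string are treated as built-ins and skipped;
    named types declared inside the document schema itself are not considered.
    """
    defined = library_types(library_xsd)
    root = ET.fromstring(schema_xsd)
    used = {e.get("type") for e in root.iter(XS + "element") if e.get("type")}
    return {name for name in used if ":" not in name and name not in defined}


if __name__ == "__main__":
    print(sorted(library_types(TYPE_LIBRARY)))
    print(unresolved_types(INVOICE_SCHEMA, TYPE_LIBRARY))
```

The benefit shows up as the system grows: an address or part number is defined once in the library, and every document schema that includes it stays consistent.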