Handling XML Documents in a Web Service






Up to now, this chapter has addressed issues applicable to all Web service implementations. There are additional considerations when a Web service implementation expects to receive from a client an XML document that contains all the information for a request, and the service uses that document to start a business process to handle the request. There are several reasons why it is appropriate to exchange documents:

  • Documents, especially business documents, may be very large, and as such, they are often sent as a batch of related information. They may be compressed independently from the SOAP message.

  • Documents may be legally binding business documents. At a minimum, their original form needs to be preserved through the exchange and, more than likely, they need to be archived and kept as evidence in case of disagreement. For these documents, the complete infoset of the original document should be preserved, including comments and external entity references (as well as the referred entities).

  • Some application processing requires the complete document infoset, including comments and external entity references. As with legally binding documents, the complete infoset of the original document must then be preserved.

  • Sending documents as attachments makes it possible to handle documents that conform to schemas expressed in languages not supported by the Web service endpoint, or that contain constructs prohibited within a SOAP message infoset (such as the Document Type Declaration <!DOCTYPE> for a DTD-based schema).

For example, consider the travel agency Web service, which typically receives a client request as an XML document containing all information needed to arrange a particular trip. The information in the document includes details about the customer's account, credit card status, desired travel destinations, preferred airlines, class of travel, dates, and so forth. The Web service uses the document's contents to perform such steps as verifying the customer's account, obtaining authorization for the credit card, checking accommodations and transportation availability, building an itinerary, and purchasing tickets.

In essence, the service, which receives the request with the XML document, starts a business process to perform a series of steps to complete the request. The contents of the XML document are used throughout the business process. Handling this type of scenario effectively requires some considerations in addition to the general ones for all Web services.

Good design expects XML documents to be received as javax.xml.transform.Source objects. See "Exchanging XML Documents" on page 107, which discusses exchanging XML documents as parameters. Keep in mind the effect on interoperability (see "Interoperability" on page 86).

It is good design to do the validation and any required transformation of the XML documents as close to the endpoint as possible. Validation and transformation should be done before applying any processing logic to the document content. See Figure and the discussion on receiving requests in "Receiving Requests" on page 89.
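As a minimal sketch of validating at the endpoint before any processing logic runs, the following class checks an incoming document against an XML Schema. It is an illustration under stated assumptions, not the chapter's implementation: it uses the javax.xml.validation API, which was added to the Java platform after the J2EE 1.4 release discussed here, and the class name EndpointValidator, the method isValid, and the tiny order schema are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class EndpointValidator {
    // Hypothetical, deliberately tiny schema used only for this illustration.
    private static final String ORDER_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='destination' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    private final Schema schema;

    EndpointValidator() {
        try {
            // Compile the schema once; a Schema object can be reused across requests.
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(ORDER_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    /** Returns true only if the document is well formed and valid against the schema. */
    boolean isValid(String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for both malformed and schema-invalid documents.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An endpoint could call isValid first and reject the request immediately when it returns false, so that the business logic only ever sees documents that passed the check.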

It is important to consider the processing time for a request and whether the client waits for the response. When a service expects an XML document as input and starts a lengthy business process based on the document contents, then clients typically do not want to wait for the response. Good design when processing time may be extensive is to delegate a request to a JMS queue or topic and return a correlation identifier for the client's future reference. (Recall Figure on page 96 and its discussion.)
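The delegation idea can be sketched as follows. This is only an illustration: an in-memory BlockingQueue stands in for the JMS queue so that the example runs without a JMS provider, and the names AsyncOrderEndpoint, QueuedRequest, submit, and nextRequest are hypothetical.

```java
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncOrderEndpoint {
    /** A queued request: the correlation identifier plus the document it refers to. */
    static final class QueuedRequest {
        final String correlationId;
        final String document;

        QueuedRequest(String correlationId, String document) {
            this.correlationId = correlationId;
            this.document = document;
        }
    }

    // Stand-in for a JMS queue (assumption made for this sketch only).
    private final BlockingQueue<QueuedRequest> queue = new LinkedBlockingQueue<>();

    /** Accepts the document, queues it for later processing, and returns at once. */
    String submit(String document) {
        String correlationId = UUID.randomUUID().toString();
        queue.add(new QueuedRequest(correlationId, document));
        return correlationId;
    }

    /** Called by the back-end worker; returns null when nothing is waiting. */
    QueuedRequest nextRequest() {
        return queue.poll();
    }
}
```

The client keeps the returned identifier and presents it later to ask about the outcome, while the lengthy business process consumes requests from the queue at its own pace.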

The following sections discuss other considerations.

1 Exchanging XML Documents

As noted earlier, there are times when you may have to exchange XML documents as part of your Web service and such documents are received as parameters of a method call. The J2EE platform provides three ways to exchange XML documents.

The first option is to use the Java-MIME mappings provided by the J2EE platform. See Figure on page 75. With this option, the Web service endpoint receives documents as javax.xml.transform.Source objects. (See Code Figure on page 75.) Along with the document, the service endpoint can also expect to receive other JAX-RPC arguments containing metadata, processing requirements, security information, and so forth. When an XML document is passed as a Source object, the container automatically handles the document as an attachment—effectively, the container implementation handles the document-passing details for you. This frees you from the intricacies of sending and retrieving documents as part of the endpoint's request/response handling.

Passing XML documents as Source objects is the most effective option in a completely Java-based environment (one in which all Web service clients are based on Java). However, sending documents as Source objects may not be interoperable with non-Java clients. (As already noted in the section "Interoperability" on page 86, standard ways to exchange attachments are currently being formulated. Future versions of the J2EE platform will incorporate these standards once they are final.)
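Once an endpoint holds a Source, one common way to work with it is to copy it into a DOM tree with an identity transform. The following is a minimal sketch of that step using the standard JAXP transformation API; the class name SourceDocuments and the method toDom are hypothetical, and a real endpoint might prefer a streaming approach for very large documents.

```java
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import org.w3c.dom.Document;

final class SourceDocuments {
    /** Copies any kind of Source (stream, SAX, or DOM) into a new DOM Document. */
    static Document toDom(Source source) {
        try {
            DOMResult result = new DOMResult();
            // A Transformer created without a stylesheet performs an identity copy.
            TransformerFactory.newInstance().newTransformer().transform(source, result);
            return (Document) result.getNode();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Document could not be read", e);
        }
    }
}
```

Because the method accepts the Source interface, the endpoint does not need to know which concrete Source type the container supplied.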

The second option is to design your service endpoint such that it receives documents as String types. Code Figure shows the WSDL description for a service that receives documents as String types, illustrating how the WSDL maps the XML document.

Code Figure. Mapping XML Document to xsd:string

<?xml version="1.0" encoding="UTF-8"?>
<definitions ...>
   <types/>
   <message name="PurchaseOrderService_submitPurchaseOrder">
      <part name="PurchaseOrderXMLDoc" type="xsd:string"/>
   </message>
   <message
          name="PurchaseOrderService_submitPurchaseOrderResponse">
      <part name="result" type="xsd:string"/>
   </message>
   <portType name="PurchaseOrderService">
      <operation name="submitPurchaseOrder"
                       parameterOrder="PurchaseOrderXMLDoc">
         <input
         message="tns:PurchaseOrderService_submitPurchaseOrder"/>
         <output message=
         "tns:PurchaseOrderService_submitPurchaseOrderResponse"/>
      </operation>
   </portType>
   ...
</definitions>


Code Figure shows the equivalent Java interface for the WSDL shown in Code Figure.

Code Figure. Receiving an XML Document as a String object

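import java.rmi.Remote;
import java.rmi.RemoteException;

// InvalidOrderException is an application-defined exception (not shown here).
public interface PurchaseOrderService extends Remote {
   public String submitPurchaseOrder(String poDocument)
          throws RemoteException, InvalidOrderException;
}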


If you are developing your service using the Java-to-WSDL approach, and the service must exchange XML documents and be interoperable with clients on any platform, then passing documents as String objects may be your only option.

There may be a performance drawback to sending an XML document as a String object: as the document size grows, so does its String equivalent, and with it the payload size of the message you send. In addition, the XML document loses its original format, since sending a document as a String object sends it in a canonical format.
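With the String option, the endpoint itself must turn the received text back into a document before it can use it. A minimal sketch of that step with the standard JAXP DOM parser follows; the class name StringDocuments and the method parse are hypothetical, and reporting malformed input through an unchecked exception is just one possible choice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class StringDocuments {
    /** Parses the String parameter received by the endpoint into a DOM Document. */
    static Document parse(String poDocument) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(poDocument)));
        } catch (Exception e) {
            // Covers parser configuration problems, I/O errors, and malformed XML.
            throw new IllegalArgumentException("Not a well-formed XML document", e);
        }
    }
}
```

Note that the whole document is held in memory twice during parsing, once as the String and once as the DOM tree, which is part of the cost this option carries for large documents.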

The third option is to exchange the XML document as a SOAP document fragment. With this option, you map the XML document to xsd:anyType in the service's WSDL file.

It is recommended that Web services exchange XML documents as SOAP document fragments because passing XML documents in this manner is both portable across J2EE implementations and interoperable with all platforms.

To pass SOAP document fragments, you must implement your service using the WSDL-to-Java approach.

For example, the travel agency service receives an XML document representing a purchase order that contains all details about the customer's preferred travel plans. To implement this service, you define the WSDL for the service and, in the WSDL, you map the XML document type as xsd:anyType. See Code Figure.

Code Figure. Mapping XML document to xsd:anyType

<?xml version="1.0" encoding="UTF-8"?>
<definitions ...>
   <types/>
   <message name="PurchaseOrderService_submitPurchaseOrder">
      <part name="PurchaseOrderXMLDoc" type="xsd:anyType"/>
   </message>
   <message
          name="PurchaseOrderService_submitPurchaseOrderResponse">
      <part name="result" type="xsd:string"/>
   </message>
   <portType name="PurchaseOrderService">
      <operation name="submitPurchaseOrder"
                       parameterOrder="PurchaseOrderXMLDoc">
         <input
         message="tns:PurchaseOrderService_submitPurchaseOrder"/>
         <output message=
         "tns:PurchaseOrderService_submitPurchaseOrderResponse"/>
      </operation>
   </portType>
   ...
</definitions>


A WSDL mapping of the XML document type to xsd:anyType requires the platform to map the document parameter as a javax.xml.soap.SOAPElement object. For example, Code Figure shows the Java interface generated for the WSDL description in Code Figure.

Code Figure. Java Interface for WSDL in Code Figure

import java.rmi.Remote;
import java.rmi.RemoteException;
import javax.xml.soap.SOAPElement;

public interface PurchaseOrderService extends Remote {
   public String submitPurchaseOrder(SOAPElement purchaseOrderXMLDoc)
          throws RemoteException;
}


In this example, the SOAPElement parameter in submitPurchaseOrder represents the SOAP document fragment sent by the client. For the travel agency service, this is the purchase order. The service can parse the received SOAP document fragment using the javax.xml.soap.SOAPElement API. Alternatively, the service can use JAXB to map the document fragment to a Java object or transform it to another schema. A client of this Web service builds the purchase order document using the client platform-specific API for building SOAP document fragments (on the Java platform, this is the javax.xml.soap.SOAPElement API) and sends the document as one of the Web service's call parameters.
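As a rough sketch of reading values out of a received fragment, the helper below works against org.w3c.dom.Element. That choice is an assumption made so the example runs on a plain JDK without a SOAP runtime: in SAAJ 1.2 and later, SOAPElement extends org.w3c.dom.Element, so the same childText method should accept a SOAPElement. The names FragmentReader, childText, and parse, and the element names in the usage note, are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FragmentReader {
    /** Returns the text of the first descendant named childName, or null if absent. */
    static String childText(Element fragment, String childName) {
        NodeList matches = fragment.getElementsByTagName(childName);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    /** Demo helper only: builds an Element from text in place of a real SOAP fragment. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }
}
```

For a purchase order fragment with a Destination child, childText(fragment, "Destination") returns that element's text, and a name that does not occur in the fragment yields null.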

When using the WSDL-to-Java approach, you can directly map the document to be exchanged to its appropriate schema in the WSDL. The corresponding generated Java interface represents the document as its equivalent Java object. As a result, the service endpoint never sees the document that is exchanged in its original document form. It also means that the endpoint is tightly coupled to the document's schema: Any change in the document's schema requires a corresponding change to the endpoint. If you do not want such tight coupling, consider using xsd:anyType to map the document.

2 Separating Document Manipulation from Processing Logic

When your service's business logic operates on the contents of an incoming XML document, the business processing logic must at a minimum read the document, if not modify it. By separating the document manipulation logic from the processing logic, a developer can switch between various document manipulation mechanisms without affecting the processing logic. In addition, the work divides cleanly along developer skills: those who know XML processing can own the document manipulation code, while others concentrate on the business logic.

It is a good practice to separate the XML document manipulation logic from the business logic.

The "Abstracting XML Processing from Application Logic" section on page 155 provides more information on how to accomplish this separation and its merits.
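One way to picture this separation is to put an interface between the two kinds of code. The sketch below is only an illustration under assumed names: PurchaseOrderView, PurchaseOrderReader, DomPurchaseOrderReader, and ItineraryPlanner are hypothetical, as are the customerId and destination elements. The processing logic depends on the reader interface alone, so a DOM-based reader could later be swapped for a SAX-based or JAXB-based one without touching it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** What the business logic needs from a purchase order; no XML types leak through. */
interface PurchaseOrderView {
    String customerId();
    String destination();
}

/** Hypothetical boundary between document manipulation and processing logic. */
interface PurchaseOrderReader {
    PurchaseOrderView read(String xml);
}

/** One interchangeable implementation, here based on DOM. */
final class DomPurchaseOrderReader implements PurchaseOrderReader {
    @Override
    public PurchaseOrderView read(String xml) {
        final Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Unreadable purchase order", e);
        }
        final String customerId = text(root, "customerId");
        final String destination = text(root, "destination");
        return new PurchaseOrderView() {
            @Override public String customerId() { return customerId; }
            @Override public String destination() { return destination; }
        };
    }

    private static String text(Element root, String name) {
        NodeList matches = root.getElementsByTagName(name);
        if (matches.getLength() == 0) {
            throw new IllegalArgumentException("Missing element: " + name);
        }
        return matches.item(0).getTextContent();
    }
}

/** Processing logic: depends only on the reader interface, never on an XML API. */
final class ItineraryPlanner {
    private final PurchaseOrderReader reader;

    ItineraryPlanner(PurchaseOrderReader reader) {
        this.reader = reader;
    }

    String plan(String xml) {
        PurchaseOrderView order = reader.read(xml);
        return "Trip to " + order.destination() + " for customer " + order.customerId();
    }
}
```

Constructing the planner with new ItineraryPlanner(new DomPurchaseOrderReader()) wires the two halves together; only that one line changes if a different document manipulation mechanism is chosen.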

3 Fragmenting XML Documents

When your service's business logic operates on the contents of an incoming XML document, it is a good idea to break XML documents into logical fragments when appropriate. When the processing logic receives an XML document that contains all information for processing a request, the XML document usually has well-defined segments for different entities, and each segment contains the details about a specific entity.

Rather than pass the entire document to different components handling various stages of the business process, it's best if the processing logic breaks the document into fragments and passes only the required fragments to other components or services that implement portions of the business process logic.

See "Fragmenting Incoming XML Documents" on page 153 for more details on fragmentation.
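A minimal sketch of fragmentation with the DOM API follows. It assumes, purely for illustration, that each child element of the document root is one logical segment; the class name DocumentFragmenter, the method split, and the element names in the usage note are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DocumentFragmenter {
    /** Splits the root's child elements into standalone documents keyed by element name. */
    static Map<String, Document> split(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document whole = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, Document> fragments = new LinkedHashMap<>();
            for (Node child = whole.getDocumentElement().getFirstChild();
                 child != null; child = child.getNextSibling()) {
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text and comments between segments
                }
                // Deep-copy the segment into its own document so it can travel alone.
                Document fragment = builder.newDocument();
                fragment.appendChild(fragment.importNode(child, true));
                fragments.put(child.getNodeName(), fragment);
            }
            return fragments;
        } catch (Exception e) {
            throw new IllegalArgumentException("Document could not be fragmented", e);
        }
    }
}
```

Given a request whose root contains Account, CreditCard, and Itinerary children, the component that authorizes the credit card would receive only the CreditCard fragment. If segment names can repeat, a list-valued map would be needed instead, since this sketch keeps only the last fragment for each name.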

4 Using XML

XML, while it has many benefits, also has performance disadvantages. You should weigh the trade-offs of passing XML documents through the business logic processing stages. The pros and cons of passing XML documents take on greater significance when the business logic implementation spans multiple containers. Refer to Chapter 5, specifically the section entitled "Use XML Judiciously" on page 194, which provides guidelines on this issue. Following these guidelines may help minimize the performance overhead that comes with passing XML documents through workflow stages.

Also, when deciding on an approach, keep in mind the costs involved for using XML and weigh them along with the recommendations on parsing, validation, and binding documents to Java objects. See Chapter 4 for a discussion of these topics.

5 Using JAXM and SAAJ Technologies

The J2EE platform provides an array of technologies that enable message and document exchanges with SOAP, including mandatory technologies such as JAX-RPC and SAAJ and optional technologies such as the Java API for XML Messaging (JAXM). Each of these J2EE technologies offers a different level of support for SOAP-based messaging and communication. (See Chapter 2 for the discussion on JAX-RPC and SAAJ.)

An obvious question that arises is: Why not use JAXM or SAAJ technologies in scenarios where you have to pass XML documents? If you recall:

  • SAAJ lets developers deal directly with SOAP messages, and is best suited for point-to-point messaging environments. SAAJ is better for developers who want more control over the SOAP messages being exchanged and for developers using handlers.

  • JAXM defines an infrastructure for guaranteed delivery of messages. It provides a way of sending and receiving XML documents and guaranteeing their receipt, and is designed for use cases that involve storing and forwarding XML documents and messages.

SAAJ is considered more useful for advanced developers who thoroughly know the technology and who must deal directly with SOAP messages.

Using JAXM for scenarios that require passing XML documents may be a good choice. Note, though, that JAXM is optional in the J2EE 1.4 platform. As a result, a service developed with JAXM may not be portable. When you control both endpoints of a Web service, it may make more sense to consider using JAXM.

