SOAP






In large part, XML-RPC was invented by a single person who really didn't know a lot about XML. Consequently he made many very questionable choices; and because XML-RPC did not go through any standardization process, there was nobody to fix his mistakes. For example, in XML-RPC the string type is defined as an "ASCII string". Now frankly, this is just plain dumb, as well as not a little ethnocentric. XML documents are Unicode, not ASCII. Modern programming languages like Java can handle Unicode without any trouble. Indeed a language that can't process Unicode really isn't suitable for processing XML. There is no good reason to limit XML-RPC strings to ASCII. I certainly wouldn't say you have to use non-ASCII characters in your XML-RPC documents; but if you want to use them, they should certainly be allowed. However, the inventor of XML-RPC also happened to be the vendor of an ASCII-limited database, so he inserted the ASCII-only constraint into XML-RPC rather than upgrade his database to support Unicode.

There are a lot of other issues like that with XML-RPC, some equally obvious, some more subtle. Nonetheless, clearly XML-RPC was a good idea in principle if not in execution. Consequently work began on a more serious effort to enable remote procedure calls by passing XML documents over HTTP. This effort is known as the Simple Object Access Protocol, or just SOAP. Whereas XML-RPC was a quick hack by one developer, SOAP was developed by a committee of XML experts from various companies including IBM and Microsoft.

You've undoubtedly heard the old saw about a camel being a horse designed by committee. The fact is, a camel is actually superbly adapted to its environment. SOAP is a much more robust protocol than XML-RPC. It is much better designed from an XML standpoint as well. It takes advantage of numerous features of XML, such as attributes, Unicode, and namespaces that XML-RPC either ignores or actively opposes. XML-RPC is adequate for simple tasks, but if you get serious with it you rapidly hit a wall. SOAP can take you a lot further. Although there are some basic services available using XML-RPC, the future clearly lies with SOAP.

The biggest conceptual difference between SOAP and XML-RPC is that XML-RPC exchanges a limited number of parameters of six fixed types, plus structs and arrays. But SOAP allows you to send the server arbitrary XML elements—a much more flexible approach.

A SOAP Example

Let's investigate how the stock quote example would likely be implemented in SOAP. Encoded as a SOAP document, the request looks quite different, but the same information is present, as demonstrated in Figure 15.

15 A SOAP Document That Requests the Current Stock Price of Red Hat
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <getQuote
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>RHAT</symbol>
    </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The most obvious difference between this document and the XML-RPC equivalent in Figure is the use of namespaces. Namespaces allow the method request to be an arbitrary XML element. This goes way beyond merely passing a method name and some argument values. SOAP permits much more complex XML messages than does XML-RPC.

The server's response is equally flexible, as Figure 16 demonstrates.

16 A SOAP Response
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Price>4.12</Price>
    </Quote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

These two examples are minimal SOAP documents. The root element of every SOAP document is Envelope, which must be in the http://schemas.xmlsoap.org/soap/envelope/ namespace, at least in SOAP 1.1. (The URL will change in SOAP 1.2.) Normally a prefix is used, and as always you can pick any prefix as long as the URI stays the same. In this chapter, I always assume that the prefix SOAP-ENV is mapped to that namespace URI. (This is the prefix that the SOAP 1.1 specification uses.)

Each SOAP-ENV:Envelope element contains exactly one SOAP-ENV:Body element. The content of this element is one or more XML elements specific to the service. These examples use Quote, getQuote, and Price elements in the http://namespaces.cafeconleche.org/xmljava/ch2/ namespace. Other services will use other elements from other namespaces. It's also permissible to use elements from no namespace at all, although using namespaces is highly recommended.
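The envelope structure just described can be assembled programmatically as well as typed by hand. What follows is only a minimal sketch using the standard DOM API; the class name QuoteRequestBuilder and the method name build are my own inventions for illustration, not part of SOAP or of any library.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds Envelope > Body > getQuote > symbol.
final class QuoteRequestBuilder {

  static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
  static final String STOCK = "http://namespaces.cafeconleche.org/xmljava/ch2/";

  static Document build(String symbol) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder().newDocument();

      // The envelope and body live in the SOAP 1.1 envelope namespace.
      Element envelope = doc.createElementNS(SOAP_ENV, "SOAP-ENV:Envelope");
      Element body = doc.createElementNS(SOAP_ENV, "SOAP-ENV:Body");
      // The payload lives in the service's own namespace, unprefixed.
      Element getQuote = doc.createElementNS(STOCK, "getQuote");
      Element symbolElement = doc.createElementNS(STOCK, "symbol");
      symbolElement.setTextContent(symbol);

      doc.appendChild(envelope);
      envelope.appendChild(body);
      body.appendChild(getQuote);
      getQuote.appendChild(symbolElement);
      return doc;
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException("no DOM implementation available", e);
    }
  }
}
```

The prefix SOAP-ENV is a free choice here, just as it is in the documents above; only the namespace URI matters to the receiver.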

Posting SOAP Documents

Currently, most SOAP messages are passed over HTTP using POST, just like XML-RPC messages. Other transport protocols such as SMTP, BEEP, and Jabber can be supported as well. However, there are a couple of crucial differences in the HTTP headers used for SOAP:

  • The HTTP request header must contain a SOAPAction field.

  • If the SOAP request fails, the server should return an HTTP 500 Internal Server Error rather than 200 OK.

The SOAPAction field alerts web servers and firewalls that they're dealing with a SOAP message. This enables firewalls to filter SOAP requests more easily without looking at the request body. The value of the SOAPAction field is a double-quoted URI that somehow indicates the intent of the message. For instance, if Figure 15 were POSTed to a servlet running on www.ibiblio.org under the control of the user elharo, then you might use the SOAPAction http://www.ibiblio.org/#elharo to indicate to the server and firewall which user was responsible for processing this request. This is shown in Figure 17.

17 A SOAP Request for the Current Stock Price of Red Hat
POST /xml/cgi-bin/SOAPHandler HTTP/1.1
Content-Type: text/xml; charset="utf-8"
Content-Length: 267
SOAPAction: "http://www.ibiblio.org/#elharo"

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <getQuote
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>RHAT</symbol>
    </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Conceptually, SOAPAction URIs are very similar to namespace URIs because they aren't meant to be resolved. They simply provide a convenient way of assigning unique identifiers to certain classes of SOAP messages. There's no particular standard for choosing them. You might use the full absolute URL that receives the SOAP request, or you might use some previously agreed-upon URI. You can even use nothing at all. But the SOAPAction header must be present in order for the request to be identified as a SOAP request.
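In Java, the two HTTP details discussed here (the POST method and the quoted SOAPAction field) might be set up roughly as follows. This is a sketch under assumptions, not a finished client: SoapPoster, prepare, and send are hypothetical names of my own, and the endpoint URL is whatever your service happens to use.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for POSTing a SOAP 1.1 envelope over HTTP.
final class SoapPoster {

  // Configures the request; nothing is sent over the network yet.
  static HttpURLConnection prepare(String endpoint, String soapAction) {
    try {
      HttpURLConnection conn =
          (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
      conn.setRequestMethod("POST");
      conn.setDoOutput(true);
      conn.setRequestProperty("Content-Type", "text/xml; charset=\"utf-8\"");
      // The field value is a double-quoted URI; pass "" for an empty action.
      conn.setRequestProperty("SOAPAction", "\"" + soapAction + "\"");
      return conn;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Sends the envelope and returns the HTTP status code of the response.
  static int send(HttpURLConnection conn, String envelope) {
    byte[] body = envelope.getBytes(StandardCharsets.UTF_8);
    conn.setFixedLengthStreamingMode(body.length); // supplies Content-Length
    try {
      try (OutputStream out = conn.getOutputStream()) {
        out.write(body);
      }
      return conn.getResponseCode();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A caller would typically treat a 200 status as success and anything else as a reason to inspect the response more closely before trusting it.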

The server will normally send the response back to the client over the same socket the client used to send the request and then close the connection. Like any other HTTP response, a SOAP response begins with an HTTP return code, message, and header. Assuming the request was successful, the response code is 200 OK. Unlike the request, the response does not use any special header fields beyond those used by regular web browsers and servers. Figure 18 demonstrates.

18 A SOAP Document That Returns the Current Stock Price of Red Hat
HTTP/1.0 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: 260

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Body>
    <Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Price>4.12</Price>
    </Quote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Faults

It's a fact of life that requests fail. They may fail for reasons beyond the control of the SOAP provider. For example, you may launch your SOAP request into the ether just before the phone company severs the wire connecting you to the Internet while hooking up your neighbor's new DSL line. That sort of failure would make itself known at a lower layer, below XML and SOAP, probably as a SocketException if you were working in Java.

It's also possible for your request to arrive successfully at the server, only to find that the server doesn't recognize the URL you're posting to. In fact, the server might not even be configured to support SOAP requests. This sort of error would not throw an exception, but it would return a 404 Not Found page rather than the expected SOAP response. Your code should be prepared to handle such events.

Finally, it's also possible for the SOAP responder itself to be reached and correctly invoked, but then be unable to process the request. This may occur because the request contained bad data (for example, a symbol for a stock that doesn't exist) or simply because the server code is buggy and encountered a problem. In these cases the SOAP server itself is responsible for producing the correct error message. This error message is a SOAP response with a SOAP-ENV:Envelope and a SOAP-ENV:Body, just like a normal response. However, the SOAP-ENV:Body must contain exactly one SOAP-ENV:Fault element and must not contain anything else.

The SOAP-ENV:Fault element contains up to four child elements:

faultcode

A faultcode element contains a qualified name such as SOAP-ENV:VersionMismatch to identify the fault.

faultstring

A faultstring element contains a plain text message to describe the fault for human readers.

faultactor

A faultactor element contains a URI to identify the node that generated the fault. It's used when a SOAP request is passed through a chain of handlers. This element is optional.

detail

A detail element is used when the fault is specifically related to the body of the request (for example, an unrecognized stock symbol) as opposed to the header. It contains child elements that describe the fault. This element is present only if the fault was related to the SOAP body.

Caution

These four child elements of SOAP-ENV:Fault are not namespace qualified, which is a little surprising. They are not in the http://schemas.xmlsoap.org/soap/envelope/ namespace. They are not in some other namespace. They are in no namespace at all.


SOAP defines four specific fault codes in the http://schemas.xmlsoap.org/soap/envelope/ namespace to indicate common conditions in a generic way. These are

SOAP-ENV:VersionMismatch

The namespace of the SOAP-ENV:Envelope element indicates that this message is intended for a server implementing a different version of the SOAP protocol; for example, a SOAP 1.2 message has been sent to a SOAP 1.1 server.

SOAP-ENV:MustUnderstand

There's something in the header that the message says the server must understand before acting, but the server does not recognize it. (I'll talk about this soon in the section SOAP Headers.)

SOAP-ENV:Client

The client sent a message that is somehow defective. Perhaps it omitted a key piece of information the server needs. For example, the getQuote message was sent and understood, but the getQuote element did not have a symbol child. The client is to blame for the problem.

SOAP-ENV:Server

The client sent a correctly formed message with all the necessary information, but some error prevented the server from processing it. For example, the server may have needed to connect to a remote database to retrieve some information, but the database server had crashed. The server is to blame for the problem.

Figure 19 is a fault that might be returned in response to a request for the nonexistent stock ABCD. The faultcode element is set to SOAP-ENV:Client to indicate that the client's request was incorrect. The faultstring element contains a brief string of unmarked-up text that describes the problem to a human reader. The detail content includes elements in the same namespace as the successful response, http://namespaces.cafeconleche.org/xmljava/ch2/. Because this request was processed by a single node, no faultactor element is necessary.

19 A SOAP Fault Response
HTTP/1.0 500 Internal Server Error
Content-Type: text/xml; charset="utf-8"
Content-Length: 498

<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:stock="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Client</faultcode>
      <faultstring>
        There is no stock with the symbol ABCD.
      </faultstring>
      <detail>
        <stock:InvalidSymbol>ABCD</stock:InvalidSymbol>
      </detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
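A client that receives a response like this one needs to notice the SOAP-ENV:Fault element and pull out its unqualified children. The following is one possible sketch using a namespace-aware DOM parser; FaultReader and readFault are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that extracts the fault code and fault string.
final class FaultReader {

  static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

  // Returns {faultcode, faultstring}, or null if there is no Fault element.
  static String[] readFault(String responseXml) {
    Document doc = parse(responseXml);
    NodeList faults = doc.getElementsByTagNameNS(SOAP_ENV, "Fault");
    if (faults.getLength() == 0) return null;
    Element fault = (Element) faults.item(0);
    return new String[] {child(fault, "faultcode"), child(fault, "faultstring")};
  }

  // The children of Fault are in no namespace, so the namespace must be null.
  private static String child(Element parent, String name) {
    for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.ELEMENT_NODE
          && n.getNamespaceURI() == null
          && name.equals(n.getLocalName())) {
        return n.getTextContent().trim();
      }
    }
    return null;
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse XML", e);
    }
  }
}
```

Note that the fault code comes back as the string SOAP-ENV:Client. Resolving its prefix to a namespace URI is a further step that this sketch leaves out.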

Encoding Styles

The information encoded in the example SOAP documents to this point has been nothing more than Unicode text strings. When you want to encode other types, such as integers, arrays, and objects, you need to specify how the characters that make up the XML document should be deserialized into the local platform's understanding of those types. For example, if a Java program encounters the element <Price>4.12</Price>, should it convert it into a double? a float? a java.lang.String? a java.math.BigDecimal? a custom Price class? something else?

Any element in a SOAP document can have a SOAP-ENV:encodingStyle attribute whose value is a URI pointing to some kind of schema that specifies what types are assigned to which elements. The most common language to use for this schema is the W3C XML Schema Language. However, other schema languages such as RELAX NG are also allowed.

Figure 20 uses the SOAP-ENV:encodingStyle attribute on the getQuote element to point to a schema at the relative URL trading.xsd. This schema, shown in Figure 21, defines the symbol element as having the custom type StockSymbol. It is used only for assigning types. Although it is not used for validation, with a little extra work it could be.

20 A SOAP Document That Specifies the Encoding Style
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <getQuote
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"
              SOAP-ENV:encodingStyle="trading.xsd">
      <symbol>RHAT</symbol>
    </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

21 A Schema That Assigns Type to Elements in the http://namespaces.cafeconleche.org/xmljava/ch2/ Namespace
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://namespaces.cafeconleche.org/xmljava/ch2/"
xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"
elementFormDefault="qualified">

  <xsd:element name="getQuote">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="symbol" type="StockSymbol"
                     maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:simpleType name="StockSymbol">
    <xsd:restriction base="xsd:string">
      <!-- two to six upper case letters -->
      <xsd:pattern value="[A-Z][A-Z][A-Z]?[A-Z]?[A-Z]?[A-Z]?"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

You can place the SOAP-ENV:encodingStyle attribute on any element in the document. It applies to that element and its descendants, and it overrides the schemas declared on any ancestor. It is common to place it on the root SOAP-ENV:Envelope element.
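The scoping rule amounts to "the nearest declaration wins." One way to express that in Java is to walk from an element up through its ancestors until an encodingStyle attribute turns up. The class and method names below (EncodingStyles, inEffectFor) are hypothetical, and the code is a sketch rather than anything the SOAP specification mandates.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper that finds the encoding style in scope for an element.
final class EncodingStyles {

  static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

  // Checks the element itself first, then each ancestor in turn.
  static String inEffectFor(Element element) {
    for (Node n = element; n instanceof Element; n = n.getParentNode()) {
      Element e = (Element) n;
      if (e.hasAttributeNS(SOAP_ENV, "encodingStyle")) {
        return e.getAttributeNS(SOAP_ENV, "encodingStyle");
      }
    }
    return null; // no encoding style is declared anywhere above
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse XML", e);
    }
  }
}
```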

SOAP singles out one encoding style for special treatment. If the SOAP-ENV:encodingStyle attribute has the value http://schemas.xmlsoap.org/soap/encoding/, then a predefined set of types is available that includes one element for each simple type defined in the W3C XML Schema Language and listed in Figure. For example, assuming that the SOAP-ENC prefix is bound to the http://schemas.xmlsoap.org/soap/encoding/ URI (not the same as the namespace URI or the prefix for the SOAP envelope), then an int can be placed in a SOAP-ENC:int element in the following manner:

<SOAP-ENC:int>12</SOAP-ENC:int> 

The following table gives the complete list of types and their normal Java semantics, although this really just mirrors Figure. In many cases, Java does not have a type that exactly matches one of the derived types, so a broader type is used instead. For example, Java does not have an unsigned integer type, but all values of type xsd:unsignedInt can fit into a Java long. Java does not have a PositiveInteger class, but all xsd:positiveIntegers can be represented by a java.math.BigInteger. In some cases the mapping is obvious. For example, an xsd:int is exactly a Java int, and an xsd:double is as close to a Java double as it's possible for a base-10 string to be. In others, different programs may use different Java types and objects to deserialize the same values. An xsd:anyURI, for instance, could reasonably be converted to a java.net.URL, a java.lang.String, or some custom URI class.

Simple Value Elements Defined in SOAP

SOAP Type                    Java Type
SOAP-ENC:string              java.lang.String
SOAP-ENC:boolean             boolean
SOAP-ENC:decimal             java.math.BigDecimal
SOAP-ENC:float               float
SOAP-ENC:double              double
SOAP-ENC:integer             java.math.BigInteger
SOAP-ENC:positiveInteger     java.math.BigInteger
SOAP-ENC:nonPositiveInteger  java.math.BigInteger
SOAP-ENC:negativeInteger     java.math.BigInteger
SOAP-ENC:nonNegativeInteger  java.math.BigInteger
SOAP-ENC:long                long
SOAP-ENC:int                 int
SOAP-ENC:short               short
SOAP-ENC:byte                byte
SOAP-ENC:unsignedLong        java.math.BigInteger
SOAP-ENC:unsignedInt         long
SOAP-ENC:unsignedShort       int
SOAP-ENC:unsignedByte        int
SOAP-ENC:duration            custom class
SOAP-ENC:dateTime            java.util.Date
SOAP-ENC:time                java.sql.Time
SOAP-ENC:date                java.sql.Date
SOAP-ENC:gYearMonth          custom class
SOAP-ENC:gYear               custom class, int, or java.math.BigInteger
SOAP-ENC:gMonthDay           custom class
SOAP-ENC:gDay                custom class, or int
SOAP-ENC:gMonth              custom class, or int
SOAP-ENC:hexBinary           byte[]
SOAP-ENC:base64Binary        byte[]
SOAP-ENC:anyURI              java.net.URL, java.lang.String, or a custom class
SOAP-ENC:QName               java.lang.String, or a custom class
SOAP-ENC:NOTATION            org.w3c.dom.Notation
SOAP-ENC:normalizedString    java.lang.String
SOAP-ENC:token               java.lang.String
SOAP-ENC:language            java.lang.String, or a custom class
SOAP-ENC:NMTOKEN             java.lang.String, or a custom class
SOAP-ENC:NMTOKENS            java.lang.String, or a custom class
SOAP-ENC:Name                java.lang.String
SOAP-ENC:NCName              java.lang.String
SOAP-ENC:ID                  java.lang.String
SOAP-ENC:IDREF               java.lang.String
SOAP-ENC:IDREFS              an array or list of java.lang.Strings, or a custom class
SOAP-ENC:ENTITY              org.w3c.dom.Entity
SOAP-ENC:ENTITIES            an org.w3c.dom.NodeList containing org.w3c.dom.Entity objects

These mappings are not written in stone. Some of the XML-like types such as SOAP-ENC:ENTITY and SOAP-ENC:IDREFS are particularly uncertain, and may be implemented in different ways in different environments. However, this should give you a fairly good idea of the sorts of possible mappings between SOAP types and Java types. In addition to this list of simple types, the http://schemas.xmlsoap.org/soap/encoding/ encoding defines concepts of structs, references, byte arrays, and arrays.
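To make the "broader type" idea concrete, here is a small sketch of how two of the unsigned or unbounded types might be read into Java. The class SimpleTypes and its method names are hypothetical; a real toolkit would cover far more of the table and the finer points of each lexical form.

```java
import java.math.BigInteger;

// Hypothetical converters from lexical values to broader Java types.
final class SimpleTypes {

  // xsd:unsignedInt runs from 0 to 4294967295: too wide for an int,
  // but comfortable in a long.
  static long parseUnsignedInt(String lexical) {
    long value = Long.parseLong(lexical.trim());
    if (value < 0 || value > 4294967295L) {
      throw new NumberFormatException("not an unsignedInt: " + lexical);
    }
    return value;
  }

  // xsd:positiveInteger has no upper bound, so BigInteger is the safe choice.
  static BigInteger parsePositiveInteger(String lexical) {
    BigInteger value = new BigInteger(lexical.trim());
    if (value.signum() <= 0) {
      throw new NumberFormatException("not a positiveInteger: " + lexical);
    }
    return value;
  }
}
```

The range checks matter because the broader Java type accepts values the schema type forbids.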

Structs

A struct is simply an element that contains child elements but has no mixed content. For example, following is a Quote struct that contains Symbol and Price members:

<Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"> 
  <Symbol>RHAT</Symbol>
  <Price>4.12</Price>
</Quote>

In Java terms, by using the http://schemas.xmlsoap.org/soap/encoding/ encoding style, you're indicating that you want this element to be deserialized into an object of type Quote, which has two properties named Symbol and Price. In other words, the class definition looks something like this:

public class Quote {

  private String symbol;
  private double price;

  public String getSymbol() { return this.symbol; }
  public double getPrice()  { return this.price; }

}

You may or may not have such a class in your system. If the SOAP request began its life as a Quote object that was subsequently converted to XML, transmitted across the Internet, and then turned back into a Java object, then perhaps you do have such a class. But what if the object began its life as a C struct or a C++ object? Or what if it was never anything except an XML document? In these cases, there may not be a convenient Quote class into which you can deserialize this compound object. Another possibility is to decode the name-value pairs into some form of Hashtable or HashMap. The names of the fields would be the keys, and the values of the fields would be the values.

What this encoding really tells you is roughly how the author intended this document to be handled. However, if you have some other way of making sense of this data, you are free to use it. You are not limited to any one deserialization form.
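The map-based alternative is easy to sketch. The following code, with the hypothetical names StructDecoder and decode, reads each member element of a struct into a name-value pair and leaves all values as strings; converting them to doubles or other types is left to the caller.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper that decodes a struct into name-value pairs.
final class StructDecoder {

  // Keys are the local names of the members; values are their text content.
  static Map<String, String> decode(String structXml) {
    Element struct = parse(structXml).getDocumentElement();
    Map<String, String> members = new LinkedHashMap<>();
    for (Node n = struct.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.ELEMENT_NODE) {
        members.put(n.getLocalName(), n.getTextContent());
      }
    }
    return members;
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse XML", e);
    }
  }
}
```

A LinkedHashMap is used here only so that the members come back in document order; a plain HashMap would serve equally well if order doesn't matter.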

References

A reference type uses an href attribute to point to a value stored elsewhere in the SOAP request. This mirrors the structure when two objects must both contain the same object. For example, consider this trade request:

<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"> 
  <Symbol>RHAT</Symbol>
  <Price>4.12</Price>
  <Account>777-7777</Account>
</Bid>
<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>YHOO</Symbol>
  <Price>4.12</Price>
  <Account>777-7777</Account>
</Bid>

In both cases the account number is the same. Furthermore, it's not simply that the two numbers are equal: They indicate the same object. In Java terms it's the difference between the equals() method and the == operator. The first tests for equality, whereas the second tests for identity. If the local semantics demand that each Account element be deserialized as an Account object (perhaps with other fields filled in from a database rather than from the XML document), then you want some means of saying that this document should produce one Account object rather than two. This is done with a reference. Give the first Account element a unique id attribute, and use an href attribute in the second element to point to it, as shown here:

<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"> 
  <Symbol>RHAT</Symbol>
  <Price>4.12</Price>
  <Account id="a1">777-7777</Account>
</Bid>
<Bid xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <Symbol>YHOO</Symbol>
  <Price>4.12</Price>
  <Account href="#a1"/>
</Bid>

This document represents two Bid objects. Each has three properties: Symbol, Price, and Account. The symbols are completely different. The prices are equal but not identical; that is, one can change without changing the other. There are two separate prices here that coincidentally have the same value. The Accounts, on the other hand, are identical. There is only one account here, used in two different places.
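A deserializer can honor this identity by indexing every element that carries an id and following each href back to it. The sketch below does just that with DOM; ReferenceResolver and resolve are hypothetical names, and for simplicity the code handles only same-document references of the form #id.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that follows href="#id" references within a document.
final class ReferenceResolver {

  private final Map<String, Element> byId = new HashMap<>();

  // Indexes every element in the document that has an id attribute.
  ReferenceResolver(Document doc) {
    NodeList all = doc.getElementsByTagNameNS("*", "*");
    for (int i = 0; i < all.getLength(); i++) {
      Element e = (Element) all.item(i);
      if (e.hasAttribute("id")) byId.put(e.getAttribute("id"), e);
    }
  }

  // An element with href="#a1" resolves to the element whose id is a1;
  // any other element resolves to itself.
  Element resolve(Element e) {
    String href = e.getAttribute("href");
    if (!href.startsWith("#")) return e;
    Element target = byId.get(href.substring(1));
    if (target == null) {
      throw new IllegalArgumentException("dangling reference " + href);
    }
    return target;
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse XML", e);
    }
  }
}
```

Because both Account elements resolve to the same node, code that builds one Account object per resolved node ends up with a single shared account.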

Arrays

In Java, arrays are a funny kind of object, and in SOAP they are too. An array is represented as an element whose type is SOAP-ENC:Array. For example, this is an array of three numbers:

<Bid xsi:type="SOAP-ENC:Array"> 
  <Price>4.52</Price>
  <Price>0.35</Price>
  <Price>34.68</Price>
</Bid>

In an array, the names of the elements don't really mean anything—only the positions matter. If the name of the array doesn't matter either, you can use a SOAP-ENC:Array element instead. For example, this is an array of three doubles, with no extra information:

<SOAP-ENC:Array> 
  <SOAP-ENC:double>4.52</SOAP-ENC:double>
  <SOAP-ENC:double>0.35</SOAP-ENC:double>
  <SOAP-ENC:double>34.68</SOAP-ENC:double>
</SOAP-ENC:Array>

SOAP arrays are not as strongly typed as Java arrays are, at least by default. Whereas each array in Java must contain exclusively ints, or strings, or objects, a SOAP array can contain data of varying types. For example, the following array contains three items, each with a different name and type:

<Bid xsi:type="SOAP-ENC:Array"> 
  <Symbol  xsi:type="xsd:token">RHAT</Symbol>
  <Price   xsi:type="xsd:double">4.12</Price>
  <Account xsi:type="xsd:string">777-7777</Account>
</Bid>

Given this possibility, it can be difficult to decode a SOAP array into a Java array. The closest Java equivalent is an Object[] array. However, primitive types like double would need to be replaced by an instance of the matching type wrapper class, such as java.lang.Double instead. Another possibility is to use a java.util.Vector or java.util.ArrayList instead of a straight array, though this still doesn't remove the need for the type wrapper classes.

If you want to restrict the type of array components, you can add a SOAP-ENC:arrayType attribute to the array element. The value of this attribute is the type of the individual component followed by square brackets containing the length of the array. This is more similar to C's array declaration syntax than Java's. For example, this array must contain exactly three doubles:

<Bid xsi:type="SOAP-ENC:Array" SOAP-ENC:arrayType="xsd:double[3]">
  <Price>4.52</Price>
  <Price>0.35</Price>
  <Price>34.68</Price>
</Bid>

Any array component not specifically typed otherwise can be a struct. Furthermore, any array component can be another array. However, this does not produce a multidimensional array. Instead, multidimensional arrays are created by stringing together the values from the second row after the values from the first row, the values from the third row after the values from the second row, and so on. The SOAP-ENC:arrayType attribute gives the size of each dimension, rows first and then columns. For example, this is a three-row by two-column array of doubles:

<SOAP-ENC:Array SOAP-ENC:arrayType="xsd:double[3,2]"> 
   <SOAP-ENC:double>1.1</SOAP-ENC:double>
   <SOAP-ENC:double>1.2</SOAP-ENC:double>
   <SOAP-ENC:double>2.1</SOAP-ENC:double>
   <SOAP-ENC:double>2.2</SOAP-ENC:double>
   <SOAP-ENC:double>3.1</SOAP-ENC:double>
   <SOAP-ENC:double>3.2</SOAP-ENC:double>
 </SOAP-ENC:Array>

Although the XML representation is one-dimensional, the Java interpretation is two-dimensional. When deserialized, this forms the following Java array:

double[][] array = {
  {1.1, 1.2},
  {2.1, 2.2},
  {3.1, 3.2}
};
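The reshaping itself is simple index arithmetic. As a rough sketch, assuming the values have already been read out of the XML into a flat double[] in document order, the following hypothetical ArrayShapes class parses the dimensions from the arrayType value and fills in the rows.

```java
// Hypothetical helper that turns row-by-row values into a 2D array.
final class ArrayShapes {

  // Reads the dimensions out of a value such as "xsd:double[3,2]".
  static int[] dimensions(String arrayType) {
    int open = arrayType.lastIndexOf('[');
    int close = arrayType.lastIndexOf(']');
    String[] parts = arrayType.substring(open + 1, close).split(",");
    int[] dims = new int[parts.length];
    for (int i = 0; i < parts.length; i++) {
      dims[i] = Integer.parseInt(parts[i].trim());
    }
    return dims;
  }

  // Value i lands in row i / columns and column i % columns.
  static double[][] reshape(String arrayType, double[] flat) {
    int[] dims = dimensions(arrayType);
    if (dims.length != 2 || dims[0] * dims[1] != flat.length) {
      throw new IllegalArgumentException(
          arrayType + " does not describe " + flat.length + " values in 2D");
    }
    double[][] result = new double[dims[0]][dims[1]];
    for (int i = 0; i < flat.length; i++) {
      result[i / dims[1]][i % dims[1]] = flat[i];
    }
    return result;
  }
}
```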

In the interest of efficiency over potentially slow networks, SOAP allows partially transmitted and sparse arrays. A partially transmitted array (also known as a varying array) does not begin with position 0; it instead begins at a specified index. For example, it might have ten components indexed from 3 to 12 inclusive. In SOAP you indicate the position at which a partially transmitted array begins with a SOAP-ENC:offset attribute. The value of this attribute is the index of the first element in the array enclosed in square brackets. For example, the following array begins at 3:

<SOAP-ENC:Array SOAP-ENC:offset="[3]"> 
  <SOAP-ENC:string>Component 3</SOAP-ENC:string>
  <SOAP-ENC:string>Component 4</SOAP-ENC:string>
  <SOAP-ENC:string>Component 5</SOAP-ENC:string>
  <SOAP-ENC:string>Component 6</SOAP-ENC:string>
  <SOAP-ENC:string>Component 7</SOAP-ENC:string>
  <SOAP-ENC:string>Component 8</SOAP-ENC:string>
  <SOAP-ENC:string>...</SOAP-ENC:string>
</SOAP-ENC:Array>

Java doesn't have such arrays, although Pascal and some other languages do. In Java you would likely deserialize such an array by putting null values or zeroes in the places before the beginning of the array.
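For instance, given the offset attribute value and the components that were actually transmitted, a Java program might rebuild the array along these lines. PartialArrays and expand are hypothetical names, and the sketch assumes string components and a one-dimensional offset such as [3].

```java
import java.util.List;

// Hypothetical helper that rebuilds a partially transmitted array.
final class PartialArrays {

  // Slots before the offset are left null; transmitted values follow.
  static String[] expand(String offset, List<String> transmitted) {
    int start =
        Integer.parseInt(offset.substring(1, offset.length() - 1).trim());
    String[] result = new String[start + transmitted.size()];
    for (int i = 0; i < transmitted.size(); i++) {
      result[start + i] = transmitted.get(i);
    }
    return result;
  }
}
```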

In a sparse array, a very large percentage of the components are 0 or null. In SOAP, a sparse array passes only the non-zero, non-null components; when the array is deserialized, the rest are filled in with zeroes or nulls. The number of elements in a sparse array must be specified by a SOAP-ENC:arrayType attribute. The position of each element that is provided is given by a SOAP-ENC:position attribute. For example, following is a ten-element array that provides only the components at positions 2, 3, and 5:

<SOAP-ENC:Array SOAP-ENC:arrayType="xsd:string[10]"> 
  <SOAP-ENC:string SOAP-ENC:position="[2]">
    2nd component
  </SOAP-ENC:string>
  <SOAP-ENC:string SOAP-ENC:position="[3]">
    3rd component
  </SOAP-ENC:string>
  <SOAP-ENC:string SOAP-ENC:position="[5]">
    5th component
  </SOAP-ENC:string>
</SOAP-ENC:Array>

The equivalent Java code looks like this:

String[] array = new String[10]; 
array[2] = "\n     2nd component\n    ";
array[3] = "\n     3rd component\n    ";
array[5] = "\n     5th component\n    ";
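
Rather than writing those assignments by hand, a program could derive them from the XML. Here is a sketch that does so with DOM; SparseArrayDecoder and decode are hypothetical names, and the code assumes a one-dimensional array of strings in which every component carries a SOAP-ENC:position attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper that decodes a sparse SOAP-ENC:Array of strings.
final class SparseArrayDecoder {

  static final String SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/";

  static String[] decode(String arrayXml) {
    Element array = parse(arrayXml).getDocumentElement();
    // arrayType looks like "xsd:string[10]"; the brackets hold the length.
    String type = array.getAttributeNS(SOAP_ENC, "arrayType");
    String[] result = new String[index(type.substring(type.lastIndexOf('[')))];
    for (Node n = array.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() != Node.ELEMENT_NODE) continue;
      String position = ((Element) n).getAttributeNS(SOAP_ENC, "position");
      result[index(position)] = n.getTextContent();
    }
    return result; // positions never mentioned remain null
  }

  // Turns a bracketed index such as "[5]" into the int 5.
  private static int index(String bracketed) {
    return Integer.parseInt(
        bracketed.substring(1, bracketed.length() - 1).trim());
  }

  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse XML", e);
    }
  }
}
```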

Byte Arrays

A byte array is just a string that somehow encodes binary data. The most common such encoding is base64. A schema or an xsi:type attribute is needed to identify the encoding. For example, the following is a base64 encoded byte array that provides an SHA-1 based digital signature for a document. The value shown is 40 bytes long, which becomes 56 characters when translated to base64.

<SignatureValue> 
AgGOvkMdqdKT7QyMuXPsuomkOqqEhGukKkj4Em7OKKQxYzheuseS8Q==
</SignatureValue>

In Java this would normally be deserialized into a byte array.
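With the java.util.Base64 class, the decoding might look like the following sketch. The class name SignatureBytes is hypothetical. The MIME decoder is used because it skips the line breaks that surround, or wrap, the element content.

```java
import java.util.Base64;

// Hypothetical helper that decodes base64 element content into bytes.
final class SignatureBytes {

  // Every four base64 characters carry three bytes; "=" marks padding.
  static byte[] decode(String elementContent) {
    return Base64.getMimeDecoder().decode(elementContent.trim());
  }
}
```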

SOAP Headers

In addition to the body of the request, each SOAP document can contain a header. This is not an HTTP header; rather, it is an additional child of the SOAP-ENV:Envelope element, specifically a SOAP-ENV:Header element. If a SOAP request is an envelope, then the body is the letter inside the envelope, and the header is the writing on the outside of the envelope that tells the post office where to deliver it, where to send it back if it can't be delivered, and how much you paid to get the letter delivered. In other words, a SOAP header provides meta-information about the request.

The sort of meta-information provided varies from request to request and from SOAP application to SOAP application. Some things that can be exchanged in headers include

  • Protocols the server must understand to process the request

  • A digital signature for the body of the message

  • A schema for the XML application used in the body

  • Credit card information to pay for the processing

  • A public key to be used to encrypt the response

Figure 22 shows a buy order in which the header carries credit card information to pay for the request. In this case, the syntax used for the Payment element is specific to the XML application used in the body and even comes from the same namespace.

22 A SOAP Request with Credit Card Information in the Header
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy id="buy1"
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol>
      <shares>100</shares>
      <account>777-7777</account>
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Like the SOAP body, the SOAP header can use any XML application it cares to use to encode the data. It is not limited to a fixed vocabulary. Indeed it can use more than one such vocabulary. The SOAP-ENV:Header element can contain multiple child elements from a hodgepodge of different namespaces. Each one of these elements, called a header entry, may be treated independently of the other header entries. Figure 23 adds a second header entry containing a digital signature for the request body. The syntax used for the Signature element is defined by XML-Signature Syntax and Processing [http://www.w3.org/TR/xmldsig-core/].

23 A SOAP Request with Two Header Entries
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
    <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
        <CanonicalizationMethod
          Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
        <SignatureMethod
          Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1" />
        <Reference URI="file://J/xss4j/requestbody.xml">
          <DigestMethod
            Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
          <DigestValue>3UxhLrdPpK3faRms5FOS6kAoeZI=</DigestValue>
        </Reference>
      </SignedInfo>
      <SignatureValue>
        ZeW/PYGT6A9iOqOrbMmeKOq1aQk+ars/QOC95Bj0xYrNAnLo/WK7+g==
      </SignatureValue>
    </Signature>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy id="buy1"
         xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol>
      <shares>100</shares>
      <account>777-7777</account>
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
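
To make the idea of independent header entries concrete, here is a minimal Java sketch that lists each header entry in a message together with its namespace URI. It uses only the DOM classes bundled with the JDK. The class name HeaderEntryLister and its method are hypothetical names chosen for illustration, not part of any SOAP toolkit, and the sketch assumes the whole message fits comfortably in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is not part of any SOAP library.
class HeaderEntryLister {

    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Returns one "{namespaceURI}localName" string per header entry. */
    static List<String> listHeaderEntries(String soapMessage) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to see namespace URIs
            doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapMessage)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse SOAP message", e);
        }

        List<String> entries = new ArrayList<>();
        Node envelope = doc.getDocumentElement();
        for (Node child = envelope.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (!isElement(child, SOAP_ENV, "Header")) continue;
            // Every child element of SOAP-ENV:Header is one header entry.
            for (Node entry = child.getFirstChild(); entry != null;
                    entry = entry.getNextSibling()) {
                if (entry.getNodeType() == Node.ELEMENT_NODE) {
                    entries.add("{" + entry.getNamespaceURI() + "}"
                            + entry.getLocalName());
                }
            }
        }
        return entries;
    }

    private static boolean isElement(Node node, String namespace, String name) {
        return node.getNodeType() == Node.ELEMENT_NODE
                && namespace.equals(node.getNamespaceURI())
                && name.equals(node.getLocalName());
    }

    public static void main(String[] args) {
        // A trimmed-down version of the request with two header entries.
        String message =
              "<SOAP-ENV:Envelope xmlns:SOAP-ENV='" + SOAP_ENV + "'>"
            + "<SOAP-ENV:Header>"
            + "<Payment xmlns='http://namespaces.cafeconleche.org/xmljava/ch2/'/>"
            + "<Signature xmlns='http://www.w3.org/2000/09/xmldsig#'/>"
            + "</SOAP-ENV:Header>"
            + "<SOAP-ENV:Body/>"
            + "</SOAP-ENV:Envelope>";
        listHeaderEntries(message).forEach(System.out::println);
    }
}
```

A real processor would presumably go one step further and hand each entry to whatever code understands its namespace, ignoring or rejecting the rest.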

The mustUnderstand Attribute

An individual SOAP document tends to be tied pretty closely to the service it plans to talk to. You can't send a request for a stock quote to a server designed to provide basketball scores and expect to get sensible results back. To indicate what is required of a server, a SOAP request can place a SOAP-ENV:mustUnderstand attribute on any header entry. If this attribute has the value 1, then the service that receives the SOAP request must process the header entry. If it cannot, either because it does not understand the header entry or for some other reason, then it must fail the request and return a fault, which in SOAP 1.1 carries the fault code SOAP-ENV:MustUnderstand. If the SOAP-ENV:mustUnderstand attribute has the value 0, then processing the header entry is optional. The service should do so if it can, but failing to do so does not automatically lead to a fault. The default is 0.

The following figure is a BUY order that requires the receiver to understand the Payment header entry. If the server does not recognize that header entry, it must not attempt to fulfill the order.

24 A SOAP Request with a mustUnderstand Attribute
<?xml version="1.0"?>
<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >
  <SOAP-ENV:Header>
    <Payment xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/"
             SOAP-ENV:mustUnderstand="1">
      <Name>Elliotte Harold</Name>
      <Issuer>VISA</Issuer>
      <Number>5125456787651230</Number>
      <Expires>2005-12</Expires>
    </Payment>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <buy xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
      <symbol>MRBA</symbol>
      <shares>100</shares>
      <account>777-7777</account>
    </buy>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
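
On the receiving side, the mustUnderstand rule boils down to a check made before any work is done. The following Java sketch shows one way a server might find the mandatory header entries it cannot handle. It is only an illustration under stated assumptions: the class MustUnderstandChecker is hypothetical, and "understanding" a header entry is reduced to recognizing its namespace URI, which a real service would likely refine.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, written only to illustrate the rule.
class MustUnderstandChecker {

    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    /**
     * Returns the local names of header entries marked mustUnderstand="1"
     * whose namespace is not in understoodNamespaces. A non-empty result
     * means the request must be failed with a fault.
     */
    static List<String> findUnsupportedMandatoryEntries(
            String soapMessage, Set<String> understoodNamespaces) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapMessage)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse SOAP message", e);
        }

        List<String> unsupported = new ArrayList<>();
        for (Node child = doc.getDocumentElement().getFirstChild();
                child != null; child = child.getNextSibling()) {
            boolean isHeader = child.getNodeType() == Node.ELEMENT_NODE
                    && SOAP_ENV.equals(child.getNamespaceURI())
                    && "Header".equals(child.getLocalName());
            if (!isHeader) continue;
            for (Node n = child.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) continue;
                Element entry = (Element) n;
                // getAttributeNS returns "" when the attribute is absent,
                // which is treated the same as the default value 0.
                boolean mandatory = "1".equals(
                        entry.getAttributeNS(SOAP_ENV, "mustUnderstand"));
                if (mandatory
                        && !understoodNamespaces.contains(entry.getNamespaceURI())) {
                    unsupported.add(entry.getLocalName());
                }
            }
        }
        return unsupported;
    }

    public static void main(String[] args) {
        String message =
              "<SOAP-ENV:Envelope xmlns:SOAP-ENV='" + SOAP_ENV + "'>"
            + "<SOAP-ENV:Header>"
            + "<Payment xmlns='http://namespaces.cafeconleche.org/xmljava/ch2/'"
            + " SOAP-ENV:mustUnderstand='1'/>"
            + "</SOAP-ENV:Header>"
            + "<SOAP-ENV:Body/>"
            + "</SOAP-ENV:Envelope>";
        // This imaginary server understands no header namespaces at all.
        List<String> problems = findUnsupportedMandatoryEntries(message, Set.of());
        System.out.println(problems.isEmpty()
                ? "OK to process" : "Must fault; not understood: " + problems);
    }
}
```

The important design point is the order of operations: the check runs over the whole header before the body is touched, so an order is never half-fulfilled and then faulted.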

The actor Attribute

Although this book mostly focuses on SOAP messages that go straight from the sender to the receiver that will process them, not all systems are this simple. A SOAP message can be forwarded from one SOAP processor to the next until it reaches its ultimate destination. By default, header entries are read only by the last processor. However, you can indicate that a header entry is intended for an earlier processor in the chain by placing a SOAP-ENV:actor attribute on the header entry element. The value of this attribute is a URI identifying the processor for which the header entry is intended.

When a processor receives a SOAP message, it searches the header for header entries addressed to it. It acts on these header entries and deletes them. It may also add new header entries intended for processors later in the chain. Then it forwards the message to the next processor in the chain.

All processors except the final one act only on header entries that are specifically addressed to them. In addition, the special URI http://schemas.xmlsoap.org/soap/actor/next indicates that the header entry should be processed and deleted by the first processor that sees it. The last processor in the chain processes any header entries that are not addressed to any processor in particular, as well as any header entries that are addressed specifically to it.
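
The claim-and-delete step performed by an intermediate processor can be sketched in a few lines of Java. This is a rough illustration rather than a complete intermediary: the class ActorProcessor and the actor URI http://example.com/gateway are made-up names, and the sketch only removes the entries it claims without doing anything useful with them first.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical intermediary logic; names are for illustration only.
class ActorProcessor {

    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String NEXT_ACTOR = "http://schemas.xmlsoap.org/soap/actor/next";

    static Document parse(String soapMessage) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapMessage)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse SOAP message", e);
        }
    }

    /** Returns a snapshot of the header entries currently in the message. */
    static List<Element> headerEntries(Document message) {
        List<Element> entries = new ArrayList<>();
        for (Node child = message.getDocumentElement().getFirstChild();
                child != null; child = child.getNextSibling()) {
            boolean isHeader = child.getNodeType() == Node.ELEMENT_NODE
                    && SOAP_ENV.equals(child.getNamespaceURI())
                    && "Header".equals(child.getLocalName());
            if (!isHeader) continue;
            for (Node n = child.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) entries.add((Element) n);
            }
        }
        return entries;
    }

    /**
     * Removes every header entry addressed to myActorUri or to the special
     * "next" actor, and returns the local names of the removed entries.
     */
    static List<String> claimHeaderEntries(Document message, String myActorUri) {
        List<String> claimed = new ArrayList<>();
        // Iterating over a snapshot makes it safe to remove nodes as we go.
        for (Element entry : headerEntries(message)) {
            String actor = entry.getAttributeNS(SOAP_ENV, "actor");
            if (actor.equals(myActorUri) || actor.equals(NEXT_ACTOR)) {
                // A real intermediary would act on the entry here.
                claimed.add(entry.getLocalName());
                entry.getParentNode().removeChild(entry);
            }
        }
        return claimed;
    }

    public static void main(String[] args) {
        Document message = parse(
              "<SOAP-ENV:Envelope xmlns:SOAP-ENV='" + SOAP_ENV + "'>"
            + "<SOAP-ENV:Header>"
            + "<Routing xmlns='urn:example:routing'"
            + " SOAP-ENV:actor='http://example.com/gateway'/>"
            + "<Payment xmlns='http://namespaces.cafeconleche.org/xmljava/ch2/'/>"
            + "</SOAP-ENV:Header>"
            + "<SOAP-ENV:Body/>"
            + "</SOAP-ENV:Envelope>");
        System.out.println("Claimed: "
                + claimHeaderEntries(message, "http://example.com/gateway"));
        System.out.println("Entries left to forward: "
                + headerEntries(message).size());
    }
}
```

Entries with no actor attribute are deliberately left alone, since those belong to the final processor in the chain.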

The exact scheme for forwarding SOAP messages from one processor to the next is system dependent. For example, you might set up a gateway server outside the firewall to verify certain characteristics of a SOAP message before forwarding it to a processor inside the firewall. Such a gateway would either block or forward each message. A switching processor might inspect the body of the message and forward the request to different SOAP processors depending on what it saw there. Some systems might even use routing information included in the messages themselves.

SOAP Limitations

Regrettably, in my opinion, SOAP does not allow developers to take full advantage of XML's expressiveness and extensibility. First of all, according to the SOAP 1.1 specification, "A SOAP message MUST NOT contain a Document Type Declaration." This allows non-validating parsers and parsers that cannot resolve external entities to process SOAP messages without concern that they may be misinterpreting them because they don't apply attribute defaults from the DTD (including defaulted namespace declarations) or resolve external entities. However, it also means the document can't be validated against a DTD.

Also according to the SOAP 1.1 specification, "A SOAP message MUST NOT contain Processing Instructions." Honestly, this makes little sense to me; I see little reason for forbidding these. The restriction does mean that all information in a SOAP request must be passed through the defined SOAP structure, but it also makes it difficult to include other useful features beyond that structure. The most obvious consequence is that you can't easily apply a stylesheet to a SOAP document, although that's not a huge loss because SOAP documents aren't meant for humans to read in the first place. However, it also means that it's difficult to serve SOAP documents out of the Cocoon application server. There are probably many other environment-specific cases where this becomes inconvenient.
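
Both restrictions are easy to check mechanically. The following Java sketch parses a document with the JDK's DOM parser and reports any document type declaration or processing instruction it finds. The class SoapRestrictionChecker is a hypothetical name of my own, and the sketch assumes the input is trusted enough to parse with default parser settings; a production system facing untrusted input would want a hardened parser configuration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical checker for the two SOAP 1.1 restrictions quoted above.
class SoapRestrictionChecker {

    /** Returns a description of each violation; an empty list means none. */
    static List<String> findViolations(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }

        List<String> violations = new ArrayList<>();
        if (doc.getDoctype() != null) {
            violations.add("document type declaration");
        }
        collectProcessingInstructions(doc, violations);
        return violations;
    }

    // Walks the whole tree, because a processing instruction may appear
    // in the prolog, inside any element, or after the root element.
    private static void collectProcessingInstructions(
            Node node, List<String> violations) {
        if (node.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
            violations.add("processing instruction: " + node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collectProcessingInstructions(child, violations);
        }
    }

    public static void main(String[] args) {
        String message =
              "<?xml version='1.0'?>"
            + "<?xml-stylesheet type='text/xsl' href='order.xsl'?>"
            + "<SOAP-ENV:Envelope"
            + " xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/'>"
            + "<SOAP-ENV:Body/>"
            + "</SOAP-ENV:Envelope>";
        findViolations(message).forEach(System.out::println);
    }
}
```

Note that the XML declaration on the first line is not a processing instruction, despite its similar syntax, so it is not reported.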

Validating SOAP

SOAP is actively hostile to DTDs. The specification explicitly forbids a SOAP request from containing a document type declaration. Thus you really have to use a schema to validate your documents, if you validate them at all.

Unlike XML-RPC, SOAP does have an official schema. In fact it has two, which you can download from the SOAP namespace URLs. The envelope schema at http://schemas.xmlsoap.org/soap/envelope/ describes the SOAP complex types: SOAP-ENV:Envelope, SOAP-ENV:Header, SOAP-ENV:Body, and so on. The encoding schema at http://schemas.xmlsoap.org/soap/encoding/ defines the SOAP data types listed in Figure: SOAP-ENC:int, SOAP-ENC:NMTOKENS, SOAP-ENC:gYear, and so on. You can find these schemas in Appendix B.

XML-RPC is a monolithic XML application not designed to be integrated with other XML applications. SOAP, by contrast, is incomplete without some other XML application to form the body of the SOAP request. Thus the SOAP schema cannot be monolithic. Because it must rely on some other XML application in its own namespace (or perhaps no namespace at all, although this is not recommended), the SOAP schema cannot on its own validate any SOAP documents. The developer must also provide a separate schema for the document bodies, and then merge the two together using xsd:import elements.

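The following figure is a master schema for quote request documents such as Figure. This schema declares no elements of its own but does pull in both SOAP schemas, as well as the schema for getQuote elements seen earlier in Figure. Because the master schema shares the envelope schema's target namespace, it brings that schema in with xsd:include; the encoding schema and the local trading schema belong to other namespaces and so are brought in with xsd:import. This schema can be used to validate a complete SOAP request that has a getQuote body element. If you wanted to validate the other SOAP documents in this chapter that use other elements in the header and body, you would just need to write declarations for those elements too. Those in the http://namespaces.cafeconleche.org/xmljava/ch2/ namespace could go in trading.xsd; elements from other namespaces, such as Signature, need their own schema documents, imported in the same way.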

25 A Master Schema for SOAP Trading Documents
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://schemas.xmlsoap.org/soap/envelope/">

  <!-- Standard SOAP schemas -->
  <xsd:include
    schemaLocation="http://schemas.xmlsoap.org/soap/envelope/"
  />
  <xsd:import
    schemaLocation="http://schemas.xmlsoap.org/soap/encoding/"
    namespace="http://schemas.xmlsoap.org/soap/encoding/"
  />

  <!-- Local schema -->
  <xsd:import schemaLocation="trading.xsd"
    namespace="http://namespaces.cafeconleche.org/xmljava/ch2/"
  />

</xsd:schema>
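
Given a master schema like this one, validation from Java can be sketched with the javax.xml.validation package that ships with the JDK. Treat the following as a minimal sketch, not a finished tool. The class name SoapValidator is hypothetical, the file names are whatever you pass on the command line, and compiling the master schema assumes that the schemaLocation URLs it references can actually be fetched from where the program runs.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical command-line validator; names are for illustration only.
class SoapValidator {

    /** Returns the validation error messages; an empty list means valid. */
    static List<String> validationErrors(Source masterSchema, Source message) {
        Schema schema;
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Compiling the master schema also loads everything it
            // includes or imports.
            schema = factory.newSchema(masterSchema);
        } catch (SAXException e) {
            throw new IllegalArgumentException("Could not compile schema", e);
        }

        List<String> errors = new ArrayList<>();
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                // Warnings do not make a document invalid.
            }
            @Override public void error(SAXParseException e) {
                errors.add(e.getMessage()); // keep going to collect them all
            }
            @Override public void fatalError(SAXParseException e)
                    throws SAXException {
                throw e; // malformed input; no point continuing
            }
        });
        try {
            validator.validate(message);
        } catch (SAXException e) {
            errors.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java SoapValidator master.xsd message.xml");
            return;
        }
        List<String> errors = validationErrors(
                new StreamSource(new File(args[0])),
                new StreamSource(new File(args[1])));
        if (errors.isEmpty()) {
            System.out.println("Valid");
        } else {
            errors.forEach(System.out::println);
        }
    }
}
```

Passing File-based sources matters here: it gives the schema a base location, so a relative schemaLocation such as trading.xsd can be resolved against the directory that holds the master schema.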
