Apache Cocoon






In Sections 10.2 and 10.3, we described a B2B approach to processing XML documents using servlets and JSP. In the future, more and more services on the Web will accept and provide XML documents. We can regard such an XML-based service as an XML data source that can be accessed via HTTP, which means there can be many XML data sources on the Web.

Under such circumstances, we may want to support both B2B (a machine client) and B2C (a human client) with a single XML-based service in some cases. In one case, we may need to provide an XML document as it is for a machine client and a human-readable document, such as an HTML document, for a human client. In another case, we may want to integrate multiple data sources provided as XML documents.

One possibility is to use Apache Cocoon. Cocoon is commonly regarded as middleware for XML-based Web publishing, but we believe it can also serve the two goals just described. In this section, we describe Cocoon and explain why.

This section is not a general introduction to Cocoon. Rather, we focus on how to achieve our two goals by using Cocoon.

1 Having Well-Grounded Goals

The Need for Document Distribution for Various Web Clients

Today, the Web is used worldwide through many kinds of Web clients: PCs, mobile phones, application programs, and so on. These clients differ in CPU speed, memory capacity, I/O devices, and expected data format, such as HTML or XML. For example, an HTML document (if that is the expected data format) is displayed differently on each client, and sometimes a client may not be able to display the whole document.

Until now, we have had to provide different documents for different types of clients. However, as the types of clients become more diverse, this kind of ad hoc approach becomes difficult to maintain.

One solution is to generate the various documents from a single XML document, as shown in Figure 10.9. Here, the XML document is regarded as common logical data that should be maintained as first-class data for an application. We call this approach multichanneling of an XML document.

Figure 10.9 Multichanneling using Cocoon

graphics/10fig09.gif

Integrating Multiple XML Documents

Figure 10.10 illustrates multiple XML data sources at different locations. In this case, we may want to output a single XML document to clients by integrating these XML documents.

Figure 10.10 Merging multiple XML documents

graphics/10fig10.gif

The XML document shown in Listing 10.19 is an aggregation of the stock prices of different companies. Each stock price is collected from a different stock quote service of the kind described in Sections 10.2 and 10.3.

Listing 10.19 Integrating multiple XML documents
<?xml version="1.0" encoding="UTF-8"?>
<StockQuotes
  xmlns="http://www.example.com/xmlbook2/chap10/stockquote">
  <StockQuote company="IBM">
    <price>134</price>
  </StockQuote>
  <StockQuote company="ABC">
    <price>52</price>
  </StockQuote>
  <StockQuote company="XYZ">
    <price>83</price>
  </StockQuote>
</StockQuotes>

The Need for XML-Based Content Management

As we described, there is a need to manage XML documents as a common logical data source, to integrate and transform them if necessary, and finally to send them to Web clients. We call this XML-based content management.

2 Integrating and Multichanneling XML Documents Using Cocoon

In this section, we describe how to use Cocoon to integrate multiple data sources into a single XML document and, if necessary, transform that document for the Web client.

Stock Quote Aggregation Service

We use the example of the stock quote service shown in the previous section. First, we prepare StockQuote.xml, shown in Listing 10.20.

Listing 10.20 An XML document that aggregates response XML documents from multiple stock quote services using Cocoon, StockQuote.xml
<?xml version="1.0" encoding="UTF-8"?>

<!-- Stylesheet for Web browsers -->
<?xml-stylesheet href="StockQuote-HTML.xsl"
  type="text/xsl"?>
<!-- Stylesheet for Java clients -->
<?xml-stylesheet href="StockQuote-XML.xsl"
  type="text/xsl" media="java"?>
<!-- Processing instructions for Cocoon  -->
<?cocoon-process type="xsp"?>
<?cocoon-process type="xslt"?>

<!-- XSP (eXtensible Server Pages)  -->
<xsp:page
  xmlns:xsp="http://www.apache.org/1999/XSP/Core"
  xmlns:util="http://www.apache.org/1999/XSP/Util">
  <StockQuotes
    xmlns="http://www.example.com/xmlbook2/chap10/stockquote">
    <util:include-uri href=
"http://demohost:8080/xmlbook2/chap10/StockQuote3.jsp?company=IBM"/>
    <util:include-uri href=
"http://demohost:8080/xmlbook2/chap10/StockQuote3.jsp?company=ABC"/>
    <util:include-uri href=
"http://demohost:8080/xmlbook2/chap10/StockQuote3.jsp?company=XYZ"/>
  </StockQuotes>
</xsp:page>

StockQuote.xml aggregates the stock prices for IBM, ABC, and XYZ, transforms them, and returns the result to Web clients. It returns an HTML table to Web browsers, while it returns an XML document as is to Java clients.

You can open the following URL for StockQuote.xml with a Web browser and see an HTML table, as shown in Figure 10.11.

http://demohost:8080/xmlbook2/chap10/cocoon/StockQuote.xml

Figure 10.11 A screenshot of the Web browser showing StockQuote.xml

graphics/10fig11.gif

You can access the same URL with a Java client by running the following command. Note that the command is actually a single line but is wrapped for printing.

R:\samples>java chap10.stockquote.StockQuoteClient
 http://demohost:8080/xmlbook2/chap10/cocoon/StockQuote.xml

StockQuoteClient is a simple program that sends an HTTP GET request to the specified URL and returns the response. The result is the same aggregated XML document shown in Listing 10.19.[6]

[6] Note that some redundant namespace declarations will be embedded in the XML document.
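The source of StockQuoteClient is not shown in this section. As a minimal sketch of what such a client might look like, the following hypothetical class, SimpleGetClient (our own name, not the book's program), sends an HTTP GET request and prints the response body. It assumes the response is encoded in UTF-8. It sets no User-Agent header, so the JDK's default value, which contains "Java/" followed by the runtime version, is sent. That is one way a server could recognize a Java client, although whether Cocoon's media="java" rule keys on exactly this string is an assumption here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a client such as StockQuoteClient. */
public class SimpleGetClient {

    /** Sends an HTTP GET request to the given URL and returns the response body. */
    public static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        // No User-Agent is set, so the JDK default ("Java/<version>") is sent.
        try (InputStream in = conn.getInputStream()) {
            // Assumption: the service answers in UTF-8.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java SimpleGetClient <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Run against the StockQuote.xml URL above, a client of this kind would print whatever document the server selects for it.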

As this example shows, Cocoon allows you not only to collect stock prices from multiple XML document sources but also to transform the information to generate different output for different types of clients.

Program Details

Next we describe the details of StockQuote.xml.

The following code fragment associates this XML document with XSLT stylesheets. The xml-stylesheet processing instruction (PI) is defined in the W3C Recommendation "Associating Style Sheets with XML Documents Version 1.0."

<!-- Stylesheet for Web browsers -->
<?xml-stylesheet href="StockQuote-HTML.xsl"
  type="text/xsl"?>
<!-- Stylesheet for Java clients -->
<?xml-stylesheet href="StockQuote-XML.xsl"
  type="text/xsl" media="java"?>

The first xml-stylesheet PI associates this document with the StockQuote-HTML.xsl stylesheet. It has no media attribute, so it serves as the default stylesheet and converts this document to an HTML document for Web browsers. The second xml-stylesheet PI associates this document with the StockQuote-XML.xsl stylesheet, which outputs this document as is to Java clients. Cocoon distinguishes clients by checking the User-Agent header of the HTTP request, selects the appropriate stylesheet, and applies it to this document to generate the output. Note that if more than one stylesheet matches the same client, the more specific stylesheet is adopted. In this case, a Java client matches both StockQuote-HTML.xsl and StockQuote-XML.xsl, so StockQuote-XML.xsl is adopted.
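The selection rule can be illustrated with a small sketch. The class below is hypothetical and is not Cocoon's implementation: StylesheetSelector, its methods, and the idea of recognizing a media type by a substring of the User-Agent value are all assumptions made for illustration. It returns the stylesheet registered for the first media type whose signature appears in the User-Agent header and falls back to the default stylesheet otherwise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: picks a stylesheet href from a User-Agent value. */
public class StylesheetSelector {
    // Hypothetical table: media name -> substring expected in the User-Agent header.
    private final Map<String, String> signatures = new LinkedHashMap<>();
    // Media name -> stylesheet href, as declared by xml-stylesheet PIs with a media attribute.
    private final Map<String, String> stylesheets = new LinkedHashMap<>();
    // Href of the xml-stylesheet PI that has no media attribute.
    private final String defaultStylesheet;

    public StylesheetSelector(String defaultStylesheet) {
        this.defaultStylesheet = defaultStylesheet;
    }

    public StylesheetSelector signature(String media, String userAgentSubstring) {
        signatures.put(media, userAgentSubstring);
        return this;
    }

    public StylesheetSelector stylesheet(String media, String href) {
        stylesheets.put(media, href);
        return this;
    }

    /** A media-specific stylesheet wins over the default when the client matches it. */
    public String select(String userAgent) {
        if (userAgent != null) {
            for (Map.Entry<String, String> entry : signatures.entrySet()) {
                String href = stylesheets.get(entry.getKey());
                if (href != null && userAgent.contains(entry.getValue())) {
                    return href;
                }
            }
        }
        return defaultStylesheet;
    }

    public static void main(String[] args) {
        StylesheetSelector selector = new StylesheetSelector("StockQuote-HTML.xsl")
            .signature("java", "Java")
            .stylesheet("java", "StockQuote-XML.xsl");
        System.out.println(selector.select("Java/1.8.0"));
        System.out.println(selector.select("Mozilla/4.0 (compatible; MSIE 6.0)"));
    }
}
```

Keeping the default outside the lookup tables mirrors the rule in the text: every client matches the default stylesheet, and a media-specific match overrides it.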

Listing 10.21 shows StockQuote-HTML.xsl.

Listing 10.21 An XSLT stylesheet that converts a response XML document from a stock quote service into an HTML document, StockQuote-HTML.xsl
    <?xml version="1.0" encoding="UTF-8"?>

    <xsl:stylesheet
      version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:sq="http://www.example.com/xmlbook2/chap10/stockquote"
      exclude-result-prefixes="sq">

      <!-- Specifies the output format as HTML -->
      <xsl:output
        method="html"
[12]    media-type="text/html"
[13]    encoding="UTF-8"/>

[15]  <!-- Templates -->
[16]  <xsl:template match="/">
[17]    <HTML lang="en">
[18]      <HEAD>
[19]        <TITLE>Stock Quote in HTML</TITLE>
[20]      </HEAD>
[21]      <BODY>
[22]        <TABLE border="1">
[23]          <TR><TD>Company</TD><TD>Price</TD></TR>
[24]          <xsl:apply-templates select="//sq:StockQuote"/>
[25]        </TABLE>
[26]      </BODY>
[27]    </HTML>
[28]  </xsl:template>

[30]  <xsl:template match="sq:StockQuote">
[31]    <TR>
[32]      <TD><xsl:value-of select="@company"/></TD>
[33]      <TD><xsl:value-of select="sq:price"/></TD>
[34]    </TR>
[35]  </xsl:template>
    </xsl:stylesheet>

The xsl:output element specifies HTML as the output format, together with the media type and encoding that are written as the Content-Type in a META element of the HTML document (lines 12–13). The template that matches the document root "/" outputs the whole HTML document containing a table (lines 15–28). The template that matches the StockQuote element outputs each row of the table (lines 30–35).
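To see the two templates at work outside Cocoon, the stylesheet can be applied with the standard JAXP transformation API. The following is a sketch under stated assumptions: HtmlTableTransform is a hypothetical class, the stylesheet is a condensed copy of Listing 10.21 (the HEAD element is omitted), and the input document is passed in as a string rather than produced by Cocoon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Applies a condensed version of StockQuote-HTML.xsl by using JAXP. */
public class HtmlTableTransform {
    static final String NS = "http://www.example.com/xmlbook2/chap10/stockquote";

    // Condensed stylesheet: same two templates as the listing, without HEAD.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:sq='" + NS + "' exclude-result-prefixes='sq'>"
      + "<xsl:output method='html' encoding='UTF-8'/>"
      + "<xsl:template match='/'>"
      + "<HTML><BODY><TABLE border='1'>"
      + "<TR><TD>Company</TD><TD>Price</TD></TR>"
      + "<xsl:apply-templates select='//sq:StockQuote'/>"
      + "</TABLE></BODY></HTML>"
      + "</xsl:template>"
      + "<xsl:template match='sq:StockQuote'>"
      + "<TR><TD><xsl:value-of select='@company'/></TD>"
      + "<TD><xsl:value-of select='sq:price'/></TD></TR>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms an aggregated StockQuotes document into an HTML table. */
    public static String transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<StockQuotes xmlns='" + NS + "'>"
            + "<StockQuote company='IBM'><price>134</price></StockQuote>"
            + "<StockQuote company='ABC'><price>52</price></StockQuote>"
            + "</StockQuotes>";
        System.out.println(transform(xml));
    }
}
```

Each StockQuote element in the input becomes one table row, which is the behavior the second template describes.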

Listing 10.22 shows StockQuote-XML.xsl.

Listing 10.22 An XSLT stylesheet that returns a response XML document from a stock quote service as is, StockQuote-XML.xsl
    <?xml version="1.0" encoding="UTF-8"?>

    <xsl:stylesheet
      version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.example.com/xmlbook2/chap10/stockquote">
      <xsl:output
[8]     method="xml"
[9]     media-type="application/xml"
[10]    encoding="UTF-8"/>

      <xsl:template match="/">
[13]    <xsl:processing-instruction name="cocoon-format">
[14]      type="application/xml"
        </xsl:processing-instruction>
[16]    <xsl:apply-templates/>
      </xsl:template>

      <xsl:template match="*">
        <xsl:copy>
          <xsl:apply-templates select="@*|*|text()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="@*">
        <xsl:copy/>
      </xsl:template>
    </xsl:stylesheet>

The xsl:output element specifies the output format as XML (line 8), the media type (line 9), and the encoding (line 10). The template that matches the document root "/" outputs the cocoon-format PI as a directive for Cocoon. It indicates that the media type of the document generated by this stylesheet is application/xml (lines 13–14). The xsl:apply-templates instruction then applies templates to the children of the root, that is, to the document element (line 16). The remaining templates copy the elements, attributes, and text of the input XML document to the output as is.
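The pass-through behavior can be checked in the same way with JAXP. This sketch uses a hypothetical class, PassThroughTransform, and a condensed copy of Listing 10.22. Outside Cocoon, the cocoon-format PI has no special effect and simply appears in the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Applies a condensed version of StockQuote-XML.xsl by using JAXP. */
public class PassThroughTransform {
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' encoding='UTF-8'/>"
      // Root template: emit the PI, then process the document element.
      + "<xsl:template match='/'>"
      + "<xsl:processing-instruction name='cocoon-format'>"
      + "type=\"application/xml\""
      + "</xsl:processing-instruction>"
      + "<xsl:apply-templates/>"
      + "</xsl:template>"
      // Copy every element, recursing into attributes, child elements, and text.
      + "<xsl:template match='*'>"
      + "<xsl:copy><xsl:apply-templates select='@*|*|text()'/></xsl:copy>"
      + "</xsl:template>"
      // Copy every attribute.
      + "<xsl:template match='@*'><xsl:copy/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Returns the input document copied as is, preceded by the cocoon-format PI. */
    public static String transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<StockQuotes xmlns='http://www.example.com/xmlbook2/chap10/stockquote'>"
            + "<StockQuote company='IBM'><price>134</price></StockQuote>"
            + "</StockQuotes>";
        System.out.println(transform(xml));
    }
}
```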

The following part of StockQuote.xml calls the stock quote JSP service dynamically by using Extensible Server Pages (XSP).[7] XSP is a technology for generating dynamic content, similar to JSP, that is being developed by the Apache Cocoon project. It is still in draft form and is subject to change. See the Cocoon Web site (http://xml.apache.org/cocoon1/) for details.

[7] Note that we used StockQuote3.jsp, which is a variant of StockQuote2.jsp. It does not output the XML declaration, as Cocoon requires.

The results of the JSP calls are embedded in StockQuote.xml.

<!-- XSP (eXtensible Server Pages)  -->
<xsp:page
  xmlns:xsp="http://www.apache.org/1999/XSP/Core"
  xmlns:util="http://www.apache.org/1999/XSP/Util">
  <StockQuotes xmlns="http://www.example.com/xmlbook2/chap10/stockquote">
    <util:include-uri
href="http://demohost:8080/xmlbook2/chap10/StockQuote3.jsp?company=IBM"/>
    <util:include-uri
href="http://demohost:8080/xmlbook2/chap10/StockQuote3.jsp?company=ABC"/>
    <util:include-uri
href="http://demohost:8080/xmlbook2/chap10/StockQuote3.jsp?company=XYZ"/>
  </StockQuotes>
</xsp:page>
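Conceptually, each util:include-uri element is replaced by the XML returned from its URL, so the StockQuotes element ends up containing one StockQuote element per service. The following DOM-based sketch shows the same kind of aggregation in plain Java. It is not how XSP is implemented: QuoteAggregator is a hypothetical class, and the fragments are passed in as strings instead of being fetched over HTTP. We assume that each service returns a single StockQuote element without an XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Illustrative only: merges StockQuote fragments under one StockQuotes root. */
public class QuoteAggregator {
    static final String NS = "http://www.example.com/xmlbook2/chap10/stockquote";

    public static String aggregate(List<String> fragments) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Create the enclosing StockQuotes element.
        Document merged = builder.newDocument();
        Element root = merged.createElementNS(NS, "StockQuotes");
        merged.appendChild(root);

        // Parse each fragment and copy its document element into the merged tree.
        for (String fragment : fragments) {
            Document part = builder.parse(new InputSource(new StringReader(fragment)));
            root.appendChild(merged.importNode(part.getDocumentElement(), true));
        }

        // Serialize the merged document with an identity transformation.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(merged), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(aggregate(List.of(
            "<StockQuote xmlns='" + NS + "' company='IBM'><price>134</price></StockQuote>",
            "<StockQuote xmlns='" + NS + "' company='ABC'><price>52</price></StockQuote>")));
    }
}
```

Because each fragment carries its own namespace declaration, the serialized result may contain redundant declarations, much as noted for the Cocoon output.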

In this section, we introduced Cocoon because we believe the concept of XML-based content management will become more popular in the near future. We described such a concept of content management by using Cocoon, although there are many features of Cocoon that we did not describe.

