April 13, 2011, 8:38 p.m.
posted by equivalent
SAX filters sit between a client application and a parser (XMLReader) and intercept calls from the client application to the parser. A SAX filter is an instance of a class that implements the org.xml.sax.XMLFilter interface, a subinterface of XMLReader. Thus a SAX filter is also a parser, albeit one that receives its data from another XML parser rather than by reading an XML document directly.
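To make the arrangement concrete, here is a minimal sketch of a filter placed between a client handler and a JAXP-supplied parser. The class name FilterChainSketch and the method elementNames are invented for this illustration; the filter is a plain XMLFilterImpl, so it forwards every event unchanged.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class: the name and method are assumptions for illustration.
class FilterChainSketch {

    // Parses the given XML through a pass-through filter and returns the
    // element names, in document order, separated by single spaces.
    static String elementNames(String xml) throws Exception {
        // The real parser that reads the document text.
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // The filter is itself an XMLReader; its parent is the real parser.
        XMLFilter filter = new XMLFilterImpl();
        filter.setParent(parser);

        final StringBuilder names = new StringBuilder();
        // The client registers its handler with the filter, not with the parser.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (names.length() > 0) {
                    names.append(' ');
                }
                names.append(qName);
            }
        });

        // The client asks the filter to parse; the filter delegates to its parent.
        filter.parse(new InputSource(new StringReader(xml)));
        return names.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c/></a>"));
    }
}
```

The client code never touches the underlying parser after calling setParent(); from its point of view the filter is the parser.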
The easiest way to write a SAX filter is to subclass org.xml.sax.helpers.XMLFilterImpl, which implements several interfaces including XMLFilter, ContentHandler, DTDHandler, and ErrorHandler. XMLFilterImpl intercepts calls in both directions, from the client to the parser and from the parser to the client. By default it passes all events along unchanged. By overriding the standard methods of the various handler interfaces, you can change the data a client application receives from the parser. This gives the filter the opportunity to log, modify, block, supplement, or replace each call.
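As a rough illustration of overriding a single handler method, the sketch below subclasses XMLFilterImpl and upper-cases character data before passing it on. The class name UppercaseFilter and the helper textOf are made up for this example; every other event uses the inherited pass-through behavior.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: modifies text content, forwards all other events as-is.
class UppercaseFilter extends XMLFilterImpl {

    UppercaseFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Build a new array rather than editing the parser's buffer in place;
        // upper-casing can also change the length for some characters.
        char[] upper = new String(ch, start, length)
                .toUpperCase(Locale.ROOT)
                .toCharArray();
        super.characters(upper, 0, upper.length);
    }

    // Helper for the demo (an assumption, not part of SAX): returns all
    // character data the client sees after filtering.
    static String textOf(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        UppercaseFilter filter = new UppercaseFilter(parser);

        final StringBuilder text = new StringBuilder();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOf("<msg>hello <b>sax</b></msg>"));
    }
}
```

Calling super.characters() is what forwards the modified event to the client's handler; leaving that call out would block the event instead.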
The parser normally verifies document well-formedness against the original text document. Most client applications assume that the data they receive through SAX is well-formed. For example, they assume there is exactly one root element. However, a filter can violate these assumptions. For example, it could filter out the root element's start-tag and end-tag while leaving the contents intact, or pass illegal characters such as null and vertical tab to the characters() method. Unless you know precisely how your handlers will respond to such malformed event streams, take care to ensure that your filters maintain well-formedness and, if necessary, validity.
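One way to block content without unbalancing the event stream is to drop an element together with its entire subtree, so that every start-tag the client sees still has its matching end-tag. The following is only a sketch under a few assumptions: the names SkipElementFilter and filteredEvents are invented, elements are matched by qualified name, and the skipped element is assumed not to be the root (dropping the root would leave the client with no root element at all). It also does not suppress processing instructions or ignorable whitespace inside the skipped subtree.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: removes every element with the given qualified name,
// including all of its descendants and text, keeping start/end events balanced.
class SkipElementFilter extends XMLFilterImpl {
    private final String skippedName;
    private int skipDepth = 0; // > 0 while inside a skipped subtree

    SkipElementFilter(String skippedName) {
        this.skippedName = skippedName;
    }

    @Override
    public void startDocument() throws SAXException {
        skipDepth = 0; // reset so the filter can be reused across parses
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (skipDepth > 0 || qName.equals(skippedName)) {
            skipDepth++; // swallow this start-tag and track nesting
            return;
        }
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (skipDepth > 0) {
            skipDepth--; // swallow the end-tag that matches a swallowed start-tag
            return;
        }
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }

    // Demo helper (an assumption): renders the events the client receives as
    // a simple tag string, ignoring attributes.
    static String filteredEvents(String xml, String skippedName) throws Exception {
        SkipElementFilter filter = new SkipElementFilter(skippedName);
        filter.setParent(SAXParserFactory.newInstance().newSAXParser().getXMLReader());

        final StringBuilder out = new StringBuilder();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                out.append('<').append(qName).append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filteredEvents(
                "<doc><keep>a</keep><secret>b<x>c</x></secret><keep>d</keep></doc>",
                "secret"));
    }
}
```

The depth counter is the important design choice here: suppressing only the start-tag and end-tag of the named element would let its children and text leak through, which is exactly the kind of altered structure a downstream handler may not expect.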