March 3, 2011, 5:06 a.m.
posted by oxy
Item 24: Consider rolling your own communication proxies
Recall, for a moment, that one of the benefits of an RPC-based communication system like RMI is that it's easy for developers to work with—it looks like a local object, smells like a local object, and acts like a local object. All the ugly communications stuff is buried under the hood of the local object, a Proxy [GOF, 207] to the remote object. This model is so tempting that despite all the problems associated with remote objects (see Items 5 and 17 for details), it's still widely used.
So why give it up?
Once we recognize that the temptation of the model lies in the local object proxy, it's easy to realize that nothing stops us from rolling our own proxies, giving us the opportunity to provide whatever optimizations and/or different implementations we desire. Want to use HTTP as the communications backbone? Write a proxy that serializes the parameters and ships that as the body of an HTTP request to a servlet that unpacks it and executes the call. Want to avoid some of the problems with objects-first persistence schemes? Write a proxy that eager-loads data, lazy-loads data, or even eager-loads parts of the data and lazy-loads the rest.
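The first idea, marshaling a call into the body of an HTTP request, can be sketched with a `java.lang.reflect.Proxy`. All names here (`Greeter`, `marshalCall`, and so on) are hypothetical, and the HTTP hop itself is simulated in-process: the bytes that would travel as a POST body are handed straight to the dispatcher that a servlet would run on the server side.

```java
import java.io.*;
import java.lang.reflect.*;

// The business interface the client programs against (hypothetical).
interface Greeter {
    String greet(String name);
}

// Server-side implementation the "servlet" would dispatch to.
class GreeterImpl implements Greeter {
    public String greet(String name) { return "Hello, " + name; }
}

class HttpishProxyDemo {
    // Client side: serialize the call into the bytes we would write
    // into the body of an HTTP POST.
    static byte[] marshalCall(String method, Object[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(method);
            out.writeObject(args);
        }
        return bos.toByteArray();
    }

    // Server side: what the servlet would do with the request body --
    // unpack the call and execute it against the real object.
    static Object unmarshalAndInvoke(byte[] body, Object target) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(body))) {
            String method = (String) in.readObject();
            Object[] args = (Object[]) in.readObject();
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) types[i] = args[i].getClass();
            return Greeter.class.getMethod(method, types).invoke(target, args);
        }
    }

    // Roll the proxy: looks and acts like a local Greeter, but every
    // call is marshaled as if it were crossing the wire.
    static Greeter proxyFor(Object serverObject) {
        return (Greeter) Proxy.newProxyInstance(
            Greeter.class.getClassLoader(),
            new Class<?>[] { Greeter.class },
            (proxy, method, args) -> {
                byte[] requestBody = marshalCall(method.getName(), args);
                // In production this would be an HTTP POST; here we hand
                // the bytes straight to the server-side dispatcher.
                return unmarshalAndInvoke(requestBody, serverObject);
            });
    }

    static String demo() throws Exception {
        Greeter g = proxyFor(new GreeterImpl());
        return g.greet("world");  // feels local; actually marshaled
    }
}
```

The point is that nothing in the client's view changes if the transport behind `proxyFor` later switches from HTTP to raw sockets or RMI.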
For example, as discussed in Item 23, one suggested solution to the problem of chatty distributed persistent objects (namely, entity beans) is to pass data in bulk, so that all updates to the underlying data store happen all at once, thus justifying the cost of the remote call. The problem with passing data in bulk is that it tends to bleed away the "object-ness" of the system, meaning clients have to logically navigate the difference between a Data Transfer Object [Fowler, 401] and the API used to persist and/or restore it. If you want a more "objects-first" approach to preserve a bit more of the object-ness of the system domain model, one choice is to use a Half-Object Plus Protocol [AJP, 189] and create "smart data proxies" that cache the data locally until the client is ready to commit the data all at once.
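A "smart data proxy" of this kind can be sketched in a few lines. The `PersonStore` and `PersonProxy` names are invented for illustration; the essential move is that setters only touch a local cache, and a single bulk call carries all accumulated changes at commit time.

```java
import java.util.*;

// Stand-in for the remote-priced persistence API (hypothetical): one
// bulk call that pushes all changed fields to the data store at once.
interface PersonStore {
    void update(Map<String, Object> fields);
}

// The smart data proxy: looks like a plain domain object, but buffers
// writes locally until the client is ready to commit them all at once.
class PersonProxy {
    private final PersonStore store;
    private final Map<String, Object> pending = new LinkedHashMap<>();

    PersonProxy(PersonStore store) { this.store = store; }

    // Setters cache locally -- no network traffic yet.
    void setName(String name) { pending.put("name", name); }
    void setAge(int age)      { pending.put("age", age); }

    // One remote call carries all accumulated changes.
    void commit() {
        store.update(new LinkedHashMap<>(pending));
        pending.clear();
    }
}

class SmartProxyDemo {
    // Count the "remote" calls so the batching effect is visible.
    static int demoRemoteCalls() {
        int[] calls = {0};
        PersonStore store = fields -> calls[0]++;
        PersonProxy p = new PersonProxy(store);
        p.setName("Ada");
        p.setAge(36);
        p.setName("Ada L.");   // three mutations...
        p.commit();            // ...one round-trip
        return calls[0];
    }
}
```

The client keeps an objects-first view of the domain model, while the proxy quietly turns chatty attribute-by-attribute traffic into one bulk update.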
It's fair to ask why this kind of optimization isn't already in place, particularly with respect to EJB entity beans—if the entity bean is just an objects-first way to model data (see Item 40), it seems an obvious optimization for the entity bean stub to batch the transfer of data to the server. Again, the problem here is the wording of the EJB Specification—because access to each attribute of an entity bean must be done under the auspices of a transaction, the data must be pushed through to the underlying data store. (Remember, changes to an entity bean have to be preserved in the event of a server crash.) Because the current EJB Specification doesn't allow for this notion of "local caching," any EJB container providing such an optimization would be officially incompatible. That's not to say that it wouldn't be a useful optimization—just make sure to read Item 11 and decide whether vendor neutrality is important to you before using it. (Note that if you write the entity bean as bean-managed, you can put this optimization in place, but now you're basically writing the proxy all over again.)
In fact, many vendors and open-source projects are starting to do this; for example, many JDO vendors can now fetch just "parts" of an object retrieved by a JDO call, thus eliminating one of the principal criticisms I leveled at objects-first persistence layers (see Item 40). This is a welcome and overdue development because it means that you won't have to create these smart data proxies by hand, as suggested here. If your particular vendor and/or toolkit doesn't do what you want, however, this is your fallback.
Data access is only one possibility.
Most EJB containers make use of RMI for their communications link between EJB clients and the EJB container itself. RMI is a useful object-RPC system, but the reference implementation shipped by Sun has one particularly nasty flaw in it that works contrary to our purposes in a J2EE environment: the RMI stub returned by the RMI server is implicitly "pinned" against a particular server. In other words, the stub you get only knows how to talk to the server it was returned from, and if that server should for some reason stop accepting incoming RMI calls (perhaps the server crashed), your stub is now worthless. Even if the container is distributed across two or more machines in a cluster, the RMI stub you've received can't talk to the other machines in the cluster; you'll have to renegotiate by asking the Home object for a new stub. So wrap it up in a proxy.
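Such a wrapper might look like the following sketch. The `Quote` interface and the `Supplier`-based lookup are stand-ins (the real code would go back through JNDI and the Home), and the "dead" and "live" stubs are simulated in-process; the pattern is simply catch-the-`RemoteException`, renegotiate, retry.

```java
import java.rmi.RemoteException;
import java.util.function.Supplier;

// The business interface the pinned RMI stub implements (hypothetical).
interface Quote {
    double price(String symbol) throws RemoteException;
}

// A smart proxy wrapping the stub: on failure it goes back to the
// Home/registry (the supplier) for a fresh stub and retries once.
class FailoverQuote implements Quote {
    private final Supplier<Quote> lookup;  // stands in for the Home lookup
    private Quote stub;

    FailoverQuote(Supplier<Quote> lookup) {
        this.lookup = lookup;
        this.stub = lookup.get();
    }

    public double price(String symbol) throws RemoteException {
        try {
            return stub.price(symbol);
        } catch (RemoteException dead) {
            stub = lookup.get();        // renegotiate a fresh stub...
            return stub.price(symbol);  // ...and retry transparently
        }
    }
}

class FailoverDemo {
    static double demo() {
        // Simulate a stub pinned to a crashed server, then a healthy one.
        Quote deadStub = s -> { throw new RemoteException("server down"); };
        Quote liveStub = s -> 42.0;
        Quote[] registry = { deadStub };          // first lookup: dead server
        Supplier<Quote> lookup = () -> {
            Quote q = registry[0];
            registry[0] = liveStub;               // cluster fails over
            return q;
        };
        try {
            Quote q = new FailoverQuote(lookup);
            return q.price("EXMP");               // client never sees the crash
        } catch (RemoteException e) {
            return -1.0;                          // retry also failed
        }
    }
}
```

A production version would bound the retries and decide which exceptions are genuinely retryable, but the client-facing interface stays untouched either way.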
Other situations that suggest a smart proxy include a Home proxy that always performs a fresh JNDI lookup to preserve location independence, one that caches the result of a JNDI Home lookup to avoid network round-trips back to the container, or one that expires a cached lookup after, say, 24 hours.
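The caching-with-expiry variant can be sketched as follows. The `Supplier` stands in for the actual `ctx.lookup(name)` round-trip, and the clock is injected so the 24-hour expiry can be exercised without waiting; all names are illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// A caching Home locator: does the (expensive) JNDI lookup once, hands
// back the cached result, and looks up again after the entry goes stale.
class CachingLocator<T> {
    private final Supplier<T> jndiLookup;   // stand-in for ctx.lookup(name)
    private final LongSupplier clock;       // injectable for testing
    private final long ttlMillis;
    private T cached;
    private long fetchedAt;

    CachingLocator(Supplier<T> jndiLookup, long ttlMillis, LongSupplier clock) {
        this.jndiLookup = jndiLookup;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    T get() {
        long now = clock.getAsLong();
        if (cached == null || now - fetchedAt >= ttlMillis) {
            cached = jndiLookup.get();      // round-trip to the container
            fetchedAt = now;
        }
        return cached;                      // otherwise: no network traffic
    }
}

class CachingLocatorDemo {
    // How many real lookups happen across four get() calls when the
    // clock jumps past the 24-hour TTL between calls three and four?
    static int demoLookupCount() {
        AtomicInteger lookups = new AtomicInteger();
        long dayMs = 24L * 60 * 60 * 1000;
        long[] now = {0};
        CachingLocator<String> home = new CachingLocator<>(
            () -> "Home#" + lookups.incrementAndGet(), dayMs, () -> now[0]);
        home.get();            // miss: real lookup
        home.get();            // hit: served from cache
        home.get();            // hit
        now[0] = dayMs + 1;    // 24 hours pass
        home.get();            // stale: second real lookup
        return lookups.get();
    }
}
```

The location-independence variant from the paragraph above is the degenerate case: set the TTL to zero and every `get()` performs a fresh lookup.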
You won't want to do this for all possible proxy situations that come up within your system, obviously—this would represent a tremendous amount of work. Use smart proxies only in situations where the default behavior of the stub or proxy handed back to you isn't what you want, and the cost of creating the smart proxy is justified in the amount of saved network traffic, better failover behavior, or some other tangible, measurable benefit.