July 19, 2011, 8:37 p.m.
posted by oxy
Item 1: Prefer components as the key element of development, deployment, and reuse
One of the difficulties many Java developers face when attempting their first J2EE-based project is that J2EE applications are built differently than traditional Java applications: rather than building applications, J2EE mandates the construction of components that plug in to an already-existing application, that being the J2EE container itself.
This may not seem like a large difference at first, but its implications are huge, in two different directions. First of all, this means that developers aren't really interested in constructing objects per se but in creating strongly encapsulated components that (in the case of J2EE) are made up of tightly coupled constituent objects. Second, it means that there is a set of stringent rules that must be obeyed, rules over which we, as component builders, have little to no influence.
Developers studying the object-oriented paradigm have, since the very early days, been repeatedly taught to promote encapsulation and data hiding. We've all memorized nuggets of wisdom handed down from on high, such as "Never use public fields; instead always create accessor and mutator methods (getters and setters, in JavaBeans lingo)," for example, or "Hide your implementation from client view," and so on. As a result, for every class ever written, we faithfully create get/set method pairs for every private data field, create a default constructor, and so on.
Unfortunately, this is hardly what the original proponents of object-orientation had in mind. Merely forcing clients to go through a get method to get a reference to some internally held data structure doesn't create encapsulation; this has long been documented in such books as Effective C++ [Meyers95] and, more recently, Effective Java [Bloch, Item 24]. More importantly, it leads developers to think at a scale too small for effective reuse, that of objects and classes, rather than at a larger scale, such as what was originally intended by the JavaBeans Specification.
Not convinced? Consider a traditional collection class, say, an ArrayList. We can use the ArrayList as an independent object, but when it comes time to walk over the contents of the collection itself, another object, an Iterator-implementing object, becomes desirable, if not outright necessary. Thinking about reuse at the class/object level means we seriously consider reusing the Iterator-implementing class independently of the ArrayList it's bound to, which makes no sense. Instead, because the two are intended to be used as a pair, it means that we're looking at reuse at a higher level, what we refer to as a component. In practical terms, this usually means a standalone .jar file that contains, in this case, the pair of classes for ArrayList and its internal Iterator subtype. In the Collections case, it would also include relevant interfaces like Iterator, Collection, and List because they all help define the contract clients can place faith in.
This gets us into a discussion of coupling. The definition of Tight Coupling states that a given "thing" is tightly coupled to another "thing" if one has to change in response to changes in the other. In other words, consider ArrayList—if I change the definition of the Collection interface, will ArrayList need to change? Absolutely, since it implements Collection through List. If I change ArrayList's internal implementation, will its internal Iterator implementation need to change? Absolutely. However, will clients need to change? Not so long as they treat the Iterator as a general-purpose Iterator (and don't downcast). ArrayList is tightly coupled to its internal Iterator, and vice versa, but clients can remain loosely coupled by sticking to the interfaces. (Loose coupling is explained in greater detail in Item 2, but I need to forward-reference it here.)
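To make the coupling argument concrete, here is a minimal sketch of a collection and its iterator treated as one component. NameList, ArrayIterator, and NameListClient are hypothetical names invented for illustration; they are not the JDK's ArrayList implementation. The iterator is bound to the collection's internals, while the client sees only the Iterable and Iterator interfaces.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.StringJoiner;

// Hypothetical component: a small collection plus the iterator that only makes sense with it.
final class NameList implements Iterable<String> {
    private final String[] names; // internal representation, free to change

    NameList(String... names) {
        this.names = names.clone();
    }

    @Override
    public Iterator<String> iterator() {
        return new ArrayIterator();
    }

    // Tightly coupled to NameList's internals; clients never see this concrete type.
    private final class ArrayIterator implements Iterator<String> {
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < names.length;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return names[next++];
        }
    }
}

// Hypothetical client: depends only on the Iterable/Iterator contracts and never downcasts.
final class NameListClient {
    static String join(Iterable<String> source) {
        StringJoiner joined = new StringJoiner(", ");
        for (Iterator<String> it = source.iterator(); it.hasNext(); ) {
            joined.add(it.next());
        }
        return joined.toString();
    }
}
```

If NameList switched from an array to a linked structure, ArrayIterator would have to change with it, but NameListClient.join would not.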
Don't see the relevance? Run on over to the servlet-based Web applications. If we obey the traditional Model-View-Controller abstraction, then we're building a minimum of two classes that are more or less tightly coupled to one another: the controller servlet that does the input processing on an incoming HTTP request, and the view JSP that it forwards to. The controller needs to know what data elements the view depends on, the view needs to know what processing has already been done by the controller so as not to duplicate that work, and the two have to agree on the names under which the data elements the view depends on will be bound. (The name-value attributes of HttpSession provide late binding but not loose coupling; again, see Item 2 for more.) In addition, both controller and view are themselves tightly coupled against the model classes, since they will need to know what data elements are present as part of the model(s) used. Ask yourself these questions: Is it really feasible to consider reusing the controller without the corresponding view and/or model classes? Could the view execute successfully without going through the controller servlet first?
Because the answer to these questions is almost universally "no," it means that your classes within a given component, in this case your presentation-layer component, are implicitly tightly coupled to one another. That in turn yields a realization: where tight coupling already exists, we can enjoy a certain amount of relaxation of the traditional "encapsulate everything" rule that we obey so mindlessly. I'm not suggesting that you immediately run out and remove all your get/set methods in favor of direct field access, but I recommend that you think long and hard about what you're really protecting against. For example, if your model objects aren't used outside of the presentation layer (note the terminology; see Item 3) itself, does it really make sense to put get/set methods in front of each field, particularly where the model objects are just thin wrappers over a collection of data, as with Data Transfer Objects [Fowler, 401]? Remember that encapsulation was designed to protect clients against implementation changes, not the component against changes within itself. (Few developers ever saw benefits in trying to encapsulate a class against itself; think of the component as a larger, more coarse-grained class, and you're not too far off the mark.)
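As a rough illustration of that relaxation, consider a thin data holder that never leaves its component. CustomerSummary and CustomerSummaryView are hypothetical names, and the sketch assumes both classes live in the same package. Package-private visibility, rather than get/set pairs, is what keeps outside clients away from the fields.

```java
// Hypothetical Data Transfer Object used only inside the presentation-layer component.
// The class and its fields are package-private, so code outside the package cannot reach them.
final class CustomerSummary {
    final String name;
    final int openOrders;

    CustomerSummary(String name, int openOrders) {
        this.name = name;
        this.openOrders = openOrders;
    }
}

// Hypothetical view helper in the same package: reads the fields directly, no getters involved.
final class CustomerSummaryView {
    static String render(CustomerSummary summary) {
        return summary.name + " (" + summary.openOrders + " open orders)";
    }
}
```

Making the fields final is my own choice for the sketch, not something the argument above requires; the point is only that the component boundary, not each class, is where the protection sits.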
Given that we have classes that tightly cooperate with one another to achieve some useful work, and that those classes need to be deployed together atomically or else not at all, we're looking for something "larger than objects" as the principal unit of deployment: in other words, we're looking for components.
The Servlet Specification has moved away from the idea of standalone servlets being deployed individually into the servlet container and instead embraces the idea of a Web application, a collection of resources like servlets, JSPs, model classes, utility libraries, and static resources (like HTML, images, audio files, and so on) that collectively work together to provide desired functionality. The Web application is deployed collectively under a single .war file, so that there is no possible way for (a) only part of the application to be deployed, or (b) the application to be version-mismatched between its collective parts. In essence, the .war file serves the same purpose that the Java .jar file did in making Java application deployments easier; if you did Java in the 1.0 days, you'll remember the "unzip the classes onto your CLASSPATH" style of deployment and agree that trying to deploy Java applications in those days was less than elegant. Instead, the atomic deployment provided by the .war file means that only those resources that are supposed to be part of the Web application actually show up in the deployed application.
That is, unless you deliberately screw that up by doing partial file-based deployments by copying individual files over into the deployed application directory.
There are a couple of reasons why this is a Bad Idea, not the least of which is the possibility of introducing version mismatches. While it may seem tempting to "only copy the stuff that changed" into the deployment directory, it's too easy for humans to lose track of exactly what has changed and, more importantly, to forget that the servlet container doesn't necessarily take the same view of what has changed as we do.
Enter the ClassLoader. Remember him? He's established by the servlet container to load your Web application from disk into the JVM. The servlet container is required to start a new ClassLoader each time the Web application "changes," which usually means "changes on disk in the deployment directory." But if you read the Servlet Specification carefully, you'll notice that when a new ClassLoader is started, it's established over the entire Web application, not an individual servlet. It does this so that all of the classes that form the Web application are loaded by the same ClassLoader, since the container sees them as a single component. So it's entirely conceivable that each file copied into the deployment directory will yield a new ClassLoader instance, creating a whole ton of extra work for the container and yielding no tangible benefit.
By the way, make sure you understand what tangible effects this ClassLoading policy will yield to you directly by reading Item 70.
The point of all this is that the Servlet (and other J2EE) specifications expect you to build Web applications that combine to form components, not individual classes. The JSP Specification goes one step further: it promotes the construction and use of smaller components within and across Web applications by fostering the concept of reusable tag libraries. The specifications couldn't care less what objects you create and use, so long as those objects that are handed back to the container itself obey certain contracts, as conveyed via interfaces, and certain out-of-band restrictions described in the specification itself.
It's more than just implementing existing interfaces, however. Part of being a component means that because you didn't write main, you don't necessarily know the environment in which your code is being executed. One classic mistake that bites servlet developers the world over as they move from one servlet container to the next is the simple assumption regarding "the current directory": for some servlet containers, it's the directory in which the container's executable files are located (tomcat/bin, for example). Other containers, however, set a "work" directory in which bits and pieces of the Web application are assembled and called. The net result is that if you try to load a text file from your Web application's deployment directory by creating a FileInputStream with an argument of ../webapps/myapp/data.xml, what works on one system will horribly break on another. For this reason, the Servlet Specification suggests using either the ServletContext.getResource or ServletContext.getResourceAsStream methods; methods with the same names are also available on the ClassLoader for the Web application.
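Here is a minimal sketch of the ClassLoader route, written against plain Java SE so it runs outside a container. ResourceReader and readText are hypothetical names, and the sketch assumes Java 9 or later for InputStream.readAllBytes. Inside a servlet you would more likely call ServletContext.getResourceAsStream, which this sketch does not cover.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: looks a resource up by name through a ClassLoader
// instead of guessing where "the current directory" happens to be.
final class ResourceReader {
    static Optional<String> readText(ClassLoader loader, String resourceName) {
        // getResourceAsStream returns null when the resource is not found.
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                return Optional.empty();
            }
            return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller might pass Thread.currentThread().getContextClassLoader() and a name such as data.xml; the name is resolved against whatever the loader was configured with, so the code no longer cares which directory the container started in.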
In fact, this concept of "context" takes on an important meaning in component-based environments—the context, such as the ServletContext in servlet applications, or the enterprise bean context (SessionContext, EntityContext, or MessageDrivenContext) in EJB, is the component's official "window to the outside world," and any and all access to that outside world should take place through the context. This gives the container the opportunity to intercept and redirect application requests to the appropriate place, if the container is doing something tricky under the hood hidden from the code's view. For example, when you want to do a forward from a servlet to a JSP, you are required to go through the ServletContext to get a RequestDispatcher to do the actual forward because a clustering container may have decided to put the JSP page on a different machine than the one executing the servlet. As a result, if you were to directly try to access the servlet instance inside the JVM, such as we used to via the getServlet call, the request would fail miserably.
In some respects, this is also why the Java Naming and Directory Interface (JNDI) was invented—to provide a common API for looking up resources rather than having to use per-specification APIs such as that provided by the RMI Naming class. It's no accident that the JNDI starting point is called an InitialContext.
As may now be apparent to you, writing components is different from what you may have expected from application development. In fact, when writing components you're not doing application development at all; you're writing libraries that are being called by an existing application. Part of being a component instead of an application is that your code must take on the same kinds of characteristics that make writing libraries (again, as opposed to applications) so much fun. For example, in order to best promote individual component flexibility, it's usually better to define types exposed to the library client in terms of interfaces, rather than actual concrete objects [Bloch, Item 16], particularly since that enables your components to provide an interesting "hook point" (see Item 6) for future use. Of course, in large measure this is already true for building components directly accessed by the J2EE container, such as servlets (remember javax.servlet.Servlet?) and EJBs (javax.ejb.SessionBean, javax.ejb.EntityBean, and javax.ejb.MessageDrivenBean). But this can also be true for your own domain classes, such as your HttpSession-bound model objects, for the same reasons. Toward this end, you'll also want to pay careful attention to how clients construct your domain objects [Bloch, Item 1], whether you permit others to inherit from your domain objects [Bloch, Item 15], and what kinds of types you hand back from your components [Bloch, Item 34].
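The library-style advice above might look something like the following for a domain class. This is only a sketch: ShoppingCart, ShoppingCarts, and InMemoryCart are hypothetical names, not part of any J2EE API. Clients see an interface and a static factory, and the concrete class stays private so nobody can inherit from it or depend on it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The contract that clients of the component program against.
interface ShoppingCart {
    void add(String sku, int quantity);

    int quantityOf(String sku);

    int totalItems();
}

// Static factory: the only way clients obtain a cart, so the implementation can be swapped later.
final class ShoppingCarts {
    private ShoppingCarts() {
    }

    static ShoppingCart newCart() {
        return new InMemoryCart();
    }

    // Concrete implementation, hidden from clients and closed to inheritance.
    private static final class InMemoryCart implements ShoppingCart {
        private final Map<String, Integer> lines = new LinkedHashMap<>();

        @Override
        public void add(String sku, int quantity) {
            if (quantity <= 0) {
                throw new IllegalArgumentException("quantity must be positive");
            }
            lines.merge(sku, quantity, Integer::sum);
        }

        @Override
        public int quantityOf(String sku) {
            return lines.getOrDefault(sku, 0);
        }

        @Override
        public int totalItems() {
            return lines.values().stream().mapToInt(Integer::intValue).sum();
        }
    }
}
```

Because callers hold only a ShoppingCart reference, a later version could return, say, a persistent implementation from newCart without any client code changing.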
One important realization from this is the fact that J2EE components, with very little exception, are entirely passive entities; in other words, J2EE components must borrow a logical thread of control from the container in order to carry out any meaningful work. This notion of the logical thread of control, usually expressed as an actual thread itself (in other words, the container calls into your code using a thread that it created, usually in response to an end-user request somewhere back up the chain), is often called an activity or causality, and it means that you shouldn't write components that expect to do anything too obsessive with that borrowed thread—don't go off and calculate pi to the hundredth digit, for example—because the container expects to get that logical thread of control back at some point. If it doesn't, it could very well consider that your component has hung and decide to unload your component instance entirely.
This creates a bit of a quandary within the J2EE Specification because frequently tasks can't be accomplished in any reasonable fashion except by having a thread under personal control. Classic examples are the desire to poll some external resource every n seconds, to perform some kind of maintenance or nightly operation at midnight every night, and so on. This functionality is coming as part of the EJB 2.1 Specification in the form of the Timer Service, but for those working with containers that predate that specification, no standard J2EE solution exists, except to write a standalone application that calls into the container via HTTP request, EJB session bean call, or JMS message queue delivery, thereby giving the container that logical thread of control.
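The standalone-application workaround could be sketched roughly as below, using the java.util.concurrent scheduler from later Java versions. ContainerNudger is a hypothetical name, and the Runnable passed in stands for whatever call you make into the container (an HTTP request, a session bean call, or a JMS send); none of those calls are shown here. The thread belongs to the external process, so the container only ever sees an ordinary incoming request.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical standalone daemon piece: owns its own thread and periodically calls into the container.
final class ContainerNudger {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // callIntoContainer is an assumption: it represents the HTTP, EJB, or JMS call of your choice.
    void start(Runnable callIntoContainer, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                callIntoContainer.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log it and carry on.
                System.err.println("call into container failed: " + e);
            }
        }, 0, period, unit);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

The work itself still runs on a container thread in response to the request; the daemon only supplies the clock.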
Ultimately, again, the key characterization of the J2EE application is its component-centric nature, and as a J2EE developer, you have to play into that model yourself. Failure to do so means swimming upstream against the decisions established by the J2EE container, and in many cases this results in a large amount of code that contradicts the policies established by the container; while you may be able to get away with it in today's version of the container, don't be surprised if tomorrow's version suddenly breaks your code. Instead, go with the current by embracing the component concept, and where you need to escape the container for some reason, do so by writing a standalone daemon process or application.