State Management

To those searching for the truth—not the truth of dogma and darkness but the truth brought by reason, search, examination, and inquiry, discipline is required. For faith, as well intentioned as it may be, must be built on facts, not fiction—faith in fiction is a damnable false hope.

—Thomas Edison

A large part of the work of an enterprise system involves handling data. In fact, arguably, that's the only thing an enterprise system really does.

Although this was never really apparent during the era of the two-tier client/server system, the enterprise programmer needs to take care of two kinds of state: transient state, which is not yet a formal part of the enterprise data footprint, and durable state, which needs to be tracked regardless of what happens.

Transient state is data that the enterprise cares little about—in the event of a crash, nothing truly crucial has been lost, so no tears will be shed. The classic example of transient state is the e-commerce shopping cart. Granted, we don't ever want the system to crash, but let's put the objectivity glasses on for a moment: if the server crashes in the middle of a customer's shopping experience, losing the contents of the shopping cart, nothing is really lost (except for the time the customer spent building it up). Yes, the customer will be annoyed, but there are no implications to the business beyond that.

In a thick-client or rich-client application, transient state is pretty easy to handle: it's just the data stored in local variables in the client-side process that hasn't been saved to the durable storage layer yet. Nothing particularly fancy needs to be done to manage its lifecycle; when the client-side process shuts down, the transient state goes away with it.

In a thin-client, HTML-browser-based application, on the other hand, transient state takes on a whole new dimension. Because HTTP itself is a stateless protocol, with no intrinsic way to store per-client state, we've been forced to implement transient state mechanisms on top of the underlying plumbing. To most developers, this is exposed via the HttpSession mechanism that is part of the Servlet 2.x specifications. Unfortunately, nothing in life comes for free, and HttpSession definitely carries its share of costs, most notably to the scalability of the system as a whole, since every piece of per-client state has to be held somewhere on the server for as long as the session lives. Judicious use of per-client session state is crucial to a system that wants to scale to more than five concurrent users.
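
To make the idea of layering per-client state on top of a stateless protocol concrete, here is a minimal sketch in plain Java. It is not the Servlet API and not how any particular container implements HttpSession; the class name SessionStore and its methods are hypothetical, chosen only for illustration. The assumption is that the server hands each new client an opaque ID (typically carried in a cookie) and keeps that client's transient state in memory under that ID.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the idea behind session tracking; not the Servlet API.
class SessionStore {
    // Per-client transient state, held in server memory and keyed by session ID.
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Called on a client's first request. The returned ID would travel back to
    // the browser (for example, in a cookie) and be resent on later requests.
    String createSession() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Looks up the state for a returning client; null if the ID is unknown.
    Map<String, Object> getSession(String id) {
        return sessions.get(id);
    }

    // Discards a client's state, for example on logout or timeout.
    void invalidate(String id) {
        sessions.remove(id);
    }

    // Every live session occupies server memory, which is where the
    // scalability cost comes from.
    int activeSessions() {
        return sessions.size();
    }
}
```

Note that everything in the map lives only in one process's memory: if that process dies, every cart in it is gone, which is acceptable precisely because the state is transient.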

Durable state, on the other hand, is that which we normally think of when somebody starts to ask about "persistent data"—it's the data that has to be kept around for long periods of time. Officially, we'll define durable state as state that absolutely must be kept around even in the event of a JVM termination or crash, but since we could conceivably come up with situations where we'll want transient state to be stored in a database, it's more convenient to simply say that durable state is state that we care about.

Commonly, durable state has implicit legal and/or financial stakes—if, for example, you write a system that loses the items in a book order after the customer's credit card has been charged, you're exposing the company to a lawsuit, at the very least. Or, to flip it around, if you lose the fact that the customer hasn't been charged when the books ship, you're directly costing the company money and will likely find yourself filling out a new résumé pretty soon.

The distinction between the two is crucial when discussing state management because mechanisms that are useful for keeping transient state around won't necessarily be suitable for tracking durable state and vice versa. In some situations, we see some overlap, where we may want to track users' session state in a relational database in order to avoid any sort of situation where a user might lose that transient state. Perhaps it's not a commerce site at all but a human resources process that requires a long-running collection of forms to fill out, or a test that we don't want students to be able to "throw away" and start over just by closing the browser. In such situations, the distinction between transient and durable state may start to blur, but intuitively, it's usually pretty easy to tell one from the other. Or, arguably, what we perceive at first to be transient state turns out to be durable state after all, as in the case of the students' test or the human resources forms. Either way, drawing this distinction in your mind and designs will help keep straight which mechanisms should be used for state management in your systems.
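
The difference in mechanisms can be sketched in a few lines of Java. This is an illustration under stated assumptions, not a recommended design: the class OrderDesk and its methods are hypothetical, and a synchronously written local file stands in for the transactional database a real enterprise system would use for durable state. The cart lives in an ordinary field and disappears with the object; a placed order is written out before the method returns, so a fresh instance (standing in for a restarted JVM) can still find it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a file stands in for a real durable store.
class OrderDesk {
    // Transient state: lives only in this object's memory.
    private final List<String> cart = new ArrayList<>();
    // Location of the durable record of placed orders.
    private final Path orderLog;

    OrderDesk(Path orderLog) {
        this.orderLog = orderLog;
    }

    void addToCart(String item) {
        cart.add(item);
    }

    List<String> cart() {
        return cart;
    }

    // Turns the transient cart into durable state: one line per order,
    // appended and written synchronously before the cart is cleared.
    void placeOrder() throws IOException {
        String line = String.join(",", cart) + System.lineSeparator();
        Files.write(orderLog, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND,
                StandardOpenOption.SYNC);
        cart.clear();
    }

    // Reads back every order recorded so far, by this instance or an earlier one.
    List<String> placedOrders() throws IOException {
        if (!Files.exists(orderLog)) {
            return new ArrayList<>();
        }
        return Files.readAllLines(orderLog, StandardCharsets.UTF_8);
    }
}
```

If one OrderDesk places an order, adds another item to its cart, and is then discarded, a second OrderDesk constructed on the same file sees the placed order but an empty cart. That is the whole distinction in miniature: the cart was state we could afford to lose, and the order was not.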
