Processing Flow

Processing Flow

Besides all the questions addressed earlier in this chapter, the operating mode is also a matter that may have significant impact on the working system. What I mean by operating mode is whether data should be processed asynchronously (as is the case with batch programs ) or synchronously (as in a typical transactional program).

Batch programs are the historical ancestors of all data processing and are still very much in use today even if no longer very fashionable; synchronous processing is rarely as necessary as one might think. However, the improvement of networks and the increase in bandwidth has led to the "global reach" of an increasing number of applications. As a result, shutting down your online transaction processing (OLTP) application running in the American Midwest may become difficult because of East Asian users connected during one part of the Midwestern night and European users connected during the other part. Batch programs can no longer assume that they are running on empty machines. Moreover, ever-increasing volumes of data may require that incoming data is processed immediately rather than being allowed to accumulate into unmanageably large data sets. Processing streams of data may simply be the most efficient way to manage such quantities.

The way you process data is not without influence on the way you "think" of your system, especially in terms of physical structureswhich I talk about more in Chapter 5. When you have massive batch programs, you are mostly interested in throughputraw efficiency, using as much of the hardware resources as possible. In terms of data processing, a batch program is in the realm of brute force. When you are processing data on the fly, most activity will be small queries that are going to be repeatedly executed a tremendous number of times. For such queries, performing moderately well is not good enoughthey have to perform at the maximum possible efficiency. With an asynchronous program, it is easy to notice that something is wrong (if not always easy to fix): it just takes too long to complete. With synchronous processing, the situation is much more subtle, because performance problems usually show up at the worst moment, when there are surges of activity. If you are not able to spot weaknesses early enough, your system is likely to let you down when your business reaches maximum demand levelsthe very worst time to fail.

A data model is not complete until consideration has also been taken of data flow.

 Python   SQL   Java   php   Perl 
 game development   web development   internet   *nix   graphics   hardware 
 telecommunications   C++ 
 Flash   Active Directory   Windows