Iterative, Concurrent, and Reactive Servers

Servers can be categorized as either iterative, concurrent, or reactive. The primary trade-offs in this dimension involve simplicity of programming versus the ability to scale to increased service offerings and host loads.

Iterative servers handle each client request in its entirety before servicing subsequent requests. While processing a request, an iterative server therefore either queues or ignores additional requests. Iterative servers are best suited for either

  • Short-duration services, such as the standard Internet ECHO and DAYTIME services, that have minimal execution time variation or

  • Infrequently run services, such as a remote file system backup service that runs nightly when platforms are lightly loaded

Iterative servers are relatively straightforward to develop. As Figure (1) illustrates, they often execute their service requests internally within a single process address space, as shown by the following pseudo-code:

void iterative_server()
{
  initialize listener endpoint(s)

  for (each new client request) {
    retrieve next request from an input source
    perform requested service
    if (response required) send response to client
  }
}
Figure 1. Iterative/Reactive versus Concurrent Servers (graphics/05fig01.gif)

Due to this iterative structure, the processing of each request is serialized at a relatively coarse-grained level, for example, at the interface between the application and an OS synchronous event demultiplexer, such as select() or WaitForMultipleObjects(). However, this coarse-grained level of concurrency can underutilize certain processing resources (such as multiple CPUs) and OS features (such as support for parallel DMA transfer to/from I/O devices) that are available on a host platform.

Iterative servers can also prevent clients from making progress while they are blocked waiting for a server to process their requests. Excessive server-side delays complicate application and middleware-level retransmission time-out calculations, which can trigger excessive network traffic. Depending on the types of protocols used to exchange requests between client and server, duplicate requests may also be received by a server.

Concurrent servers handle multiple requests from clients simultaneously, as shown in Figure (2). Depending on the OS and hardware platform, a concurrent server executes its services using either multiple threads or multiple processes. If the server is a single-service server, multiple copies of the same service can run simultaneously. If the server is a multiservice server, multiple copies of different services may also run simultaneously.

Concurrent servers are well suited for I/O-bound services and/or long-duration services that require variable amounts of time to execute. Unlike iterative servers, concurrent servers allow finer-grained synchronization techniques that serialize requests at an application-defined level. This design requires synchronization mechanisms, such as semaphores or mutex locks [EKB92], to ensure robust cooperation and data sharing between the processes and threads that run simultaneously. We examine these mechanisms in Chapter 6 and show examples of their use in Chapter 10.

As we'll see in Section 5.2, concurrent servers can be structured in various ways, for example, with multiple processes or threads. A common concurrent server design is thread-per-request, in which a master thread spawns a separate worker thread to perform each client request concurrently:

void master_thread()
{
  initialize listener endpoint(s)

  for (each new client request) {
    receive the request
    spawn new worker thread and pass request to this thread
  }
}

The master thread continues to listen for new requests, while the worker thread processes the client request, as follows:

void worker_thread()
{
  perform requested service
  if (response required) send response to client
  terminate thread
}

It's straightforward to modify this thread-per-request model to support other concurrent server models, such as thread-per-connection:

void master_thread()
{
  initialize listener endpoint(s)

  for (each new client connection) {
    accept connection
    spawn new worker thread and pass connection to this thread
  }
}

In this design, the master thread continues to listen for new connections, while the worker thread processes client requests from the connection, as follows:

void worker_thread()
{
  for (each request on the connection) {
    receive the request
    perform requested service
    if (response required) send response to client
  }
}

Thread-per-connection provides good support for prioritization of client requests. For instance, connections from high-priority clients can be associated with high-priority threads. Requests from higher-priority clients will therefore be served ahead of requests from lower-priority clients since the OS can preempt lower-priority threads.

Section 5.3 illustrates several other concurrent server models, such as thread pool and process pool.

Reactive servers process multiple requests virtually simultaneously, although all processing is actually performed in a single thread. Before multithreading was widely available on OS platforms, concurrent processing was often implemented via a synchronous event demultiplexing strategy, in which a single-threaded process handled multiple service requests in round-robin order. For instance, the standard X Window System server operates this way.

A reactive server can be implemented by explicitly time-slicing attention to each request via synchronous event demultiplexing mechanisms, such as select() and WaitForMultipleObjects() described in Chapter 6. The following pseudo-code illustrates the typical style of programming used in a reactive server based on select():

void reactive_server()
{
  initialize listener endpoint(s)

  // Event loop.
  for (;;) {
    select() on multiple endpoints for client requests
    for (each active client request) {
      receive the request
      perform requested service
      if (response required) send response to client
    }
  }
}

Although this server can service multiple clients over a period of time, it's fundamentally iterative from the server's perspective. Compared with applications that take advantage of full-fledged OS support for multithreading, therefore, applications developed using this technique have the following limitations:

  • Increased programming complexity. Certain types of networked applications, such as I/O-bound servers, are hard to program with a reactive server model. For example, developers are responsible for yielding the event loop thread explicitly and saving and restoring context information manually. For clients to perceive that their requests are being handled concurrently rather than iteratively, therefore, each request must execute for a relatively short duration. Likewise, long-duration operations, such as downloading large files, must be programmed explicitly as finite state machines that keep track of an object's processing steps while reacting to events for other objects. This design can become unwieldy as the number of states increases.

  • Decreased dependability and performance. An entire server process can hang if a single operation fails, for example, if a service goes into an infinite loop or hangs indefinitely in a deadlock. Moreover, even if the entire process doesn't fail, its performance will degrade if the OS blocks the whole process whenever one service calls a system function or incurs a page fault. Conversely, if only nonblocking methods are used, it can be hard to improve performance via advanced techniques, such as DMA, that benefit from locality of reference in data and instruction caches. As discussed in Chapter 6, OS multithreading mechanisms can overcome these performance limitations by automating preemptive and parallel execution of independent services running in separate threads. One way to work around these problems without going to a full-blown concurrent server solution is to use asynchronous I/O, which is described in Sidebar 11.

Sidebar 11: Asynchronous I/O and Proactive Servers

Yet another mechanism for handling multiple I/O streams in a single-threaded server is asynchronous I/O. This mechanism allows a server to initiate I/O requests via one or more I/O handles without blocking for completion. Instead, the OS notifies the caller when requests are complete, and the server can then continue processing on the completed I/O handles. Asynchronous I/O is available on the following OS platforms:

  • It's supported on Win32 via overlapped I/O and I/O completion ports [Ric97, Sol98].

  • Some POSIX-compliant platforms implement the aio_*() family of asynchronous I/O functions [POS95, Gal95].

Since asynchronous I/O isn't implemented as portably as multithreading or synchronous event demultiplexing, however, we don't consider it further in this book.

Asynchronous I/O is discussed in [SH] when we present the ACE Proactor framework, which implements the Proactor pattern [SSRB00]. This pattern allows event-driven applications to demultiplex and dispatch service requests efficiently when they are triggered by the completion of asynchronous operations, thereby achieving the performance benefits of concurrency without incurring certain of its liabilities. The ACE Proactor framework runs on Win32 and on POSIX-compliant platforms that support the aio_*() family of asynchronous I/O functions.

Logging service For simplicity, the initial implementation of our networked logging service in Chapter 4 used an iterative server design. Subsequent chapters extend the capabilities and scalability of our logging server as follows: Chapter 7 extends the server to show a reactive style, Chapter 8 illustrates a concurrent style using multiple processes, and Chapter 9 shows several concurrent designs using multiple threads.

