Synchronization Mechanisms

Synchronization Mechanisms

As described in Sections 6.2 and 6.3, OS platforms provide rudimentary synchronization mechanisms that allow processes and threads to wait for other processes and threads to finish. These mechanisms are sufficient for relatively coarse-grained applications that execute multiple independent execution paths concurrently. Many concurrent applications require more fine-grained synchronization mechanisms, however, to allow processes and threads to coordinate their execution order and the order that they access shared resources, such as files, database records, network devices, object members, and shared memory. Access to these resources must be synchronized to prevent race conditions.

A race condition can occur when the execution order of two or more concurrent threads yields unpredictable and erroneous results. One way to prevent race conditions is to use synchronization mechanisms that serialize access to critical sections of code that access shared resources. Common OS synchronization mechanisms include mutexes, readers/writer locks, semaphores, and condition variables.

To illustrate the need for these mechanisms, consider the following addition to the Iterative_Logging_Server::handle_data() method defined in Section 4.4.3 on page 94.

typedef u_long COUNTER;

// File scope global variable.
static COUNTER request count;

// ...

  virtual int handle_data (ACE_SOCK_Stream *) {
    while (logging_handler_.log_record () != -1)
      // Keep track of number of requests.

               "request_count = %d\n", request_count));
    logging_handler_.close ();
    return 0;

The handle_data() method waits for log records to arrive from a client. When a record arrives, it's removed and processed. The request_count variable keeps track of the number of incoming client log records.

The code above works fine as long as handle_data() executes in just one thread in a process. Incorrect results can occur on many platforms, however, when multiple threads execute handle_data() simultaneously in the same process. The problem is that the code's not thread-safe due to race conditions on the global variable request_count. In particular, different threads can increment and print "stale" versions of the request_count variable on platforms where

  • Auto-increment operations compile into multiple assembly language load, add, and store assignments and/or

  • Total memory ordering is not enforced by the hardware.

Operating systems offer a range of synchronization mechanisms to ensure correct results in this and similar situations. The following are the most common synchronization mechanisms.

1 Mutual Exclusion (Mutex) Locks

Mutual exclusion (mutex) locks can be used to protect the integrity of a shared resource that's accessed concurrently by multiple threads. A mutex serializes the execution of multiple threads by defining a critical section of code that can be executed by only one thread at a time. Mutex semantics are "bracketed"; that is, the thread owning a mutex is responsible for releasing it. This semantic simplicity helps ensure efficient mutex implementations, for example, via hardware spin locks.[2]

[2] Although spin locks have low individual overhead they can consume excessive CPU resources if a thread must busy wait a long period of time for a particular condition to occur.

There are two general types of mutexes:

  • Nonrecursive mutex that will deadlock or otherwise fail if the thread currently owning the mutex tries to reacquire it without first releasing it and

  • Recursive mutex that will allow the thread owning the mutex to acquire it multiple times without self-deadlock, as long as the thread ultimately releases the mutex the same number of times.

2 Readers/Writer Locks

A readers/writer lock allows access to a shared resource by either

  • Multiple threads reading a resource concurrently without modifying it or

  • Only one thread at a time modifying the resource and excluding all other threads from both read and write access.

This type of lock can improve the performance of concurrent applications in which resources protected by a readers/writer lock are read from much more frequently than they are written to. Readers/writer locks can be implemented to give preference either to readers or to writers [BA90].

Readers/writer locks share some properties with mutexes; for example, the thread that acquires a lock must also release it. When a writer wishes to acquire the lock, the thread waits for all other lock holders to release the lock, then the writer thread gains an exclusive hold on the lock. Unlike mutexes, however, multiple threads may acquire a readers/writer lock for reading simultaneously.

3 Semaphore Locks

A semaphore is conceptually a non-negative integer that can be incremented and decremented atomically. A thread blocks when it tries to decrement a semaphore whose value equals 0. A blocked thread will be released only after another thread "posts" the semaphore, causing its count to become greater than 0.

Semaphores retain state to keep track of the count and the number of blocked threads. They are often implemented using sleep locks, which trigger a context switch that allows other threads to execute. Unlike mutexes, they need not be released by the same thread that acquired them initially. This capability allows semaphores to be used in a broader range of execution contexts, such as signal handlers or interrupt handlers.

4 Condition Variables

A condition variable provides a different flavor of synchronization than a mutex, readers/writer, or semaphore lock. The latter three mechanisms make other threads wait while the thread holding the lock executes code in a critical section. In contrast, a thread can use a condition variable to coordinate and schedule its own processing.

For example, a condition variable can make itself wait until a condition expression involving data shared by other threads attains a particular state. When a cooperating thread indicates that the shared data state has changed, a thread that's blocked on a condition variable is awakened. The newly awakened thread then reevaluates its condition expression and can resume its processing if the shared data has attained the desired state. Since arbitrarily complex condition expressions can be waited for using a condition variable, they permit more complex scheduling decisions than the other synchronization mechanisms outlined above.

Condition variables provide the core synchronization mechanisms for advanced concurrency patterns, such as Active Object and Monitor Object [SSRB00], and intraprocess communication mechanisms, such as synchronized message queues. Pthreads and UNIX International (UI) threads (implemented on Solaris) support condition variables natively, but other platforms, such as Win32 and many real-time operating systems, do not.

Sidebar 12 evaluates the relative performance characteristics of the OS synchronization mechanisms outlined above. You should generally try to use the most efficient synchronization mechanism that provides the semantics you require. When efficiency is important, look up the reference information for platforms targeted by your application. You can tune your application to take advantage of platform differences by varying the ACE synchronization classes. Moreover, your code can remain portable by using the Strategized Locking pattern [SSRB00] and conforming to the method signature of the ACE_LOCK* pseudo-class described on page 209 in Section 10.1.

Sidebar 12: Evaluating Synchronization Mechanism Performance

Operating systems provide a range of synchronization mechanisms to handle the needs of different applications. Although the performance of the synchronization mechanisms discussed in this chapter varies according to OS implementations and hardware, the following are general issues to consider:

  • Condition variables and semaphores often have higher overhead than mutexes since their Implementations are more complicated. Native OS mechanisms nearly always perform better than user-created substitutes, however, since they can take advantage of internal OS characteristics and hardware-specific tuning.

  • Mutexes generally have lower overhead than readers/writer locks because they don't need to manage multiple walter sets. Multiple reader threads can proceed in parallel, however, so readers/writer locks can scale better on multiprocessors when they are used to access data that are read much more than they are updated.

  • Nonrecursive mutexes are more efficient than recursive mutexes. Moreover, recursive mutexes can cause subtle bugs if programmers forget to release them [But97], whereas nonrecursive mutexes reveal these problems by deadlocking immediately.

     Python   SQL   Java   php   Perl 
     game development   web development   internet   *nix   graphics   hardware 
     telecommunications   C++ 
     Flash   Active Directory   Windows