July 17, 2011, 2:47 p.m.
posted by naxellar
No Keyword Support for Parallelism in C++
The C++ language does not include any keyword primitives for parallelism. The C++ ISO standard is for the most part silent on the topic of multithreading. There is no way within the language to specify that two or more statements should be executed in parallel. Other languages use built-in parallelism as a selling feature. Bjarne Stroustrup, the inventor of the C++ language, had something else in mind. In Stroustrup's opinion:
It is possible to design concurrency support libraries that approach built-in concurrency support both in convenience and efficiency. By relying on libraries, you can support a variety of concurrency models, though, and thus serve the users that need those different models better than can be done by a single built-in concurrency model. I expect this will be the direction taken by most people and that the portability problems that arise when several concurrency-support libraries are used within the community can be dealt with by a thin layer of interface classes.
Furthermore, Stroustrup says, "I recommend parallelism be represented by libraries within C++ rather than as a general language feature." The authors have found Stroustrup's position and recommendation on parallelism as a library the most practical option. This book is only made possible because of the availability of high-quality libraries that can be used for parallel and distributed programming. The libraries that we use to enhance C++ implement national and international standards for parallelism and distributed programming and are used by thousands of C++ programmers worldwide.
1 The Options for Implementing Parallelism Using C++
Although there are special versions of C++ that implement parallelism, we present methods for implementing parallelism using the ISO (International Organization for Standardization) standard for C++. The library approach to parallelism is the most flexible. System libraries and user-level libraries can be used to support parallelism in C++. System libraries are those libraries provided by the operating system environment. For example, the POSIX threads library is a set of system calls that can be used in conjunction with C++ to support parallelism. The POSIX (Portable Operating System Interface) threads are part of the new Single UNIX Specification. The POSIX threads are included in IEEE Std 1003.1-2001. The Single UNIX Specification is sponsored by the Open Group and developed by the Austin Common Standards Revision Group. According to the Open Group, the Single UNIX Specification is:
Designed to give software developers a single set of APIs to be supported by every UNIX system.
Shifts the focus from incompatible UNIX system product implementations to compliance to a single set of APIs.
It is the codification and de jure standardization of the common core of UNIX system practice.
The basic objective is portability of both programmers and application source code.
The Single UNIX Specification Version 3 includes IEEE Std 1003.1-2001 and the Open Group Base Specifications Issue 6. The IEEE POSIX standards are now a formal part of the Single UNIX Specification and vice versa. There is now a single international standard for a portable operating system interface. C++ developers benefit because this standard contains APIs for creating threads and processes. Excluding instruction-level parallelism, dividing a program up into either threads or processes is the only way to achieve parallelism with C++. The new standard provides the tools to do this. The developer can use:
POSIX threads (also referred to as pthreads)
POSIX spawn function
the exec() family of functions
These are all supported by system API calls and system libraries. If an operating system complies with the Single UNIX Specification Version 3, then these APIs will be available to the C++ developer. These APIs are discussed in Chapters 3 and 4. They are used in many of the examples in this book. In addition to system-level libraries, user-level libraries that implement other international standards such as the MPI (Message Passing Interface), PVM (Parallel Virtual Machine), and CORBA (Common Object Request Broker Architecture) can be used to support parallelism with C++.
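As a rough illustration of the process-based option, the sketch below uses posix_spawn() to start a child process and waitpid() to collect its exit status. The helper name run_child and the choice of /bin/sh as the program to launch are assumptions made for this example; they are not taken from the book's later chapters.

```cpp
#include <spawn.h>      // posix_spawn
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFEXITED, WEXITSTATUS
#include <cassert>
#include <string>
#include <vector>

extern char **environ;  // the parent's environment, passed through to the child

// Hypothetical helper: runs `sh -c <command>` as a separate process and
// returns the child's exit status, or -1 if spawning or waiting fails.
int run_child(const std::string& command) {
    assert(!command.empty() && "command must not be empty");

    // posix_spawn expects a null-terminated array of mutable char*,
    // so keep writable copies of each argument alive for the call.
    std::string sh = "sh", flag = "-c", cmd = command;
    std::vector<char*> argv = { &sh[0], &flag[0], &cmd[0], nullptr };

    pid_t pid;
    if (posix_spawn(&pid, "/bin/sh", nullptr, nullptr, argv.data(), environ) != 0)
        return -1;  // the child was never created

    // The parent could do other work here; the child runs concurrently.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a system where /bin/sh exists, run_child("exit 7") would be expected to return 7. Any work the parent does between the spawn and the wait overlaps with the child, which is where the parallelism comes from.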
2 MPI Standard
The MPI is the standard specification for message passing. The MPI was designed for high performance on both massively parallel machines and on workstation clusters. This book uses the MPICH implementation of the MPI standard. MPICH is a freely available, portable implementation of MPI. MPICH provides the C++ programmer with a set of APIs and libraries that support parallel programming. The MPI is especially useful for SPMD (Single Program, Multiple Data) and MPMD (Multiple Program, Multiple Data) programming. The authors use the MPICH implementation of MPI on a 32-node cluster running Linux and an 8-node cluster running Solaris and Linux. Although C++ doesn't have parallel primitives built in, it can take advantage of powerful libraries such as MPICH that do support parallelism. This is one of the benefits of C++. It is designed for flexibility.
3 PVM: A Standard for Cluster Programming
The PVM is a software package that permits a heterogeneous collection of computers hooked together by a network to be used as a single large parallel computer. The overall objective of the PVM system is to enable a collection of computers to be used cooperatively for concurrent or parallel computation. A PVM library implementation supports:
Heterogeneity in terms of machines, networks, and applications
Explicit message-passing model
Multiprocessor support (MPP, SMP)
Translucent access to hardware (applications can either ignore or take advantage of hardware differences)
Dynamically configurable host pool (processors can be added and deleted at runtime and can include processor mixes)
The PVM is the easiest to use and most flexible environment available for basic parallel programming tasks that require the involvement of different types of computers running different operating systems. The PVM library is especially useful for several single processor systems that can be networked together to form a virtual parallel processor machine. We discuss techniques for using PVM with C++ in Chapter 6. The PVM is the de facto standard for implementing heterogeneous clusters and is freely available and widely used. The PVM has excellent support for MPMD (MIMD) and SPMD (SIMD) models of parallel programming. The authors use PVM for small- to medium-size parallel programming tasks and the MPI for larger, more complex tasks. PVM and MPI are both libraries that can be used with C++ to do cluster programming.
4 The CORBA Standard
CORBA is the standard for distributed cross-platform object-oriented programming. We mention CORBA here under parallelism because implementations of the CORBA standard can be used to develop multiagent systems. Multiagent systems offer important models of peer-to-peer distributed programming. Multiagent systems can work concurrently. This is one of the areas where parallel programming and distributed programming overlap. Although the agents are executing on different computers, they are executing during the same time period, working cooperatively on a common problem. The CORBA standard provides an open, vendor-independent architecture and infrastructure that computer applications use to work together over networks. Using the standard protocol IIOP, a CORBA-based program from any vendor, on almost any computer, operating system, programming language, and network, can interoperate with a CORBA-based program from the same or another vendor on almost any other computer, operating system, programming language, and network. In this book we use the MICO implementation. MICO is a freely available and fully compliant implementation of the CORBA standard. MICO supports C++.
5 Library Implementations Based on Standards
MPICH, PVM, MICO, and POSIX threads are each library implementations based on standards. This means that software developers can rely on these implementations to be widely available and portable across multiple platforms. These libraries are freely available and used by software developers worldwide. The POSIX threads library can be used with C++ to do multithreaded programming. If the program is running on a computer that has multiple processors, then each thread can possibly run on a separate processor and thereby execute in parallel with the others. If only a single processor is available, then the illusion of parallelism is provided and concurrency is achieved through the process of context switching. POSIX threads are perhaps the easiest way to introduce parallelism within a C++ program. Whereas the MPICH, PVM, and MICO libraries will have to be downloaded or obtained (they are readily available), any operating system environment that is compliant with the POSIX standard or the new Single UNIX Specification Version 3 will have a POSIX threads implementation. Each library offers a slightly different model of parallelism. Figure shows how each library can be used with C++.
MPICH: Supports large-scale, complex cluster programming. Strong support for SPMD model. Also supports SMP, MPP, and multiuser configurations.
PVM: Supports cluster programming of heterogeneous environments. Easy to use for single-user, small to medium cluster applications. Also supports MPP.
MICO: Supports either distributed or object-oriented parallel programming. Contains nice support for agent and multiagent programming.
POSIX threads: Supports parallel processing within a single application at the function or object level. Can be used to take advantage of SMP or MPP.
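To make the POSIX threads option more concrete, here is a minimal sketch of function-level parallelism: a vector is split into slices and each slice is summed by its own thread. The names Slice, sum_slice, and parallel_sum are hypothetical and exist only for this illustration. Whether the threads actually run on separate processors depends on the machine and the scheduler.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical work descriptor: each thread sums one slice of the input.
struct Slice {
    const int* data;
    std::size_t count;
    long sum;
};

// Thread entry point; pthread_create requires this void* (*)(void*) shape.
extern "C" void* sum_slice(void* arg) {
    Slice* s = static_cast<Slice*>(arg);
    s->sum = 0;
    for (std::size_t i = 0; i < s->count; ++i) s->sum += s->data[i];
    return nullptr;
}

// Sums `values` using up to `num_threads` POSIX threads.
long parallel_sum(const std::vector<int>& values, std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<Slice> slices(num_threads);
    std::vector<pthread_t> threads(num_threads);
    std::vector<bool> started(num_threads, false);
    const std::size_t chunk = values.size() / num_threads;

    for (std::size_t i = 0; i < num_threads; ++i) {
        const std::size_t begin = i * chunk;
        // The last slice also takes any leftover elements.
        const std::size_t end = (i + 1 == num_threads) ? values.size() : begin + chunk;
        slices[i] = { values.data() + begin, end - begin, 0 };
        started[i] = pthread_create(&threads[i], nullptr, sum_slice, &slices[i]) == 0;
        if (!started[i]) sum_slice(&slices[i]);  // fall back to running inline
    }

    long total = 0;
    for (std::size_t i = 0; i < num_threads; ++i) {
        if (started[i]) pthread_join(threads[i], nullptr);  // wait for the slice
        total += slices[i].sum;
    }
    return total;
}
```

Each thread writes only to its own Slice, so no mutex is needed in this sketch. On some toolchains the program has to be built with the -pthread option.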
Whereas languages that depend on built-in support for parallelism are restricted to the models supplied, the C++ developer is free to mix and match parallel programming models. As the nature of the application changes, a C++ developer can select different libraries to match the scenario.