Merge pull request #25 from AerialMantis/d0796r1-dynamic-process-management

AerialMantis · web-flow · commit e935e4d0b888 · 2018-02-06T17:56:24.000Z
CP013: Add discussion of dynamic process management in PVM and MPI
diff --git a/affinity/cpp-20/d0796r1.md b/affinity/cpp-20/d0796r1.md
@@ -245,21 +245,27 @@ for (int i = 0; i < resource.partition_size(); i++) {
 | Should the interface provide a way of creating an execution context from an execution resource? |
 | *Is what is defined here a suitable solution?* |
 
-### Importance of topology discovery
+### Topology discovery and fault tolerance
 
-For traditional single CPU systems the execution resources reasoned about using standard constructs such as std::thread, std::this_thread and thread local storage. This is because the C++ memory model requires that  a system have **at least one thread of execution, some memory and some I/O capabilities**. This means that for these systems some assumptions can be made about the topology could be made during at compile-time, for example the fact that developers can query always the hardware concurrency available as there is always at least 1 thread or the fact that you can always use thread local storage.
+In traditional single-CPU systems the execution resources can be reasoned about using standard constructs such as `std::thread`, `std::this_thread` and `thread_local`. This is because the C++ machine model requires that a system have **at least one thread of execution, some memory and some I/O capabilities**. This means that for these systems some assumptions about the system resource topology can be made as part of the language and its supporting standard library. For example, developers can always query the available hardware concurrency, as there is always at least one thread, and can always use thread-local storage.
 
-This assumption, however, does not hold on newer more complex systems, and is particularly false in heterogeneous systems. In these systems, the even the available high level resources such as the number and type of devices available in a particular **system** is not known until the **system’s resource topology** has been discovered which often happens as part of a runtime API [19] [20]. Furthermore the level of support these for querying the resource topology these devices may vary. This means the previous assumption that you can query thread concurrency at any stage of the program or the availability of a **std::thread** with local storage is no longer valid: Different devices may have different capabilities.
+This assumption, however, does not hold on newer, more complex systems, and is particularly false in heterogeneous systems. In these systems, even the availability of high-level resources in a particular **system** (the type and number of resources) is not known until the physical hardware attached to that system has been identified by the program. This often happens as part of a runtime initialisation API [19] [20] which exposes the available resources through some software abstraction. Furthermore, the resources which are identified often have different levels of parallel and concurrent execution capabilities. This process of identifying resources and their capabilities is often referred to as **topology discovery**, and the point at which it occurs as the **point of discovery**.
 
-An interesting question which arises here is whether the system topology of an execution resource should be fixed on initialisation or allowed to be dynamic. Allowing a dynamic system topology allows components to go offline and become unavailable at runtime. If we do allow the system topology to be dynamic then we will need to provide a mechanism by which users can be notified of a topology change. However, providing this interface is out of the scope of this initial document.
+An interesting question which arises here is whether the **system resource topology** should be fixed at the **point of discovery** or be allowed to be dynamic and alter during the course of the program. We can identify two main reasons for allowing the **system resource topology** to be dynamic after the **point of discovery**: **online resource discovery** and **fault tolerance**.
 
-Note that this is different from devices that go online or offline during execution: The devices themselves are online, they have not been found (or used) by the program until the appropriate discovery stage has been executed.
+In some systems, hardware can be attached to the system while the program is executing, for example, a [USB-compute device][movidius] that can be plugged in while the application is running to add computational power, or remote hardware connected over a network that can be enabled for specific periods of time. Supporting **online resource discovery** allows programs to target these situations natively and to react to changes in the resources available to a system.
+
+Other applications, such as those designed for safety-critical environments, require the ability to recover from hardware failures. This requires that the resources available within a system can be queried and can be expected to change at any point during the execution of a program. For example, a GPU may encounter exceptional behaviour or overheat and need to be disabled, yet the program must continue at all costs. **Fault tolerance** allows programs to query the availability of resources and handle failures, which could facilitate reliable programming of heterogeneous and distributed systems.
+
+From a historical perspective, many programming models have tackled the problem of **dynamic resource discovery**, following various approaches. [MPI (Message Passing Interface)][mpi] originally (in MPI-1) did not support **dynamic resource discovery**: all processes capable of communicating with each other were identified and fixed at the **point of discovery**. [PVM (Parallel Virtual Machine)][pvm] has enabled resources to be discovered at runtime since its inception, using an alternative execution model of manually spawning processes from the main process. This led MPI to introduce the feature later, in MPI-2. However, as far as we know, despite being available this feature is not widely used in HPC environments, and the execution model of having all processes fixed on initialisation is generally still the preferred approach. Other programming models for HPC environments, such as SHMEM, Fortran coarrays and UPC++, support a fixed set of processors at library initialisation time.
+
+Some of these programming models also address **fault tolerance**. In particular, PVM has native support for this, providing a [mechanism][pvm-callback] which can notify a program when a resource is added to or removed from a system. MPI does not have native support for **fault tolerance**, but a PVM-like mechanism can be [implemented on top of MPI][mpi-post-failure-recovery] or provided via [extensions][mpi-fault-tolerance].
+
+Due to the complexity involved in standardising **dynamic resource discovery** and **fault tolerance**, these are currently out of the scope of this paper.
 
 | Straw Poll |
 |------------|
-| Should the interface allow a system’s resource topology to be updated dynamically after initial initialisation? |
-| *When do we enable the device discovery process? Can we change the system topology after executors have been created?* |
-| *Should be provide an interface for providing a call-back on topology change?* |
+| Should the interface support **dynamic resource discovery**? |
 
 ### Lifetime considerations
 
@@ -399,3 +405,10 @@ Euro-Par 2011 Parallel Processing: 17th International
 
 [22] Portable Hardware Locality Istopo
 https://www.open-mpi.org/projects/hwloc/lstopo/
+
+[pvm]: http://www.csm.ornl.gov/pvm/
+[pvm-callback]: http://etutorials.org/Linux+systems/cluster+computing+with+linux/Part+II+Parallel+Programming/Chapter+11+Fault-Tolerant+and+Adaptive+Programs+with+PVM/11.2+Building+Fault-Tolerant+Parallel+Applications/
+[mpi]: http://mpi-forum.org/docs/
+[mpi-fault-tolerance]: http://www.mcs.anl.gov/~lusk/papers/fault-tolerance.pdf
+[mpi-post-failure-recovery]: http://journals.sagepub.com/doi/10.1177/1094342013488238
+[movidius]: https://developer.movidius.com/