affinity/cpp-20/d0796r1.md
Lines changed: 3 additions & 2 deletions
@@ -257,9 +257,9 @@ In some systems, hardware can be attached to the system while the program is exe
Other applications, such as those designed for safety-critical environments, require the ability to recover from hardware failures. This requires that the resources available within a system can be queried and can be expected to change at any point during the execution of a program. For example, a GPU may encounter exceptional behaviour or overheat and need to be disabled, yet the program must continue at all costs. **Fault tolerance** allows programs to query the availability of resources and handle failures, which could facilitate reliable programming of heterogeneous and distributed systems.
-
From a historical perspective, many different programming models have tackled the problem of **dynamic resource discovery** following various approaches. [MPI (Message Passing Interface)][mpi] originally (in MPI-1) did not support it, all processes which were capable of communicating with each other would be identified and fixed at the **point of discovery**. [PVM (Parallel Virtual Machine)][pvm] enabled resources to be discovered at runtime since its conception, using an alternative execution model of manually spawning processes from the main process. This led MPI to introduce the feature later, in MPI-2. However, as far as we know, despite being available, this feature is not widely used in HPC environments, and the execution model of having all processes fixed on initialisation is generally still the preferred approach. Other programming models for HPC environments, such as SHMEM, Fortran coarrays and UPC++, support a fixed set of processors at library initialisation time.
+
From a historical perspective, many different programming models have tackled the problem of **dynamic resource discovery** following various approaches. [MPI (Message Passing Interface)][mpi] originally (in MPI-1) did not support **dynamic resource discovery**. All processes which were capable of communicating with each other would be identified and fixed at the **point of discovery**. [PVM (Parallel Virtual Machine)][pvm] enabled resources to be discovered at runtime since its conception, using an alternative execution model of manually spawning processes from the main process. This led MPI to introduce the feature later, in MPI-2. However, as far as we know, despite being available, this feature is not widely used in HPC environments, and the execution model of having all processes fixed on initialisation is generally still the preferred approach. Other programming models for HPC environments, such as SHMEM, Fortran coarrays and UPC++, support a fixed set of processors at library initialisation time.
-
Some of these programming models also address **fault tolerance**; in particular, PVM has native support for this. Whilst MPI does not have native support, a PVM-like **fault tolerance** mechanism can be [implemented on top of MPI][mpi-post-failure-recovery] or provided via [extensions][mpi-fault-tolerance].
+
Some of these programming models also address **fault tolerance**; in particular, PVM has native support for this, providing a [mechanism][pvm-callback] which can notify a program when a resource is added to or removed from a system. Whilst MPI does not have native support, a PVM-like **fault tolerance** mechanism can be [implemented on top of MPI][mpi-post-failure-recovery] or provided via [extensions][mpi-fault-tolerance].
Due to the complexity involved in standardising **dynamic resource discovery** and **fault tolerance**, these are currently out of the scope of this paper.
@@ -407,6 +407,7 @@ Euro-Par 2011 Parallel Processing: 17th International