CP013: Apply proposals for relative affinity and allocation.
* Introduce the affinity_query class, with operators.
* Add the execution_context::memory_resource() member function.
* Change this_system::resource() to this_system::resources() and have it
return a vector of execution_resource instead.
* Add requirements on this_system::resources().
* Add capability to migrate memory and work between resources to list of
problem space challenges.
* Add minor reformatting to proposed wording.
* Introduce interface for querying the relative affinity between execution resources.
+* Introduce `execution_context::memory_resource` returning a `pmr::memory_resource` to allow affinity-based allocation.
* Have `execution_context::resource()` and `this_system::resource()` return a `const execution_resource &`.
+* Make `execution_resource` copyable and movable.
+* Rename `this_system::resource()` to `this_system::resources()` and have it return `vector<execution_resource>`.
# Abstract
@@ -33,7 +37,7 @@ One strategy to improve applications' performance, given the importance of affin
Operating systems (OSes) traditionally take responsibility for assigning threads or processes to run on processing units. However, OSes may use high-level policies for this assignment that do not necessarily match the optimal usage pattern for a given application. Application developers must leverage the placement of memory and **placement of threads** for best performance on current and future architectures. For C++ developers to achieve this, native support for **placement of threads and memory** is critical for application portability. We will refer to this as the **affinity problem**.
-The affinity problem is especially challenging for applications whose behavior changes over time or is hard to predict, or when different applications interfere with each other's performance. Today, most OSes already can group processing units according to their locality and distribute processes, while keeping threads close to the initial thread, or even avoid migrating threads and maintain first touch policy. Nevertheless, most programs can change their work distribution, especially in the presence of nested parallelism.
+The affinity problem is especially challenging for applications whose behavior changes over time or is hard to predict, or when different applications interfere with each other's performance. Today, most OSes already can group processing units according to their locality and distribute processes, while keeping threads close to the initial thread, or even avoid migrating threads and maintain first touch policy. Nevertheless, most programs can change their work distribution, especially in the presence of nested parallelism.
Frequently, data is initialized at the beginning of the program by the initial thread and is used by multiple threads. While automatic thread migration has been implemented in some OSes, migration may have high overhead. In an optimal case, the OS may automatically detect which threads access which data most frequently, replicate data that is read by multiple threads, or migrate data that is modified and used by threads residing on remote locality groups. However, the OS often does a reasonable job only if the machine is not overloaded, the application carefully uses first-touch allocation, and the program does not change its behavior with respect to locality.
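As a rough illustration of the first-touch allocation mentioned above, the sketch below lets each worker thread write its own slice of a freshly allocated buffer before any other thread touches it. The function name `first_touch_fill` and its parameters are hypothetical and not part of the proposal. Whether pages actually end up near the writing thread depends on the OS placement policy, which standard C++ cannot control.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: allocate n doubles and let `workers` threads each
// perform the first write to their own contiguous slice.
std::unique_ptr<double[]> first_touch_fill(std::size_t n, unsigned workers, double value) {
    assert(workers > 0);
    // new double[n] default-initialises, so no element is written here.
    std::unique_ptr<double[]> data(new double[n]);
    double* const raw = data.get();

    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        pool.emplace_back([raw, begin, end, value] {
            // First write to this slice happens on the worker thread.
            for (std::size_t i = begin; i < end; ++i) raw[i] = value;
        });
    }
    for (auto& t : pool) t.join();
    return data;
}
```

A later parallel phase would ideally reuse the same slice boundaries, so that each thread keeps working on the data it touched first.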
@@ -93,6 +97,7 @@ In this paper we describe the problem space of affinity for C++, the various cha
* How to represent, identify and navigate the topology of execution resources available within a heterogeneous or distributed system.
* How to query and measure the relative affinity between different execution resources within a system.
* How to bind execution and allocation to particular execution resource(s).
+* How to migrate work and memory allocations between execution resources.
* What kind of and level of interface(s) should be provided by C++ for affinity.
Wherever possible, we also evaluate how an affinity-based solution could be scaled to support both distributed and heterogeneous systems.
@@ -112,19 +117,20 @@ There are some additional challenges which we have been investigating but are no
@@ -167,55 +202,56 @@ There are some additional challenges which we have been investigating but are no
The `execution_resource` class provides an abstraction over a system's hardware capable of memory allocation, execution of lightweight execution agents, or both.
-### `execution_resource` constructors
+> [*Note:* The `execution_resource` is required to be implemented such that the underlying software abstraction is initialised when the `execution_resource` is constructed, maintained through reference counting, and cleaned up on destruction of the final reference. *--end note*]
-execution_resource() = delete;
+### `execution_resource` constructors
+execution_resource() = delete;
-[*Note:* An implementation of `execution_resource` is permitted to provide non-public constructors to allow other objects to construct them. *--end note*]
+> [*Note:* An implementation of `execution_resource` is permitted to provide non-public constructors to allow other objects to construct them. *--end note*]
### `execution_resource` assignment
The `execution_resource` class is not `CopyConstructible` (C++Std [copyconstructible]).
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`.
> [*Note:* The comparison operators rely on the availability of the `expected` class template (see [P0323r4: std::expected][p0323r4]). If this does not become available, an alternative error/value construct will be adopted instead. *--end note*]
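To make the idea of a relative affinity value more concrete, here is a minimal sketch of how such a query might look. Everything in it is an assumption made for illustration: the enumerator names, the stand-in `execution_resource` with a NUMA node index, the toy distance formula, and the `native_affinity()` accessor are not taken from the proposed wording. The comparison operator returns a plain `bool` rather than an `expected`-based result.

```cpp
#include <cassert>
#include <cstdlib>

namespace query_sketch {

// Assumed enumerators; this excerpt names the types but not their values.
enum class affinity_operation { read, write };
enum class affinity_metric { latency, bandwidth };

// Stand-in for execution_resource holding a hypothetical NUMA node index.
struct execution_resource {
    int numa_node;
};

// Toy query: the "affinity" is the distance between node indices, so a
// smaller value means the two resources are closer. Op and Metric only tag
// the type here so that unrelated queries cannot be compared by accident.
template <affinity_operation Op, affinity_metric Metric>
class affinity_query {
public:
    affinity_query(const execution_resource& a, const execution_resource& b)
        : value_(std::abs(a.numa_node - b.numa_node)) {}

    int native_affinity() const { return value_; }

    friend bool operator<(const affinity_query& lhs, const affinity_query& rhs) {
        return lhs.value_ < rhs.value_;
    }

private:
    int value_;
};

}  // namespace query_sketch
```

Because the operation and metric are template parameters, only queries made with the same pair are comparable, which matches the idea that a relative value is meaningful only against others derived the same way.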
*Returns:* A `std::vector` containing all system-level resources.
+
+*Requires:* If `resources().size() > 0`, `resources()[0]` shall be the `execution_resource` corresponding to the current thread of execution. The value returned by `resources()` shall be the same at any point after the invocation of `main`.
+
+> [*TODO:* Returning a `std::vector` allows users to potentially manipulate the container of `execution_resource`s after it is returned; we may want to replace this with a more restrictive alternative type at a later date. *--end TODO*]
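The two requirements on `this_system::resources()` can be sketched with a mock. The `execution_resource` stand-in, its `name` member, and the hard-coded list are hypothetical, since a real implementation would discover resources from the platform. The sketch only shows one way the "same value on every call" and "element 0 is the current thread's resource" conditions could be met, and what returning by value implies for callers.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace system_sketch {

// Stand-in type; the `name` member exists only for this illustration.
struct execution_resource {
    std::string name;
};

namespace this_system {

// The list is built once, so every call observes the same value. Callers
// receive a copy and cannot alter what later calls return.
inline std::vector<execution_resource> resources() {
    static const std::vector<execution_resource> all = {
        {"resource-of-current-thread"},  // kept at index 0 by construction
        {"other-resource"},
    };
    assert(!all.empty());
    return all;
}

}  // namespace this_system
}  // namespace system_sketch
```

A caller may freely clear or reorder its own copy, which is the kind of manipulation the TODO above is concerned about, but a subsequent call still returns the original list.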
+
## Querying a System’s Topology
The first task in allowing C++ applications to leverage memory locality is to provide the ability to query a **system** for its **resource topology** (commonly represented as a tree or graph) and traverse its **execution resources**.
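As a small illustration of traversing a tree-shaped resource topology, the sketch below models each resource as a node with children and walks it depth-first. The `name` and `children` members, the helper names, and the two-node example system are all hypothetical and stand in for whatever navigation interface the proposal ends up specifying.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace topo_sketch {

// Stand-in node type: a name plus the resources nested beneath it.
// (A std::vector of the enclosing type is well-formed since C++17.)
struct execution_resource {
    std::string name;
    std::vector<execution_resource> children;
};

// Pre-order depth-first walk: visit a resource, then each of its children.
void collect_names(const execution_resource& node, std::vector<std::string>& out) {
    out.push_back(node.name);
    for (const auto& child : node.children) collect_names(child, out);
}

// Hypothetical machine: two NUMA nodes with two cores each.
execution_resource make_example_system() {
    execution_resource numa0{"numa0", {}};
    numa0.children.push_back({"core0", {}});
    numa0.children.push_back({"core1", {}});

    execution_resource numa1{"numa1", {}};
    numa1.children.push_back({"core2", {}});
    numa1.children.push_back({"core3", {}});

    execution_resource system{"system", {}};
    system.children.push_back(numa0);
    system.children.push_back(numa1);
    return system;
}

}  // namespace topo_sketch
```

A graph-shaped topology would need a visited set in addition to this recursion; the tree case is shown because it is the simpler of the two representations mentioned above.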