* Minor fixes for typos.
* Have execution_context::memory_resource() return a reference instead
of a pointer.
* Add execution_context::allocator().
* Add execution_context::pmr_memory_resource_type alias.
* Fill in requirements for execution_resource, execution_context,
affinity_query and this_system::resources().
File changed: affinity/cpp-20/d0796r1.md (78 additions, 52 deletions)
@@ -37,7 +37,7 @@ One strategy to improve applications' performance, given the importance of affin
37
37
38
38
Operating systems (OSes) traditionally take responsibility for assigning threads or processes to run on processing units. However, OSes may use high-level policies for this assignment that do not necessarily match the optimal usage pattern for a given application. Application developers must leverage the placement of memory and **placement of threads** for best performance on current and future architectures. For C++ developers to achieve this, native support for **placement of threads and memory** is critical for application portability. We will refer to this as the **affinity problem**.
39
39
40
-
The affinity problem is especially challenging for applications whose behavior changes over time or is hard to predict, or when different applications interfere with each other's performance. Today, most OSes already can group processing units according to their locality and distribute processes, while keeping threads close to the initial thread, or even avoid migrating threads and maintain first touch policy. Nevertheless, most pro grams can change their work distribution, especially in the presence of nested parallelism.
40
+
The affinity problem is especially challenging for applications whose behavior changes over time or is hard to predict, or when different applications interfere with each other's performance. Today, most OSes already can group processing units according to their locality and distribute processes, while keeping threads close to the initial thread, or even avoid migrating threads and maintain first touch policy. Nevertheless, most programs can change their work distribution, especially in the presence of nested parallelism.
41
41
42
42
Frequently, data is initialized at the beginning of the program by the initial thread and is used by multiple threads. While automatic thread migration has been implemented in some OSes, migration may have high overhead. In an optimal case, the OS may automatically detect which threads access which data most frequently, or it may replicate data which is read by multiple threads, or migrate data which is modified and used by threads residing on remote locality groups. However, the OS often does a reasonable job, if the machine is not overloaded, if the application carefully used first-touch allocation, and if the program does not change its behavior with respect to locality.
43
43
@@ -128,10 +128,10 @@ There are some additional challenges which we have been investigating but are no
@@ -200,110 +206,124 @@ There are some additional challenges which we have been investigating but are no
200
206
201
207
## Class `execution_resource`
202
208
203
-
The `execution_resource` class provides an abstraction over a system's hardware capable of memory allocation, execution of light weight exeution agents or both.
209
+
The `execution_resource` class provides an abstraction over a system's hardware capable of memory allocation, execution of light weight execution agents, or both. An `execution_resource` can represent further `execution_resource`s; these `execution_resource`s are said to be *members of* this `execution_resource`.
204
210
205
211
> [*Note:* The `execution_resource` is required to be implemented such that the underlying software abstraction is initialised when the `execution_resource` is constructed, maintained through reference counting and cleaned up on destruction of the final reference. *--end note*]
206
212
207
213
### `execution_resource` constructors
208
214
209
-
execution_resource() = delete;
215
+
execution_resource();
210
216
211
217
> [*Note:* An implementation of `execution_resource` is permitted to provide non-public constructors to allow other objects to construct them. *--end note*]
212
218
213
219
### `execution_resource` assignment
214
220
215
-
The `execution_resource` class is not `CopyConstructible` (C++Std [copyconstructible]).
The `execution_resource` class is not `Destructible` (C++Std [destructible]).
225
-
226
-
~execution_resource() = delete;
228
+
~execution_resource();
227
229
228
230
### `execution_resource` operations
229
231
230
232
size_t concurrency() const noexcept;
231
233
232
-
*Returns:*
234
+
*Returns:* The collective concurrency available to this resource. More specifically, the number of *threads of execution* collectively available to this `execution_resource` and any resources which are *members of* it, recursively.
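The recursive definition of `concurrency()` can be illustrated with a small sketch. Everything here is an assumption made for illustration: `toy_resource`, its `own_threads` field and `make_toy_system` are hypothetical stand-ins and are not part of the proposed `execution_resource` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for execution_resource (not the proposed class).
struct toy_resource {
    std::size_t own_threads;            // threads of execution owned directly
    std::vector<toy_resource> members;  // resources that are members of this one

    // Sums the threads of this resource and of its members, recursively.
    std::size_t concurrency() const noexcept {
        std::size_t total = own_threads;
        for (const toy_resource& m : members) {
            total += m.concurrency();
        }
        return total;
    }
};

// Builds an assumed topology: one package holding two cores,
// each core providing two threads of execution.
inline toy_resource make_toy_system() {
    toy_resource core{2, {}};
    toy_resource package{0, {core, core}};
    return toy_resource{0, {package}};
}

// The system level resource reports the sum over the whole hierarchy.
inline void check_toy_system() {
    assert(make_toy_system().concurrency() == 4);
}
```

A leaf resource reports only its own threads, while a resource with members reports the total for the hierarchy below it.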
*Returns:* The `execution_resource` which this resource is a *member of*.
245
243
246
244
std::string name() const noexcept;
247
245
248
-
*Returns:*
246
+
*Returns:* An implementation defined string.
249
247
250
248
bool can_place_memory() const noexcept;
251
249
252
-
*Returns:*
250
+
*Returns:* `true` if this resource is capable of allocating memory with affinity.
253
251
254
252
bool can_place_agent() const noexcept;
255
253
256
-
*Returns:*
254
+
*Returns:* `true` if this resource is capable of executing with affinity.
257
255
258
256
## Class `execution_context`
259
257
260
-
The `execution_context` class provides an abstraction for managing a number of light weight execution agents executing work on an `execution_resource` and any `execution_resource`s encapsulated by it.
258
+
The `execution_context` class provides an abstraction for managing a number of light weight execution agents executing work on an `execution_resource` and any `execution_resource`s encapsulated by it. The `execution_resource` which an `execution_context` encapsulates is referred to as the *contained resource*.
259
+
260
+
### `execution_context` types
261
+
262
+
using executor_type = see-below;
263
+
264
+
*Requires:* `executor_type` is an implementation defined class which satisfies the general executor requirements, as specified by P0443r5.
261
265
262
-
### `execution_context` member aliases
266
+
using pmr_memory_resource_type = see-below;
263
267
264
-
using executor_type = __unspecfied__;
268
+
*Requires:* `pmr_memory_resource_type` is an implementation defined class which inherits from `std::pmr::memory_resource`.
265
269
266
-
*Requires:*
270
+
using allocator_type = see-below;
271
+
272
+
*Requires:* `allocator_type` is an implementation defined allocator class.
*Returns:* A const-reference to the *contained resource*.
281
291
282
292
executor_type executor() noexcept;
283
293
284
-
*Returns:*
294
+
*Returns:* An executor of type `executor_type` capable of executing work with affinity to the *contained resource*.
295
+
296
+
*Throws:* An exception if `!this->resource().can_place_agent()`.
297
+
298
+
pmr::memory_resource &memory_resource() noexcept;
285
299
286
-
pmr::memory_resource *memory_resource() noexcept;
300
+
*Returns:* A reference to a polymorphic memory resource of type `pmr_memory_resource_type` capable of allocating with affinity to the *contained resource*.
287
301
288
-
*Returns:*
302
+
*Throws:* An exception if `!this->resource().can_place_memory()`.
303
+
304
+
allocator_type allocator() const;
305
+
306
+
*Returns:* An allocator of type `allocator_type` capable of allocating with affinity to the *contained resource*.
307
+
308
+
*Throws:* An exception if `!this->resource().can_place_memory()`.
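The relationship between `pmr_memory_resource_type`, `memory_resource()` and `allocator()` can be sketched with the standard `<memory_resource>` facilities. This is a minimal sketch under stated assumptions: `counting_resource` and `toy_context` are hypothetical stand-ins, the resource merely forwards to `std::pmr::new_delete_resource()` and counts bytes instead of placing memory with affinity, and `std::pmr::polymorphic_allocator<std::byte>` is only one possible choice for `allocator_type`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical stand-in for pmr_memory_resource_type. It forwards to the
// default new/delete resource and counts bytes; a real implementation would
// bind allocations to the contained resource instead.
class counting_resource : public std::pmr::memory_resource {
public:
    std::size_t bytes_allocated() const noexcept { return bytes_; }

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        bytes_ += bytes;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::size_t bytes_ = 0;
};

// Hypothetical stand-in for execution_context, reduced to its memory interface.
class toy_context {
public:
    using pmr_memory_resource_type = counting_resource;
    using allocator_type = std::pmr::polymorphic_allocator<std::byte>;  // assumed choice

    // Returned by reference, so callers never have to check for null.
    std::pmr::memory_resource& memory_resource() noexcept { return resource_; }

    // const, as in the declaration above; the resource is therefore mutable here.
    allocator_type allocator() const { return allocator_type{&resource_}; }

    std::size_t bytes_allocated() const noexcept { return resource_.bytes_allocated(); }

private:
    mutable counting_resource resource_;
};

// Usage: a container whose storage comes from the context's resource.
inline std::size_t allocate_through(toy_context& ctx, std::size_t n) {
    std::pmr::vector<int> v(&ctx.memory_resource());
    assert(v.get_allocator().resource() == &ctx.memory_resource());
    v.resize(n);
    return ctx.bytes_allocated();
}
```

Both access paths lead to the same underlying resource, so a container built from `allocator()` and one built from `&memory_resource()` allocate from the same place.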
289
309
290
310
## Class template `affinity_query`
291
311
292
312
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`.
293
313
294
-
### `affinity_query` member aliases
314
+
### `affinity_query` types
295
315
296
-
using native_affinity_type = __unspecfied__;
316
+
using native_affinity_type = see-below;
297
317
298
-
*Requires:*
318
+
*Requires:* `native_affinity_type` is an implementation defined integral type capable of storing a native affinity value.
299
319
300
-
using error_type = __unspecfied__;
320
+
using error_type = see-below;
301
321
302
-
*Requires:*
322
+
*Requires:* `error_type` is an implementation defined integral type capable of storing an error code value.
*Returns:* An `expected<size_t, error_type>` where,
348
+
* if the affinity query was successful, the value of type `size_t` represents the magnitude of the relative affinity;
349
+
* if the affinity query was not successful, the error is an error of type `error_type` which represents the reason the affinity query failed.
350
+
351
+
> [*Note:* An affinity query is permitted to fail if affinity between the two execution resources cannot be calculated for any reason, such as when the resources are from different vendors or communication between the resources is not possible. *--end note*]
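The value-or-error result described above can be approximated in C++17 without `expected`. The sketch below uses `std::variant` as a stand-in for `expected<size_t, error_type>`. All names are hypothetical: `toy_node`, `relative_affinity` and `no_communication` are invented for illustration, `error_type` is assumed to be `int`, and the magnitude is computed as a made-up node distance because the text does not define how it is derived.

```cpp
#include <cassert>
#include <cstddef>
#include <variant>

using error_type = int;  // assumption: the text only requires an integral type

// Stand-in for expected<size_t, error_type>: holds either the magnitude of
// the relative affinity or an error code.
using query_result = std::variant<std::size_t, error_type>;

constexpr error_type no_communication = 1;  // hypothetical error code

// Hypothetical resource description used only by this sketch.
struct toy_node {
    int vendor;        // resources from different vendors cannot be compared here
    std::size_t node;  // position used to derive a made-up magnitude
};

// Toy query: fails across vendors, otherwise reports the node distance.
inline query_result relative_affinity(const toy_node& a, const toy_node& b) {
    if (a.vendor != b.vendor) {
        return query_result{std::in_place_type<error_type>, no_communication};
    }
    const std::size_t distance = a.node > b.node ? a.node - b.node : b.node - a.node;
    return query_result{std::in_place_type<std::size_t>, distance};
}

// Usage: inspect which alternative is held before reading it.
inline bool query_succeeded(const query_result& r) {
    return std::holds_alternative<std::size_t>(r);
}
```

A caller checks `query_succeeded` first and then reads either the `std::size_t` magnitude or the `error_type` code, mirroring the two cases listed in the *Returns* clause.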
328
352
329
353
> [*Note:* The comparison operators rely on the availability of the `expected` class template (see [P0323r4: std::expected][p0323r4]), if this does not become available then an alternative error/value construct will be adopted instead. *--end note*]
330
354
331
355
## Free functions
332
356
357
+
The free function `this_system::resources` is provided for retrieving the `execution_resource`s which encapsulate the hardware platforms available within the system; these are referred to as the *system level resources*.
*Returns:* A std::vector containing all system level resources.
361
+
*Returns:* A std::vector containing all *system level resources*.
336
362
337
363
*Requires:* If `resources().size() > 0`, `resources()[0]` shall be the `execution_resource` corresponding to the current thread of execution. The value returned by `resources()` shall be the same at any point after the invocation of `main`.
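One way to satisfy the stability requirement is to discover the resources once and return copies of that snapshot on every call. The sketch below assumes this approach. The `toy_system` namespace, the `resource` struct and the names `"cpu0"` and `"gpu0"` are hypothetical, and the ordering (resource of the current thread first) is simply hard-coded rather than detected.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace toy_system {

// Hypothetical stand-in for execution_resource.
struct resource {
    std::string name;
};

// Discovery runs once, on first use; a function-local static is initialised
// exactly once, so every later call observes the same snapshot.
inline const std::vector<resource>& discovered() {
    static const std::vector<resource> all = {
        resource{"cpu0"},  // assumed to be the resource of the calling thread
        resource{"gpu0"},
    };
    return all;
}

// Mirrors the shape of this_system::resources(): returns a std::vector by value.
inline std::vector<resource> resources() {
    return discovered();
}

}  // namespace toy_system
```

Repeated calls return equal vectors, and index 0 always names the same resource.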