Commit d8379cb

author
Gordon Brown
committed
CP013: Apply proposals for relative affinity and allocation.
* Introduce the affinity_query class, with operators.
* Add the execution_context::memory_resource() member function.
* Change this_system::resource() to this_system::resources() and have it return a vector of execution_resource instead.
* Add requirements on this_system::resources().
* Add capability to migrate memory and work between resources to list of problem space challenges.
* Add minor reformatting to proposed wording.
1 parent 79ebbf0 commit d8379cb

1 file changed: +119 -34 lines changed

affinity/cpp-20/d0796r1.md

Lines changed: 119 additions & 34 deletions
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,11 @@
1717
### Revision 1
1818

1919
* Introduce proposed wording.
20+
* Introduce interface for querying the relative affinity between execution resources.
21+
* Introduce `execution_context::memory_resource` returning a `pmr::memory_resource` to allow affinity-based allocation.
2022
* Have `execution_context::resource()` and `this_system::resource()` return a `const execution_resource &`.
23+
* Make `execution_resource` copy-able and move-able.
24+
* Rename `this_system::resource()` to `this_system::resources()` and have it return `vector<execution_resource>`.
2125

2226
# Abstract
2327

@@ -33,7 +37,7 @@ One strategy to improve applications' performance, given the importance of affin
3337

3438
Operating systems (OSes) traditionally take responsibility for assigning threads or processes to run on processing units. However, OSes may use high-level policies for this assignment that do not necessarily match the optimal usage pattern for a given application. Application developers must leverage the placement of memory and **placement of threads** for best performance on current and future architectures. For C++ developers to achieve this, native support for **placement of threads and memory** is critical for application portability. We will refer to this as the **affinity problem**.
3539

36-
The affinity problem is especially challenging for applications whose behavior changes over time or is hard to predict, or when different applications interfere with each other's performance. Today, most OSes already can group processing units according to their locality and distribute processes, while keeping threads close to the initial thread, or even avoid migrating threads and maintain first touch policy. Nevertheless, most programs can change their work distribution, especially in the presence of nested parallelism.
40+
The affinity problem is especially challenging for applications whose behavior changes over time or is hard to predict, or when different applications interfere with each other's performance. Today, most OSes already can group processing units according to their locality and distribute processes, while keeping threads close to the initial thread, or even avoid migrating threads and maintain first touch policy. Nevertheless, most programs can change their work distribution, especially in the presence of nested parallelism.
3741

3842
Frequently, data is initialized at the beginning of the program by the initial thread and is used by multiple threads. While automatic thread migration has been implemented in some OSes, migration may have high overhead. In an optimal case, the OS may automatically detect which threads access which data most frequently, or it may replicate data which is read by multiple threads, or migrate data which is modified and used by threads residing on remote locality groups. However, the OS often does a reasonable job, if the machine is not overloaded, if the application carefully used first-touch allocation, and if the program does not change its behavior with respect to locality.
3943

@@ -93,6 +97,7 @@ In this paper we describe the problem space of affinity for C++, the various cha
9397
* How to represent, identify and navigate the topology of execution resources available within a heterogeneous or distributed system.
9498
* How to query and measure the relative affinity between different execution resources within a system.
9599
* How to bind execution and allocation to particular execution resource(s).
100+
* How to migrate work and memory allocations between execution resources.
96101
* What kind of and level of interface(s) should be provided by C++ for affinity.
97102
98103
Wherever possible, we also evaluate how an affinity based solution could be scaled to support both distributed and heterogeneous systems.
@@ -112,19 +117,20 @@ There are some additional challenges which we have been investigating but are no
112117
113118
/* Execution resource */
114119
115-
struct execution_resource {
120+
class execution_resource {
121+
public:
116122
117123
execution_resource() = delete;
118-
execution_resource(const execution_resource &) = delete;
119-
execution_resource(execution_resource &&) = delete;
120-
execution_resource &operator=(const execution_resource &) = delete;
121-
execution_resource &operator=(execution_resource &&) = delete;
124+
execution_resource(const execution_resource &);
125+
execution_resource(execution_resource &&);
126+
execution_resource &operator=(const execution_resource &);
127+
execution_resource &operator=(execution_resource &&);
122128
~execution_resource();
123129
124130
size_t concurrency() const noexcept;
125131
size_t partition_size() const noexcept;
126132
127-
const execution_resource &partition(size_t i) const noexcept;
133+
const execution_resource &partition(size_t) const noexcept;
128134
const execution_resource &member_of() const noexcept;
129135
130136
std::string name() const noexcept;
@@ -136,25 +142,54 @@ There are some additional challenges which we have been investigating but are no
136142
137143
/* Execution context */
138144
139-
struct execution_context {
145+
class execution_context {
146+
public:
140147
141148
using executor_type = __unspecified__;
142149
143-
template <typename ExecutionResource>
144-
execution_context(ExecutionResource &&execResource);
150+
execution_context(const execution_resource &);
145151
146152
~execution_context();
147153
148154
const execution_resource &resource() const noexcept;
149155
150156
executor_type executor() noexcept;
151157
158+
pmr::memory_resource *memory_resource() noexcept;
159+
160+
};
161+
162+
/* Affinity query */
163+
164+
enum class affinity_operation { read, write, copy, move, map };
165+
enum class affinity_metric { latency, bandwidth, capacity, power_consumption };
166+
167+
template <affinity_operation Operation, affinity_metric Metric>
168+
class affinity_query {
169+
public:
170+
171+
using native_affinity_type = __unspecified__;
172+
using error_type = __unspecified__;
173+
174+
affinity_query(execution_resource &&, execution_resource &&);
175+
176+
~affinity_query();
177+
178+
native_affinity_type native_affinity() const noexcept;
179+
180+
friend expected<size_t, error_type> operator==(const affinity_query&, const affinity_query&);
181+
friend expected<size_t, error_type> operator!=(const affinity_query&, const affinity_query&);
182+
friend expected<size_t, error_type> operator<(const affinity_query&, const affinity_query&);
183+
friend expected<size_t, error_type> operator>(const affinity_query&, const affinity_query&);
184+
friend expected<size_t, error_type> operator<=(const affinity_query&, const affinity_query&);
185+
friend expected<size_t, error_type> operator>=(const affinity_query&, const affinity_query&);
186+
152187
};
153188
154189
/* This system */
155190
156191
namespace this_system {
157-
const execution_resource &resource();
192+
std::vector<execution_resource> resources() noexcept;
158193
}
159194
160195
} // execution
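The `execution_context::memory_resource()` accessor in the synopsis above returns a `pmr::memory_resource *`, which suggests it is meant to plug into the standard polymorphic-allocator containers. A minimal sketch of that usage follows. The `mock_execution_context` class and the `make_local_buffer` helper are hypothetical stand-ins (the proposed class has no implementation here), and the monotonic buffer only imitates a resource with affinity to an execution resource.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical stand-in for the proposed execution_context. Only the
// memory_resource() accessor is modelled.
class mock_execution_context {
public:
  // Assumption: a real implementation would return a resource whose
  // allocations are placed close to the context's execution resource.
  // Here it is an ordinary monotonic buffer.
  std::pmr::memory_resource *memory_resource() noexcept { return &pool_; }

private:
  std::pmr::monotonic_buffer_resource pool_;
};

// Builds a vector of n zeros whose storage comes from the context's
// memory resource.
inline std::pmr::vector<int> make_local_buffer(mock_execution_context &ctx,
                                               std::size_t n) {
  assert(ctx.memory_resource() != nullptr);
  std::pmr::vector<int> data(ctx.memory_resource());
  data.resize(n, 0);
  return data;
}
```

The returned vector keeps using the context's resource for later growth, so the context must outlive the container.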
@@ -167,55 +202,56 @@ There are some additional challenges which we have been investigating but are no
167202
168203
The `execution_resource` class provides an abstraction over a system's hardware capable of memory allocation, execution of lightweight execution agents or both.
169204
170-
### `execution_resource` constructors
205+
> [*Note:* The `execution_resource` is required to be implemented such that the underlying software abstraction is initialised when the `execution_resource` is constructed, maintained through reference counting and cleaned up on destruction of the final reference. *--end note*]
171206
172-
execution_resource() = delete;
207+
### `execution_resource` constructors
173208
209+
execution_resource() = delete;
174210
175-
[*Note:* An implementation of `execution_resource` is permitted to provide non-public constructors to allow other objects to construct them. *--end note*]
211+
> [*Note:* An implementation of `execution_resource` is permitted to provide non-public constructors to allow other objects to construct them. *--end note*]
176212
177213
### `execution_resource` assignment
178214
179215
The `execution_resource` class is not `CopyConstructible` (C++Std [copyconstructible]).
180216
181-
execution_resource(const execution_resource &) = delete;
182-
execution_resource(execution_resource &&) = delete;
183-
execution_resource &operator=(const execution_resource &) = delete;
184-
execution_resource &operator=(execution_resource &&) = delete;
217+
execution_resource(const execution_resource &);
218+
execution_resource(execution_resource &&);
219+
execution_resource &operator=(const execution_resource &);
220+
execution_resource &operator=(execution_resource &&);
185221
186222
### `execution_resource` destructor
187223
188224
The `execution_resource` class is not `Destructible` (C++Std [destructible]).
189225
190-
~execution_resource() = delete;
226+
~execution_resource() = delete;
191227
192228
### `execution_resource` operations
193229
194-
size_t concurrency() const noexcept;
230+
size_t concurrency() const noexcept;
195231
196232
*Returns:*
197233
198-
size_t partition_size() const noexcept;
234+
size_t partition_size() const noexcept;
199235
200236
*Returns:*
201237
202-
const execution_resource &partition(size_t i) const noexcept;
238+
const execution_resource &partition(size_t) const noexcept;
203239
204240
*Returns:*
205241
206-
const execution_resource &member_of() const noexcept;
242+
const execution_resource &member_of() const noexcept;
207243
208244
*Returns:*
209245
210-
std::string name() const noexcept;
246+
std::string name() const noexcept;
211247
212248
*Returns:*
213249
214-
bool can_place_memory() const noexcept;
250+
bool can_place_memory() const noexcept;
215251
216252
*Returns:*
217253
218-
bool can_place_agent() const noexcept;
254+
bool can_place_agent() const noexcept;
219255
220256
*Returns:*
221257
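The note in this hunk asks for `execution_resource` to be reference counted, now that the class is copy-able and move-able. One possible way to meet that, sketched here under assumptions, is a handle that shares its state through `std::shared_ptr`. The names `resource_state`, `execution_resource_handle`, `make` and `use_count` are hypothetical and do not appear in the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical shared state standing in for whatever native handle an
// implementation would wrap.
struct resource_state {
  std::string name;
  std::size_t concurrency;
};

// Sketch of a copyable, movable handle. Copies share one resource_state,
// which is released when the last copy is destroyed.
class execution_resource_handle {
public:
  execution_resource_handle() = delete;

  std::size_t concurrency() const noexcept { return state_->concurrency; }
  std::string name() const { return state_->name; }

  // Number of handles currently sharing the state (illustration only).
  long use_count() const noexcept { return state_.use_count(); }

  // Hypothetical factory; the paper leaves construction to non-public
  // constructors chosen by the implementation.
  static execution_resource_handle make(std::string name,
                                        std::size_t concurrency) {
    return execution_resource_handle(std::make_shared<resource_state>(
        resource_state{std::move(name), concurrency}));
  }

private:
  explicit execution_resource_handle(std::shared_ptr<resource_state> s)
      : state_(std::move(s)) {
    assert(state_ != nullptr);
  }

  std::shared_ptr<resource_state> state_;
};
```

In this sketch a moved-from handle holds no state, so its accessors must not be called; an implementation would need to decide what a moved-from `execution_resource` means.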
@@ -225,35 +261,83 @@ The `execution_context` class provides an abstraction for managing a number of l
225261
226262
### `execution_context` member aliases
227263
228-
using executor_type = __unspecified__;
264+
using executor_type = __unspecified__;
229265
230266
*Requires:*
231267
232268
### `execution_context` constructors
233269
234-
template <typename ExecutionResource>
235-
execution_context(ExecutionResource &&execResource);
270+
execution_context(const execution_resource &);
236271
237272
### `execution_context` destructor
238-
273+
239274
~execution_context();
240275
241276
### `execution_context` operators
242277
243-
const execution_resource &resource() const noexcept;
278+
const execution_resource &resource() const noexcept;
244279
245280
*Returns:*
246281
247-
executor_type executor() noexcept;
282+
executor_type executor() noexcept;
248283
249284
*Returns:*
250285
251-
## Free functions
286+
pmr::memory_resource *memory_resource() noexcept;
287+
288+
*Returns:*
289+
290+
## Class template `affinity_query`
252291
253-
const this_system::execution_resource &resource();
292+
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`.
293+
294+
### `affinity_query` member aliases
295+
296+
using native_affinity_type = __unspecified__;
297+
298+
*Requires:*
299+
300+
using error_type = __unspecified__;
301+
302+
*Requires:*
303+
304+
### `affinity_query` constructors
305+
306+
affinity_query(const execution_resource &, const execution_resource &);
307+
308+
### `affinity_query` destructor
309+
310+
~affinity_query();
311+
312+
### `affinity_query` operators
313+
314+
native_affinity_type native_affinity() const noexcept;
315+
316+
*Returns:* Unspecified native affinity value.
317+
318+
### `affinity_query` comparisons
319+
320+
friend expected<size_t> operator==(const affinity_query&, const affinity_query&);
321+
friend expected<size_t> operator!=(const affinity_query&, const affinity_query&);
322+
friend expected<size_t> operator<(const affinity_query&, const affinity_query&);
323+
friend expected<size_t> operator>(const affinity_query&, const affinity_query&);
324+
friend expected<size_t> operator<=(const affinity_query&, const affinity_query&);
325+
friend expected<size_t> operator>=(const affinity_query&, const affinity_query&);
254326
255327
*Returns:*
256328
329+
> [*Note:* The comparison operators rely on the availability of the `expected` class template (see [P0323r4: std::expected][p0323r4]). If this does not become available, an alternative error/value construct will be adopted instead. *--end note*]
330+
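The comparison operators above return a value-or-error object rather than a plain `bool`, so callers have to check for failure before using the outcome. The sketch below illustrates that calling pattern only. It makes several assumptions not found in the paper: `std::variant` stands in for `expected`, the success value is a `bool` rather than `size_t`, and `mock_affinity_query`, `query_error` and `is_closer` are hypothetical names with an invented latency figure as the native value.

```cpp
#include <cassert>
#include <optional>
#include <variant>

// Hypothetical error code for a query the platform cannot answer.
enum class query_error { unsupported };

// Stand-in for expected<T, E>: holds either a value or an error.
template <typename T>
using query_result = std::variant<T, query_error>;

// Mock of affinity_query for one operation/metric pair. An empty optional
// models a relationship that cannot be measured.
class mock_affinity_query {
public:
  explicit mock_affinity_query(std::optional<double> latency_ns)
      : latency_ns_(latency_ns) {
    assert(!latency_ns_ || *latency_ns_ >= 0.0);
  }

  // Yields an error when either side has no measurable value.
  friend query_result<bool> operator<(const mock_affinity_query &a,
                                      const mock_affinity_query &b) {
    if (!a.latency_ns_ || !b.latency_ns_) {
      return query_error::unsupported;
    }
    return *a.latency_ns_ < *b.latency_ns_;
  }

private:
  std::optional<double> latency_ns_;
};

// Returns true only when the comparison succeeded and a ranks below b.
inline bool is_closer(const mock_affinity_query &a,
                      const mock_affinity_query &b) {
  const query_result<bool> r = (a < b);
  return std::holds_alternative<bool>(r) && std::get<bool>(r);
}
```

Folding the error case into `false`, as `is_closer` does, is a convenience for this sketch; code that must distinguish "farther" from "unknown" would inspect the result directly.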
331+
## Free functions
332+
333+
std::vector<execution_resource> resources() noexcept;
334+
335+
*Returns:* A `std::vector` containing all system-level resources.
336+
337+
*Requires:* If `resources().size() > 0`, `resources()[0]` shall be the `execution_resource` corresponding to the current thread of execution. The value returned by `resources()` shall be the same at any point after the invocation of `main`.
338+
339+
> [*TODO:* Returning a `std::vector` allows users to potentially manipulate the container of `execution_resource`s after it is returned. We may want to replace this with a more restrictive alternative type at a later date. *--end TODO*]
340+
257341
## Querying a System’s Topology
258342
259343
The first task in allowing C++ applications to leverage memory locality is to provide the ability to query a **system** for its **resource topology** (commonly represented as a tree or graph) and traverse its **execution resources**.
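Traversing a tree-shaped topology with the `partition_size()` and `partition(size_t)` accessors from the synopsis could look like the depth-first walk below. This is a sketch under assumptions: `mock_resource` is a hypothetical in-memory model that mirrors only those accessor names, and `collect_leaves` is an invented helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory model of a resource tree. The accessor names
// mirror the proposed execution_resource; the storage is invented.
class mock_resource {
public:
  mock_resource(std::string name, std::vector<mock_resource> children = {})
      : name_(std::move(name)), children_(std::move(children)) {}

  const std::string &name() const noexcept { return name_; }
  std::size_t partition_size() const noexcept { return children_.size(); }
  const mock_resource &partition(std::size_t i) const noexcept {
    assert(i < children_.size());
    return children_[i];
  }

private:
  std::string name_;
  std::vector<mock_resource> children_;
};

// Depth-first walk that appends the names of leaf resources to out.
inline void collect_leaves(const mock_resource &r,
                           std::vector<std::string> &out) {
  if (r.partition_size() == 0) {
    out.push_back(r.name());
    return;
  }
  for (std::size_t i = 0; i < r.partition_size(); ++i) {
    collect_leaves(r.partition(i), out);
  }
}
```

A graph-shaped topology would additionally need a visited set to avoid revisiting shared nodes; the tree case shown here does not.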
@@ -568,6 +652,7 @@ https://www.open-mpi.org/projects/hwloc/lstopo/
568652
569653
[//]: Links
570654
655+
[p0323r4]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0323r4.html
571656
[hwloc]: https://www.open-mpi.org/projects/hwloc/
572657
[sycl-1-2-1]: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf
573658
[opencl-2-2]: https://www.khronos.org/registry/OpenCL/specs/opencl-2.2.pdf
