Commit d9f10e2

author
Gordon Brown
committed
CP013: Add minor modifications to D0796r1.
* Add changelog.
* Add proposed wording.
* Add Chapel, X10 and UPC++ to the list of background research.
* Add inline links to references.
1 parent bf7bed3 commit d9f10e2

File tree

1 file changed

+104
-17
lines changed

affinity/cpp-20/d0796r1.md

Lines changed: 104 additions & 17 deletions
@@ -12,6 +12,12 @@
1212

1313
**Reply to: michael@codeplay.com**
1414

15+
# Changelog
16+
17+
### Revision 1
18+
19+
* Introduce proposed wording.
20+
1521
# Abstract
1622

1723
This paper provides an initial meta-framework for the drive toward memory affinity in C++, following the direction from the Toronto 2017 SG1 meeting that we should define affinity for C++ before considering inaccessible memory as a solution to the separate memory problem in support of heterogeneous and distributed computing.
@@ -62,18 +68,20 @@ The goal was that this would enable scaling up for heterogeneous and distributed
6268
6369
The problem of effectively partitioning a system’s topology has existed for some time, and there is a range of third-party libraries and standards which provide APIs to solve it. In order to standardise this process for C++ we must look carefully at all of these. Below is a list of the libraries and standards which define an interface for affinity:
6470
65-
Portable Hardware Locality: https://www.open-mpi.org/projects/hwloc/
66-
SYCL 1.2: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.pdf
67-
OpenCL 2.2: https://www.khronos.org/registry/OpenCL/specs/opencl-2.2.pdf
68-
HSA: http://www.hsafoundation.com/standards/
69-
OpenMP 4.0: https://www.cct.lsu.edu/mardigras14/abstracts#Wong
70-
cpuaff: https://github.com/dcdillon/cpuaff
71-
OpenMP 5.0: http://www.openmp.org/wp-content/uploads/openmp-TR5-final.pdf
72-
Persistent Memory Programming: http://pmem.io/
73-
MEMKIND: https://github.com/memkind/memkind
74-
Solaris pbind(): https://docs.oracle.com/cd/E26502_01/html/E29031/pbind-1m.html
75-
Linux sched_setaffinity(): https://linux.die.net/man/2/sched_setaffinity
76-
Windows SetThreadAffinityMask(): https://msdn.microsoft.com/en-us/library/windows/desktop/ms686247(v=vs.85).aspx
71+
* [Portable Hardware Locality][hwloc]
72+
* [SYCL 1.2][sycl-1-2-1]
73+
* [OpenCL 2.2][opencl-2-2]
74+
* [HSA][HSA]
75+
* [OpenMP 5.0][openmp-5]
76+
* [cpuaff][cpuaff]
77+
* [Persistent Memory Programming][pmem]
78+
* [MEMKIND][memkid]
79+
* [Solaris pbind()][solaris-pbind]
80+
* [Linux sched_setaffinity()][linux-sched-setaffinity]
81+
* [Windows SetThreadAffinityMask()][windows-set-thread-affinity-mask]
82+
* [Chapel][chapel]
83+
* [X10][x10]
84+
* [UPC++][upc++]
7785
7886
Libraries such as the Portable Hardware Locality (hwloc) [9] provide a low level of hardware abstraction and offer a solution for the portability problem by supporting many platforms and operating systems. This and similar approaches may provide detailed hardware information in a tree-like structure. However, even some current systems cannot be represented correctly by a tree, because the number of hops between two sockets varies between socket pairs [14].
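The limitation of tree representations can be illustrated with a small check. The sketch below assumes a simplified model that is not part of the paper: sockets are leaves at equal depth in a tree, so the distance between two sockets depends only on their lowest common ancestor. Under that assumption, for any three sockets the two largest pairwise hop counts must be equal. The name `fits_balanced_tree` and the hop matrices in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// hops[i][j] is the number of hops between socket i and socket j.
using hop_matrix = std::vector<std::vector<int>>;

// Returns true if the hop counts could come from a tree whose leaves
// (the sockets) all sit at the same depth. In such a tree, for every
// triple of sockets the two largest pairwise distances are equal.
// This is an illustrative model, not something defined by the proposal.
bool fits_balanced_tree(const hop_matrix &hops) {
  const std::size_t n = hops.size();
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = i + 1; j < n; ++j) {
      for (std::size_t k = j + 1; k < n; ++k) {
        std::array<int, 3> d = {{hops[i][j], hops[i][k], hops[j][k]}};
        std::sort(d.begin(), d.end());
        if (d[1] != d[2]) {
          return false;  // distances vary in a way a tree cannot express
        }
      }
    }
  }
  return true;
}
```

For example, four sockets connected in a ring have hop counts of 1 between neighbours and 2 between opposite sockets, which fails this check, whereas two packages of two sockets each (2 hops within a package, 4 hops across) pass it.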
7987
@@ -94,6 +102,66 @@ There are some additional challenges which we have been investigating but are no
94102
* Migrating data from memory allocated in one partition to another
95103
* Defining memory placement algorithms or policies
96104
105+
106+
## Proposed Wording
107+
108+
### Header synopsis
109+
110+
```cpp
111+
namespace std {
112+
namespace experimental {
113+
namespace execution {
114+
115+
/* Execution resource */
116+
117+
struct execution_resource {
118+
119+
execution_resource() = delete;
120+
execution_resource(const execution_resource &) = delete;
121+
execution_resource(execution_resource &&) = delete;
122+
execution_resource &operator=(const execution_resource &) = delete;
123+
execution_resource &operator=(execution_resource &&) = delete;
124+
125+
size_t concurrency() const noexcept;
126+
size_t partition_size() const noexcept;
127+
128+
const execution_resource &partition(size_t i) const noexcept;
129+
const execution_resource &member_of() const noexcept;
130+
131+
std::string name() const noexcept;
132+
133+
bool can_place_memory() const noexcept;
134+
bool can_place_agent() const noexcept;
135+
136+
};
137+
138+
/* Execution context */
139+
140+
struct execution_context {
141+
142+
using executor_type = __unspecified__;
143+
144+
template <typename ExecutionResource>
145+
execution_context(ExecutionResource &&execResource);
146+
147+
execution_resource &resource();
148+
149+
executor_type executor() noexcept;
150+
151+
};
152+
153+
/* This system */
154+
155+
namespace this_system {
156+
execution_resource &resource();
157+
}
158+
159+
} // execution
160+
} // experimental
161+
} // std
162+
```
163+
*Listing 2: Header synopsis*
164+
97165
### Querying a System’s Topology
98166

99167
The first task in allowing C++ applications to leverage memory locality is to provide the ability to query a **system** for its **resource topology** (commonly represented as a tree or graph) and traverse its **execution resources**.
@@ -157,7 +225,7 @@ struct execution_resource {
157225

158226
};
159227
```
160-
*Listing 2: Proposed extended execution resource interface*
228+
*Listing 3: Proposed extended execution resource interface*
161229
162230
The interface above describes an execution resource as an object which cannot be user constructed, copied or moved, only referenced. It provides an interface for recursively querying the partitions and concurrency of its child execution resources via the member functions `concurrency`, `partition_size` and `partition`, and its parent execution resource via the member function `member_of`. This interface is designed to match the design of `thread_execution_resource_t` [8]. Note that the resource is not limited to being an **execution resource**; it may also be a general resource where no execution can take place but memory can be allocated, such as off-chip memory.
163231
@@ -185,7 +253,7 @@ namespace std::this_system {
185253
execution_resource &resource();
186254
}
187255
```
188-
*Listing 3: Interface for querying the execution resources available within a system*
256+
*Listing 4: Interface for querying the execution resources available within a system*
189257

190258
The **resource** function in the `this_system` namespace will return the root **execution resource** of the current system.
191259

@@ -198,7 +266,7 @@ for (int i = 0 ; i < partition_size(); i++) {
198266
std::cout << resource.partition(i).name() << std::endl;
199267
}
200268
```
201-
*Listing 4: Example of querying the execution resources available within a system*
269+
*Listing 5: Example of querying the execution resources available within a system*
202270

203271
| Straw Poll |
204272
|------------|
@@ -217,7 +285,7 @@ struct execution_context {
217285
...
218286
};
219287
```
220-
*Listing 5: Extension to execution_context interface*
288+
*Listing 6: Extension to execution_context interface*
221289
222290
The **execution context** constructor described above allows constructing an **execution context** from any **execution resource** within a **system’s resource topology**. The constructed **execution context** can then execute work on any resource under that **execution resource**.
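One way to read "any resource under that execution resource" is as a walk up the parent links. The sketch below assumes that reading and models it with two hypothetical types, `resource_node` and `mock_context`, together with a `covers` member that is not part of the proposed wording. It only illustrates the containment relationship and does not execute any work.

```cpp
#include <string>

// Hypothetical stand-in for an execution resource; only the parent link
// (the role played by member_of in the proposal) is modelled.
struct resource_node {
  std::string name;
  const resource_node *parent;  // nullptr for the root of the topology
};

// Hypothetical stand-in for an execution context bound to one resource.
class mock_context {
public:
  explicit mock_context(const resource_node &r) noexcept : resource_(&r) {}

  const resource_node &resource() const noexcept { return *resource_; }

  // Returns true if target is the bound resource or lies beneath it,
  // found by following parent links from target up to the root.
  bool covers(const resource_node &target) const noexcept {
    for (const resource_node *p = &target; p != nullptr; p = p->parent) {
      if (p == resource_) {
        return true;
      }
    }
    return false;
  }

private:
  const resource_node *resource_;
};
```

With a context constructed from one NUMA node, a core beneath that node is covered, while a sibling NUMA node and the system root are not.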
223291
@@ -238,7 +306,7 @@ for (int i = 0; i < resource.partition_size(); i++) {
238306
std::cout << resource.partition(i).name() << std::endl;
239307
}
240308
```
241-
*Listing 6: Example of constructing an execution context from an execution resource*
309+
*Listing 7: Example of constructing an execution context from an execution resource*
242310

243311
| Straw Poll |
244312
|------------|
@@ -399,3 +467,22 @@ Euro-Par 2011 Parallel Processing: 17th International
399467
400468
[22] Portable Hardware Locality lstopo
401469
https://www.open-mpi.org/projects/hwloc/lstopo/
470+
471+
472+
[//]: Links
473+
474+
[hwloc]: https://www.open-mpi.org/projects/hwloc/
475+
[sycl-1-2-1]: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf
476+
[opencl-2-2]: https://www.khronos.org/registry/OpenCL/specs/opencl-2.2.pdf
477+
[HSA]: http://www.hsafoundation.com/standards/
478+
[openmp-4]: https://www.cct.lsu.edu/mardigras14/abstracts#Wong
479+
[openmp-5]: http://www.openmp.org/wp-content/uploads/openmp-TR5-final.pdf
480+
[cpuaff]: https://github.com/dcdillon/cpuaff
481+
[pmem]: http://pmem.io/
482+
[memkid]: https://github.com/memkind/memkind
483+
[solaris-pbind]: https://docs.oracle.com/cd/E26502_01/html/E29031/pbind-1m.html
484+
[linux-sched-setaffinity]: https://linux.die.net/man/2/sched_setaffinity
485+
[windows-set-thread-affinity-mask]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms686247(v=vs.85).aspx
486+
[chapel]: https://chapel-lang.org/
487+
[x10]: http://x10-lang.org/
488+
[upc++]: https://bitbucket.org/berkeleylab/upcxx/wiki/Home
