affinity/cpp-20/d0796r1.md
Frequently, data is initialized at the beginning of the program by the initial thread and is then used by multiple threads. While automatic thread migration has been implemented in some OSes, migration may have high overhead. In an optimal case, the OS may automatically detect which threads access which data most frequently, or it may replicate data which is read by multiple threads, or migrate data which is modified and used by threads residing on remote locality groups. However, the OS often does a reasonable job, if the machine is not overloaded, if the application carefully uses first-touch allocation, and if the program does not change its behavior with respect to locality.
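To illustrate why first touch matters, the following sketch (not part of this proposal) initializes data with the same parallel structure that later consumes it. It assumes a first-touch NUMA page-placement policy and a runtime that partitions the range over threads similarly in both passes, which `std::execution::par` does not guarantee; the sizes and values are arbitrary.

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <memory>

int main() {
  const std::size_t N = 1 << 26;

  // new double[N] default-initializes, so the allocation itself writes
  // nothing: no page has been "touched" (and therefore placed) yet.
  std::unique_ptr<double[]> a(new double[N]);

  // First touch happens here, in parallel. Under a first-touch policy,
  // each page tends to be placed on the NUMA node of the thread that
  // writes it first.
  std::for_each(std::execution::par, a.get(), a.get() + N,
                [](double& x) { x = 1.0; });

  // A later parallel pass over the same range is then more likely to find
  // its data in local memory, assuming a similar thread/data partitioning.
  std::for_each(std::execution::par, a.get(), a.get() + N,
                [](double& x) { x *= 2.0; });
}
```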
Consider a code example (Listing 1) that uses the C++17 parallel STL algorithm `for_each` to modify the entries of a `valarray` `a`. The example applies a loop body in a lambda to each entry of the `valarray` `a`, using a parallel execution policy that distributes the work across multiple CPU cores. We might expect this to be fast, but since `valarray` containers are initialized automatically and allocated on the master thread's memory, we find that it is actually quite slow even when we have more than one thread.
```cpp
#include <algorithm>   // std::for_each
#include <execution>   // std::execution::par
#include <valarray>    // std::valarray

// N and scalar are assumed to be defined elsewhere.

// C++ valarray STL containers are initialized automatically.
// First-touch allocation thus places all of a on the master thread's memory.
std::valarray<double> a(N);

// Data placement is wrong, so parallel update is slow.
std::for_each(std::execution::par, std::begin(a), std::end(a),
              [=] (double& a_i) { a_i *= scalar; });
// Use future affinity interface to migrate data at next
// use and move pages closer to next accessing thread.
...
// Faster, because data are local now.
std::for_each(std::execution::par, std::begin(a), std::end(a),
              [=] (double& a_i) { a_i *= scalar; });
```
*Listing 1: Parallel vector update example*
The affinity interface we propose should help programs achieve a much higher fraction of peak memory bandwidth when using parallel algorithms. In the future, we plan to extend this to heterogeneous and distributed computing. This follows the lead of OpenMP [14], which plans to integrate its affinity model with its heterogeneous model [21]. (One of the authors of this document participated in the design of OpenMP's affinity model.)