By James Reinders (Intel)
It is time to make Parallelism a full First Class Citizen in C and C++. Hardware is once again ahead of software, and we need to close the gap so that application development is better able to utilize the hardware without low level programming.
The time has come for high level constructs for task and data parallelism to be explicitly added to C and C++. This will enable Parallel Programming in C and C++ to be fully portable, easily intelligible, and consistently decipherable by a compiler.
We should tackle task and data parallelism both, and as an industry we know how.
* Task parallelism: the goal is to abstract this enough that a programmer is not explicitly mapping work to individual processor cores. Mapping should be the job of tools (including run time schedulers), not explicit programming. Hence, the goal is to shift all programming to tasks, not threads. This has many benefits that have been demonstrated often. A simple fork/join support is fundamental (spawn/sync in Cilk Plus terminology). Looping with a “parallel for” avoids looping to spawn iterations serially and thereby expresses parallelism well.
* Data parallelism: the goal is to abstract this enough that a programmer is not explicitly mapping work to SIMD instructions vs. multiple processor cores vs. attached computing (GPUs or co-processors). Mapping should be the job of tools, not explicit programming. Hence, the goal is to shift all programming back to mathematical expressions, not intrinsics or explicitly parallel algorithm decompositions.