By Rob Farber
CUDA, Supercomputing for the Masses Part 20 focused on the analysis capability of Parallel Nsight v1.0, coupled with the NVIDIA Tools Extension (NVTX) library, to illustrate asynchronous I/O, hybrid CPU/GPU computing, and how primitive restart can dramatically accelerate OpenGL rendering in CUDA applications. (Note that Parallel Nsight 1.5 has since been released; it is compatible with Visual Studio 2010 and further refines the Parallel Nsight experience.)
This article focuses on Fermi and the architectural changes that significantly broaden the types of applications that map well to GPGPU computing while maintaining the performance benefits provided by previous generations of CUDA-enabled GPUs. It pays particular attention to how the Fermi architecture affects CUDA memory spaces, and it discusses how Fermi moves GPU computing into mainstream 24/7 production computing with error correction and other robustness features.