by Aater Suleman, Intel
This post is a follow-up to the previous post, "Why Parallel Programming Is Hard." To demonstrate parallel programming, it presents a case study: parallelizing a kernel that computes a histogram, using OpenMP. The post first introduces some basic parallel programming concepts and then dives deep into performance optimizations.
Problem: Count the number of times each ASCII character occurs on a page of text.
Input: ASCII text stored as an array of characters.
Output: A histogram with 128 buckets (one for each ASCII character), where each entry stores the number of occurrences of the corresponding ASCII character on the page.
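To make the problem statement concrete, here is a minimal serial sketch in C of the kernel described above. It is only a baseline for illustration, not the code from the original post: the names compute_histogram and NUM_BUCKETS are hypothetical, and skipping bytes outside the 0-127 range is an assumption, since the problem statement only covers ASCII input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_BUCKETS 128  /* one bucket per 7-bit ASCII code (hypothetical name) */

/*
 * Serial baseline (hypothetical helper): count how many times each
 * ASCII character occurs in text[0..n) and store the counts in hist.
 */
void compute_histogram(const char *text, size_t n,
                       unsigned int hist[NUM_BUCKETS])
{
    assert(text != NULL || n == 0);
    assert(hist != NULL);

    /* Start every bucket at zero. */
    memset(hist, 0, NUM_BUCKETS * sizeof hist[0]);

    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)text[i];
        /* Assumption: bytes outside the ASCII range are ignored. */
        if (c < NUM_BUCKETS)
            hist[c]++;
    }
}
```

Every iteration of the loop increments one entry of a single shared array, and that shared update is what a parallel version has to handle carefully.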