
Utilizing Performance Monitoring Events to find Problematic Loads Due to Latency in the Memory Hierarchy

October 1st, 2010




By Michael Chynoweth
The most common bottleneck across applications is stalls on loads caused by latency in the memory hierarchy. Admittedly, it is also one of the most difficult issues to fix. This post helps users identify the issue; a follow-up post will cover methodologies to alleviate it.

The first problem is deciding what sort of view helps flush out load-latency issues. One useful view presents each load in the context of its surrounding instructions, in the order in which they were most typically retired, with clocks tagged to each instruction. The problematic loads then appear within these common streams of execution, with a breakdown of where each load was satisfied in the memory hierarchy (L1, L2, L3, etc.). This view lets us relate the events firing during execution to one another, and it presents each load alongside an estimate of its cost in the workload.
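
To make the idea of a per-load cost estimate concrete, here is a minimal sketch in Python. It is not the author's tool or method: the latency table, the `LoadSample` type, the function names, and the sample counts are all hypothetical, chosen only to illustrate weighting each load by where it was satisfied in the memory hierarchy. Real latencies depend on the processor and would have to be measured or taken from its documentation.

```python
from dataclasses import dataclass

# ASSUMPTION: placeholder latencies in core cycles, not taken from the article.
# Real values vary by processor and must be measured or looked up.
ASSUMED_LATENCY_CYCLES = {"L1": 4, "L2": 10, "L3": 40, "DRAM": 200}


@dataclass
class LoadSample:
    """Hypothetical record for one load instruction."""
    address: int  # instruction address of the load
    hits: dict    # memory level -> number of sampled loads satisfied there


def estimated_cost(sample, latencies=ASSUMED_LATENCY_CYCLES):
    """Estimate cycles per level as (loads satisfied there) * (assumed latency)."""
    return {level: count * latencies[level] for level, count in sample.hits.items()}


def rank_loads(samples, latencies=ASSUMED_LATENCY_CYCLES):
    """Return (address, total estimated cycles) pairs, most expensive first."""
    totals = [
        (s.address, sum(estimated_cost(s, latencies).values())) for s in samples
    ]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


# Made-up sample data, for illustration only.
samples = [
    LoadSample(0x401000, {"L1": 900, "L2": 50, "L3": 10, "DRAM": 2}),
    LoadSample(0x401020, {"L1": 100, "L2": 20, "L3": 30, "DRAM": 40}),
]
ranking = rank_loads(samples)

for address, cycles in ranking:
    print(f"load at {address:#x}: ~{cycles} estimated cycles")
```

In this made-up data the second load executes far less often than the first, yet it ranks higher because more of its accesses go all the way to DRAM. That is the kind of load a hierarchy breakdown is meant to surface.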


Tags: MulticoreInfo
