MulticoreInfo.com

Entries from May 2011

It’s not just about hardware anymore

May 26th, 2011 · No Comments

by Donald Cramb, EETimes
The poor person tasked with designing and delivering a processor today has a tough job. Much tougher than his or her forebears of just a couple decades ago.
In the hierarchy of systems, the processor has always played a central role. When making architectural decisions on a system, one of the first considerations [...]

[Read more →]

Tags: MulticoreInfo

Writing and Optimizing Parallel Programs — A complete example

May 26th, 2011 · No Comments

by Aater Suleman, Intel
This post is a follow-up to the previous post, titled why parallel programming is hard. To demonstrate parallel programming, this article presents a case study of parallelizing a kernel that computes a histogram, using OpenMP for parallelization. The post first introduces some basic parallel programming concepts and then deep dives [...]
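
As a rough illustration of the kind of kernel the excerpt above describes, here is a minimal sketch of a histogram parallelized with OpenMP. It is not the code from the linked post: the 256-bin byte histogram, the function names `histogram_serial` and `histogram_parallel`, and the per-thread private histogram strategy are all assumptions chosen for this example. Each thread counts into its own local array, and the local arrays are merged under a critical section, which avoids contended updates to shared bins.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for this sketch: one bin per possible byte value.
constexpr std::size_t kBins = 256;

using Histogram = std::array<std::uint64_t, kBins>;

// Serial reference version, used to check the parallel one.
Histogram histogram_serial(const std::vector<std::uint8_t>& data) {
    Histogram hist{};  // value-initialized: all bins start at zero
    for (std::uint8_t v : data) {
        ++hist[v];
    }
    return hist;
}

// Parallel version (hypothetical name). Compile with OpenMP enabled
// (for example -fopenmp); without it the pragmas are ignored and the
// function runs serially with the same result.
Histogram histogram_parallel(const std::vector<std::uint8_t>& data) {
    Histogram hist{};
    const long long n = static_cast<long long>(data.size());

    #pragma omp parallel
    {
        // Private to each thread, so increments need no synchronization.
        Histogram local{};

        // Loop iterations are divided among the threads of the team.
        #pragma omp for nowait
        for (long long i = 0; i < n; ++i) {
            ++local[data[static_cast<std::size_t>(i)]];
        }

        // Merge step: one thread at a time adds its counts to the result.
        #pragma omp critical
        {
            for (std::size_t b = 0; b < kBins; ++b) {
                hist[b] += local[b];
            }
        }
    }
    return hist;
}
```

Calling `histogram_parallel(data)` on any byte vector should return the same bins as `histogram_serial(data)`. The merge costs one pass over the bins per thread, which is usually small next to the counting loop when the input is large.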

[Read more →]

Tags: MulticoreInfo

What makes parallel programming hard?

May 26th, 2011 · No Comments

by Aater Suleman, Intel
Multi-cores are here, and they are here to stay. Industry trends show that each individual core is likely to become smaller and slower. Improving performance of a single program with multi-core requires that the program be split into threads that can run on multiple cores concurrently. In effect, this pushes the problem [...]

[Read more →]

Tags: MulticoreInfo

CUDA Toolkit 4.0 and Parallel Nsight 2.0 Now Production Release

May 26th, 2011 · No Comments

Two Nvidia products are now in production release:
CUDA 4.0: Features include Unified Virtual Addressing (UVA), Thrust C++ Template Performance Primitives Libraries and GPUDirect 2.0 GPU peer-to-peer communication technology. Download at www.nvidia.com/getcuda

[Read more →]

Tags: MulticoreInfo

Cray Unveils Its GPU Supercomputer

May 25th, 2011 · No Comments

by Michael Feldman, HPCWire
Cray has released the details of its GPU-equipped supercomputer: the XK6. The machine is a derivative of the XE6, an AMD Opteron-based machine that the company announced a year ago. Although Cray is calling this week’s announcement the XK6 launch, systems will not be available until the second half of the year.

[Read more →]

Tags: MulticoreInfo

Fast, High-Quality, Parallel Random Number Generators

May 25th, 2011 · No Comments

By Mark A. Overton
Very fast, parallel random number generation is possible on modern PCs, but it requires new algorithms to meet the most stringent randomness tests.
The random number generators presented in this article are simple, fast, and suitable for 32-bit and 64-bit software and hardware. They are also parallelizable, improving performance significantly in existing CPUs [...]

[Read more →]

Tags: MulticoreInfo

Top Down Methodology for Software Performance Analysis

May 24th, 2011 · No Comments

By Charlie Hewett (Intel)
Top Down Methodology
The “Top Down” methodology is an ordered and structured way to analyze application performance. You look at higher order performance issues/indicators first, then based on that data you can follow up for additional investigation and/or dig deeper into the lower tiers of analysis. Below are the 3 main tiers of [...]

[Read more →]

Tags: MulticoreInfo

Nvidia Offers Peek Into Advanced Design Evaluation

May 23rd, 2011 · No Comments

by Joel Hruska, HotHardware
Heather Mackey of Nvidia has written a new blog post discussing the company’s hardware emulation equipment, thus affording us an opportunity to discuss a little-mentioned aspect of microprocessor development. Although we’ll be discussing Nvidia products in particular, both software tools (aka simulation) and hardware emulation are vital to all microprocessor design [...]

[Read more →]

Tags: MulticoreInfo

Optimizing Software Applications for NUMA: Part 7 (of 7)

May 23rd, 2011 · No Comments

By David Ott (Intel)
Summary
NUMA, or Non-Uniform Memory Access, is a shared memory architecture that describes the placement of main memory modules with respect to processors in a multiprocessor system. The advantage of the NUMA architecture as a hierarchical shared memory scheme is its potential to improve average case access time through the introduction of fast, [...]

[Read more →]

Tags: MulticoreInfo

Optimizing Software Applications for NUMA: Part 6 (of 7)

May 23rd, 2011 · No Comments

By David Ott (Intel)
3.3 Data Placement Using Explicit Memory Allocation Directives
Another approach to data placement in NUMA-based systems is to make use of system APIs that explicitly configure the location of memory page allocations. An example of such APIs is the libnuma library for Linux.[1]

[Read more →]

Tags: MulticoreInfo

Westmere-EX: Intel’s Flagship Benchmarked

May 23rd, 2011 · No Comments

by Johan De Gelas, AnandTech
Intel’s Best x86 Server CPU
The launch of the Nehalem-EX a year ago was pretty spectacular. For the first time in Intel’s history, the high-end Xeon did not have any real weakness. Before the Nehalem-EX, the best Xeons trailed behind the best RISC chips in either RAS, memory bandwidth, or raw processing [...]

[Read more →]

Tags: MulticoreInfo

From CUDA to OpenCL: Towards a Performance-portable Solution

May 23rd, 2011 · No Comments

by Peng Du, Rick Weber, et al.
Abstract
In this work, we evaluate OpenCL as a programming tool for developing performance-portable applications for GPGPU. While the Khronos group developed OpenCL with programming portability in mind, performance is not necessarily portable. OpenCL has required performance-impacting initializations that do not exist in other languages such as CUDA. Understanding these [...]

[Read more →]

Tags: MulticoreInfo

Optimizing Software Applications for NUMA: Part 5 (of 7)

May 23rd, 2011 · No Comments

By David Ott (Intel)
3.2. Data Placement Using Implicit Memory Allocation Policies
In the simple case, many operating systems transparently provide support for NUMA-friendly data placement. When a single-threaded application allocates memory, the processor will simply assign memory pages to the physical memory associated with the requesting thread’s node (CPU package), thus ensuring that it is local [...]
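
To make the idea of implicit placement more concrete, here is a minimal sketch that extends the single-threaded case above to several threads. It assumes a "first touch" policy, where a page is placed on the node of the thread that first writes to it. That is a common default (for example on Linux) but it is an assumption here, not something the excerpt states. The function names `allocate_first_touch` and `sum_static` are hypothetical. The idea is to initialize the array with the same static loop schedule that later computation uses, so each thread mostly works on pages it touched first.

```cpp
#include <cstddef>
#include <memory>

// Allocate an array and initialize it in parallel. Under an assumed
// first-touch policy, each page lands on the node of the thread that
// writes it first. For small arrays the allocator may hand back memory
// that was already touched, so the effect mainly matters for large ones.
std::unique_ptr<double[]> allocate_first_touch(std::size_t n, double value) {
    // new double[n] default-initializes: no element is written yet.
    std::unique_ptr<double[]> data(new double[n]);
    const long long count = static_cast<long long>(n);

    // schedule(static) gives each thread one contiguous block of indices.
    #pragma omp parallel for schedule(static)
    for (long long i = 0; i < count; ++i) {
        data[static_cast<std::size_t>(i)] = value;
    }
    return data;
}

// Sum the array with the same static schedule, so each thread reads the
// block it initialized, assuming the same thread count in both loops.
double sum_static(const double* data, std::size_t n) {
    double total = 0.0;
    const long long count = static_cast<long long>(n);

    #pragma omp parallel for schedule(static) reduction(+ : total)
    for (long long i = 0; i < count; ++i) {
        total += data[static_cast<std::size_t>(i)];
    }
    return total;
}
```

Without OpenMP enabled at compile time the pragmas are ignored and both functions run serially, which matches the single-threaded case in the excerpt: one thread touches every page, so all pages end up on that thread's node.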

[Read more →]

Tags: MulticoreInfo

Intel’s Silvermont: A New Atom Architecture

May 23rd, 2011 · No Comments

by Anand Lal Shimpi, AnandTech
Brooke Crothers broke a very important story today - he published the name Silvermont. Atom’s first incarnation came to us in 2008 as a Pentium-like dual-issue in-order microprocessor. The CPU core was named Bonnell, after the tallest point in Austin at around 750 feet. Small mountain, small core. Get it?

[Read more →]

Tags: MulticoreInfo

Power Reduction with AMD Embedded G-Series APUs

May 23rd, 2011 · No Comments

AMD today announced immediate availability of two new AMD Embedded G-Series APUs (Accelerated Processing Units) with thermal design power (TDP) ratings of 5.5 and 6.4 watts, up to a 39 percent power savings compared to earlier versions[1]. The very low power consumption and small 361mm² package are ideal for compact, fanless embedded systems like digital [...]

[Read more →]

Tags: MulticoreInfo

A New Processor For Big Data

May 23rd, 2011 · No Comments

By Stacey Higginbotham, Gigaom
We are moving from the Information Age to the Insight Age, as Parthasarathy Ranganathan, an HP Labs distinguished technologist, told me. As part of that shift we need a computing architecture that will handle the storage of data and the heavy processing power required to analyze that data, and we need to [...]

[Read more →]

Tags: MulticoreInfo

Intel Research: Understanding Factors Affecting Intel QPI Integrity

May 23rd, 2011 · No Comments

by Dave Coleman and Michael Mirmak, Intel Corporation
The Intel® QuickPath Interconnect operates at extremely high frequencies, and it is essential that circuit designers understand the myriad factors that affect differential signal integrity. This article introduces those factors and explains the basics of transmitting and receiving signals without significant distortion.

[Read more →]

Tags: MulticoreInfo

Intel Research: Time Domain Modeling and Simulation of Intel QPI Circuits

May 23rd, 2011 · No Comments

by Dave Coleman and Michael Mirmak, Intel Corporation
Analysis of an Intel® QuickPath Interconnect system design involves creating models and performing simulations in either the time or frequency domain or both. This article introduces the methods for analysis, model creation and simulation in the Intel QuickPath Interconnect time domain.

[Read more →]

Tags: MulticoreInfo

Cortus exhibiting at Design Automation Conference (DAC) 2011

May 23rd, 2011 · No Comments

Cortus S.A., the technology leader in ultra low power, silicon efficient 32-bit processor IP cores, will be presenting its APS3 technology at the 48th Design Automation Conference in San Diego. The Design Automation Conference (DAC) exhibition, which runs from 6th to 8th June at the San Diego Convention Center, is recognised as the world’s largest industry [...]

[Read more →]

Tags: MulticoreInfo

Teaching the Parallel Future: Finding Promise in a Sea of Cores

May 19th, 2011 · No Comments

By Daniel Ernst, EAPF, Published at Computing Research News, May 2011
The recent National Academies report, “The Future of Computing Performance: Game Over or Next Level?” lays out several broad landscape changes computing researchers must address to sustain growth in system performance. Indeed, we hear about little else in the parade of articles, op-eds, and conference sessions [...]

[Read more →]

Tags: MulticoreInfo