Multicore Research Papers - 2008

Papers listed here are either freely available on the web or obtained legally. Please respect the various copyright stipulations placed on these documents. If any author would like us to add or remove their paper from this list, please contact us at info@multicoreinfo.com.

Multicore Papers 2008

Interaction of Many-core Computer Architecture and Operating Systems: Guest Editors’ Introduction
Sangyeun Cho, Tao Li and Onur Mutlu
IEEE Micro Special Issue (IEEE MICRO), Vol. 28, No. 3, pages 2-5, May 2008

Scheduling Algorithms for Unpredictably Heterogeneous CMP Architectures
J.A. Winter and D.H. Albonesi
38th International Conference on Dependable Systems and Networks, June 2008

Multicore Devices: A New Generation of Reconfigurable Architectures [Slides]
Steven A. Guccione, CMPWare, Inc.
Engineering of Reconfigurable Systems and Architectures, 2008

Hardware / Software Tradeoffs in Multicore Architectures [Slides]
Steven A. Guccione, CMPWare, Inc.
Austin Conference on Integrated Circuits and Systems (ACISC), 2008

A Massively Parallel Digital Learning Processor
Hans Peter Graf, Srihari Cadambi, Igor Durdanovic, Venkata Jakkula, Murugan Sankaradass, Eric Cosatto, Srimat Chakradhar
Neural Information Processing Systems (NIPS) 2008

The Synchronization Power of Coalesced Memory Accesses
Phuong Hoai Ha, Philippas Tsigas & Otto J. Anshus
Proceedings of the 22nd International Symposium on Distributed Computing (DISC ‘08)

Wait-free Programming for General Purpose Computations on Graphics Processors [ACM Login required]
Phuong Hoai Ha, Philippas Tsigas & Otto J. Anshus
Proceedings of the 27th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC ‘08)

A Unified Model for Multicore Architectures
John E. Savage, Mohammad Zubair
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

The Software Challenges of Multicore: Lessons from Supercomputing
Keynote Speech by Tarek El-Ghazawi
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Reevaluating Amdahl’s Law in the Multicore Era
Xian-He Sun, Yong Chen
Technical Report, Illinois Institute of Technology

Scalable Computing in Multicore Era
X.-H. Sun, Y. Chen and S. Byna
International Symposium on Parallel Algorithms, Architectures and Programming (PAAP’08), 2008

Multi-word Atomic Read/Write Registers on Multiprocessor Systems
Andreas Larsson, Anders Gidenstam, Phuong H Ha, Marina Papatriantafilou & Philippas Tsigas
The ACM Journal of Experimental Algorithmics (JEA), 2008

Non-blocking Programming on Multi-core Graphics Processors [Author webpage]
Phuong Hoai Ha, Philippas Tsigas & Otto J. Anshus
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

A Domain-specific Approach for Software Development on Multicore Platforms [Slides]
Jerker Bengtsson and Bertil Svensson
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

On Sorting and Load-Balancing on GPUs [Slides]
Daniel Cederman and Philippas Tsigas
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

OpenDF - A Dataflow Toolset for Reconfigurable Hardware and Multicore Systems [Slides]
Shuvra S. Bhattacharyya, Gordon Brebner, Johan Eker, et al.
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

Automatic Parallelization of Simulation Code for Equation-based Models with Software Pipelining and Measurements on Three Platforms [Slides]
Håkan Lundvall, Kristian Stavåker, Peter Fritzson, and Christoph Kessler
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

A Scalable Directory Architecture for Distributed Shared-Memory Chip Multiprocessors [Slides]
Huan Fang and Mats Brorsson
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

State-Space Exploration for Concurrent Algorithms under Weak Memory Orderings [Slides]
Bengt Jonsson
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

Model Checking Race-Freeness [Slides]
Parosh Aziz Abdulla, Frédéric Haziza, and Mats Kindahl
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

NOBLE: Non-Blocking Programming Support via Lock-Free Shared Abstract Data Types [Slides]
Håkan Sundell and Philippas Tsigas
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

LFTHREADS: A lock-free thread library [Slides]
Anders Gidenstam and Marina Papatriantafilou
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

Wool - A Work Stealing Library [Slides]
Karl-Filip Faxén
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (MCC ‘08)

An Efficient in-Place 3D Transpose for Multicore Processors with Software Managed Memory Hierarchy [Slides .ppt]
Ali El-Moursy, Ahmed El-Mahdy, Hisham El-Shishiny
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Supporting Pthreads and OpenMP Programming Models on Cell BE
Invited Talk by Duc Vianney (IBM Corp.)
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Automatic generation of a Parallel Tile Processing Unit for algorithms with non-affine array references [Slides]
Rosilde Corvino, Stephane Mancini, Roberto Guizzetti
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Massive Parallelization of SPICE Device Model Evaluation on GPU-Based SIMD Architectures [Slides]
Amr M. Bayoumi, Yasser Y. Hanafy
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

A Global Interconnect Link Design for Many-Core Microprocessors [Slides]
DiaaEldin Khalil, Yehea Ismail
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Dynamic power management framework for multi-core portable embedded system [ACM Portal Login required] [Slides .ppt]
Chen Tianzhou, Huang Jiangwei, Xiang Lingxiang, Shi Qingsong
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

SPENK: Adding Another Level of Parallelism on the Cell Broadband Engine [ACM Portal Login required] [Slides .ppt]
Mohamed F. Ahmed, Reda A. Ammar, Sanguthevar Rajasekaran
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

On the Potential of Latency Tolerant Execution in Speculative Multithreading [Slides .ppt]
Haitham Akkary, Komal Jothi, Renjith Retnamma, Satyanarayana Nekkalapu, Doug Hall, Shahrokh Shahidzadeh
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Sharing-Aware OS Scheduling Algorithms for Multi-socket Multicore Servers [Slides .ppt]
Murthy Durbhakula
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Performance Analysis and Visualization Tools for Cell/B.E. Multicore Environment
Duc Vianney, Gad Haber, Andre Heilper, Marcel Zalmanovici
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

An Efficient Load-Balancing Algorithm for Image Processing Applications on Multicore Processors [Slides .ppt]
Ahmed El-Mahdy (Alexandria University), Hisham El-Shishiny
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Bounds on the Theoretical Efficiency of Multicore Cache Hierarchies
Invited Speech by John E. Savage (Brown University)
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

Many-Core: Another Major Change in Technology?
Keynote Speech by Ian Watson (Manchester University)
The International Forum on Next-Generation Multicore/Manycore Technologies, IFMT’08

New parallel programming language design: a bridge between brain models and multi-core/many-core computers?
Gheorghe Stefanescu, Camelia Chira
“From Natural Language to Soft Computing: New Paradigms in Artificial Intelligence,” L.A. Zadeh et al. (Eds.), Editing House of Romanian Academy, 2008

How to Simulate 1000 Cores [Tech Report version]
Monchiero, Matteo; Ahn, Jung Ho; Falcón, Ayose; Ortega, Daniel; Faraboschi, Paolo
dasCMP 2008

SlackSim: A Platform for Parallel Simulations of CMPs on CMPs
Jianwei Chen, Murali Annavaram, and Michel Dubois
dasCMP 2008

Entering the Petaflop Era: The Architecture and Performance of Roadrunner [Abstract]
Kevin Barker, Kei Davis, Adolfy Hoisie, Darren Kerbyson, Michael Lang, Scott Pakin, Jose Carlos Sancho
SuperComputing 2008 (SC08)

High Performance Discrete Fourier Transforms on Graphics Processors [Abstract]
Naga Govindaraju, Brandon Lloyd, Yuri Dotsenko, Burton Smith, John Manferdelli
SuperComputing 2008 (SC08)

Stencil Computation Optimization and Autotuning on State-of-the-Art Multicore Architectures [Abstract]
Kaushik Datta, Mark Murphy, Vasily Volkov, et al.
SuperComputing 2008 (SC08)

Bandwidth Intensive 3-D FFT kernel for GPUs using CUDA [Abstract]
Akira Nukada, Yasuhiko Ogata, Toshio Endo, and Satoshi Matsuoka
SuperComputing 2008 (SC08)

Adapting a Message-Driven Parallel Application to GPU-Accelerated Clusters [Abstract]
James C. Phillips, John E. Stone, and Klaus Schulten
SuperComputing 2008 (SC08)

Toward Loosely-Coupled Programming on Petascale Systems [Abstract]
Ioan Raicu, Zhao Zhang, Mike Wilde, Ian Foster, Pete Beckman, Kamil Iskra, and Ben Clifford
SuperComputing 2008 (SC08)

SMARTMAP: Operating System Support for Efficient Data Sharing among Processes on a Multi-Core Processor [Abstract]
Ron Brightwell, Trammell Hudson, and Kevin T. Pedretti
SuperComputing 2008 (SC08)

Lessons Learned at 208K: Toward Debugging Millions of Cores [Abstract]
Gregory L. Lee, Dong H. Ahn, Dorian C. Arnold, et al.
SuperComputing 2008 (SC08)

A Novel Migration-Based NUCA Design for Chip Multiprocessors [Abstract]
Mahmut Kandemir, Feihui Li, Mary Jane Irwin, and Seung Woo Son
SuperComputing 2008 (SC08)

Hiding I/O Latency with Pre-execution Prefetching for Parallel Applications [Abstract] [Best Paper Finalist] [Best Student Paper Finalist]
Yong Chen, Surendra Byna, Xian-He Sun, Rajeev Thakur, and William Gropp
SuperComputing 2008 (SC08)

Programming the Intel 80-Core Network-on-a-Chip Terascale Processor [Abstract]
Timothy G. Mattson, Rob van der Wijngaart, and Michael Frumkin
SuperComputing 2008 (SC08)

Parallel I/O Prefetching Using MPI File Caching and I/O Signatures [Abstract]
Surendra Byna, Yong Chen, Xian-He Sun, Rajeev Thakur, and William Gropp
SuperComputing 2008 (SC08)

PAM: A Novel Performance/Power Aware Meta-scheduler for Multi-core Systems [Abstract]
Mohammad Banikazemi, Dan Poff, Bulent Abali
SuperComputing 2008 (SC08)

Parallel Exact Inference on the Cell Broadband Engine Processor [Abstract]
Yinglong Xia and Viktor K. Prasanna
SuperComputing 2008 (SC08)

Prefetch Throttling and Data Pinning for Improving Performance of Shared Caches [Abstract]
Ozcan Ozturk, Seung Woo Son, Mahmut Kandemir, and Mustafa Karakoy
SuperComputing 2008 (SC08)

Hybrid Analytical Modeling of Pending Cache Hits, Data Prefetching, and MSHRs
Xi E. Chen and Tor M. Aamodt
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

A Novel Cache Architecture with Enhanced Performance and Security
Zhenghong Wang and Ruby B. Lee
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Microarchitecture in the System-level Integration Era [Keynote Speech Slides]
Charles R. Moore, Senior Fellow, AMD
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Facelift: Hiding and Slowing Down Aging in Multicores
Abhishek Tiwari and Josep Torrellas
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

The StageNet Fabric for Constructing Resilient Multicore Systems [Slides]
S. Gupta, S. Feng, A. Ansari, J. Blome and S. Mahlke
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Prefetch-Aware DRAM Controllers
Chang Joo Lee, Onur Mutlu, Veynu Narasiman, Yale N. Patt
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Cache Bursts: A New Approach for Eliminating Dead Blocks and Increasing Cache Efficiency
Haiming Liu, Michael Ferdman, Jaehyuk Huh, Doug Burger
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Verification of Chip Multiprocessor Memory Systems Using A Relaxed Scoreboard
Ofer Shacham, Megan Wachs, Alex Solomatnikov, Amin Firoozshahian, Stephen Richardson, Mark Horowitz
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

A Performance-Correctness Explicitly-Decoupled Architecture
Alok Garg and Michael Huang
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Copy Or Discard Execution Model For Speculative Parallelization On Multicores
Chen Tian, Min Feng, Vijay Nagarajan, Rajiv Gupta
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Coordinated Management of Multiple Interacting Resources in Chip Multiprocessors: A Machine Learning Approach
R. Bitirgen, E. İpek, J.F. Martínez.
41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2008

Experiences with Numerical Codes on the Cell Broadband Engine Architecture [Slides]
M. Stürmer, D. Ritter, H. Köstler, U. Rüde; Univ. Erlangen-Nürnberg
New Frontiers in High-performance and Hardware-aware Computing (HipHaC’08)

Original 45nm Intel® Core™2 Processor Performance
Asim Nisar, Mongkol Ekpanyapong, Antonio C Valles, Kuppuswamy Sivakumar
Intel Technology Journal, Volume 12, Issue 03

Improvements in the Intel® Core™2 Processor Family Architecture and Microarchitecture
James Coke, Harikrishna Baliga, Niranjan Cooray, Edward Gamsaragan, et al.
Intel Technology Journal, Volume 12, Issue 03

Performance comparison of different parallel lattice Boltzmann implementations on multi-core multi-socket systems
S. Donath, K. Iglberger, G. Wellein, T. Zeiser, A. Nitsure, U. Rüde
International Journal of Computational Science and Engineering (IJCSE)

On the generic parallelisation of iterative solvers for the finite element method
P. Bastian, M. Blatt
International Journal of Computational Science and Engineering (IJCSE)

Off-loading application controlled data prefetching in numerical codes for multi-core processors
J. Weidendorfer, C. Trinitis
International Journal of Computational Science and Engineering (IJCSE)

Harmony: An Execution Model and Runtime for Heterogeneous Many Core Architectures
G. Diamos and S. Yalamanchili
High Performance Distributed Computing (HPDC ‘08)

Runtime Techniques for Dynamic Concurrency Inference, Resource Constrained Hierarchical Scheduling, and Online Optimization in Heterogeneous Multiprocessor Systems
Tech Report, Full version of HPDC 08 Submission

IdlePower: Application-Aware Management of Processor Idle States
Hrishikesh Amur, Ripal Nathuji, Mrinmoy Ghosh, Karsten Schwan, Hsien-Hsin S. Lee
Workshop on Managed Multi-Core Systems (MMCS’08)

MAESTRO: Dynamic Runtime Power and Concurrency Adaptation
Allan Porterfield, Rob Fowler, Mark Neyer
Workshop on Managed Multi-Core Systems (MMCS’08)

Lightweight Kernel Support for Direct Shared Memory Access on a Multi-Core Processor
Ron Brightwell
Workshop on Managed Multi-Core Systems (MMCS’08)

Embracing diversity in the Barrelfish manycore operating system
Adrian Schüpbach, Simon Peter, Andrew Baumann, Timothy Roscoe, Paul Barham, Tim Harris, Rebecca Isaacs
Workshop on Managed Multi-Core Systems (MMCS’08)

Scheduling on Heterogeneous Multicore Processors Using Architectural Signatures
Daniel Shelepov and Alexandra Fedorova
Workshop on the Interaction between Operating Systems and Computer Architecture

Cypress: A Scheduling Infrastructure for a Many-Core Hypervisor
Alexandra Fedorova, Viren Kumar, Vahid Kazempour, Suprio Ray, and Pouya Alagheband
Workshop on Managed Multi-Core Systems (MMCS’08)

Performance Implications of Cache Affinity on Multicore Processors
Vahid Kazempour, Alexandra Fedorova, and Pouya Alagheband
Euro-Par 2008

Network Processing on an SPE Core in Cell Broadband Engine
Kawamura Y., Yamazaki T., Ishiwata T., Horie K., Kyusojin, H.
16th IEEE Symposium on High Performance Interconnects, 2008. HOTI ‘08.

Taming Single-Thread Program Performance on Many Distributed On-Chip L2 Caches
Lei Jin and Sangyeun Cho
Proceedings of the Int’l Conference on Parallel Processing (ICPP)

TPTS: A Novel Framework for Very Fast Manycore Processor Architecture Simulation
Sangyeun Cho, Socrates Demetriades, Shayne Evans, Lei Jin, Hyunjin Lee, Kiyeon Lee, and Michael Moeng
Proceedings of the Int’l Conference on Parallel Processing (ICPP)

Corollaries to Amdahl’s Law for Energy
Sangyeun Cho and Rami Melhem
IEEE Computer Architecture Letters (CAL) 7(1):25-28, January 2008

Software Thermal Management of DRAM Memory for Multicore Systems
Jiang Lin, Hongzhong Zheng, Zhichun Zhu, Eugene Gorbatov, Howard David and Zhao Zhang
International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), 2008

Memory Access Scheduling Schemes for Systems with Multi-Core Processors
Hongzhong Zheng, Jiang Lin, Zhao Zhang, and Zhichun Zhu
International Conference on Parallel Processing (ICPP-08), Portland, Oregon, September 8-12, 2008

Self-Optimizing Memory Controllers: A Reinforcement Learning Approach
E. İpek, O. Mutlu, J.F. Martínez, and R. Caruana
In Intl. Symp. on Computer Architecture, Beijing, China, June 2008.

Can Hardware Performance Counters be Trusted?
Vincent M. Weaver, Sally A. McKee
IEEE International Symposium on Workload Characterization (IISWC) 2008

Evaluating the Impact of Dynamic Binary Translation Systems on Hardware Cache Performance
Arkaitz Ruiz Alvarez, Kim Hazelwood
IEEE International Symposium on Workload Characterization (IISWC) 2008

Support for Dynamic Management of Parallelism in Chip Multiprocessors
Gilberto Contreras
Dissertation Presented to the Faculty of Princeton University in Candidacy for the Degree of Doctor of Philosophy, July 2008

Full-System Chip Multiprocessor Power Evaluations Using FPGA-Based Emulation
Abhishek Bhattacharjee, Gilberto Contreras, and Margaret Martonosi
International Symposium on Low Power Electronics and Design (ISLPED), August 2008

The PARSEC Benchmark Suite: Characterization and Architectural Implications
Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li
Princeton University Technical Report TR-811-08, January 2008

STAMP: Stanford Transactional Applications for Multi-Processing [Website]
Chi Cao Minh, JaeWoong Chung, Christos Kozyrakis, Kunle Olukotun
IEEE International Symposium on Workload Characterization (IISWC) 2008

Energy-Aware Application Scheduling on a Heterogeneous Multi-core System
Jian Chen and Lizy K. John
IEEE International Symposium on Workload Characterization (IISWC) 2008

Parallelization and Characterization of SIFT on Multi-Core Systems
Hao Feng, Eric Li, Yurong Chen, Yimin Zhang
IEEE International Symposium on Workload Characterization (IISWC) 2008

Current State of Java for HPC [Website]
Brian Amedro, Vladimir Bodnartchouk, Denis Caromel, Christian Delbe, Fabrice Huet, Guillermo L. Taboada
Internal Note, INRIA

A Parallel Algorithm for Advanced Video Motion Estimation on Multicore Architectures [Requires IEEE Xplore login]
Svetislav Momcilovic, and Leonel Sousa
2008 International Workshop on Multi-Core Computing Systems (MuCoCoS ‘08)

Multi-Variant Program Execution: Using Multi-Core Systems to Defuse Buffer-Overflow Vulnerabilities [Requires IEEE Xplore login]
Babak Salamat, Andreas Gal, Todd Jackson, Karthikeyan Manivannan, Gregor Wagner, and Michael Franz
2008 International Workshop on Multi-Core Computing Systems (MuCoCoS ‘08)

Efficient Implementation of Wireless Applications on Multi-core Platforms based on Dynamically Reconfigurable Processors [Requires IEEE Xplore login]
Wei Han, Ying Yi, Mark Muir, Ioannis Nousias, Tughrul Arslan, and Ahmet Erdogan
2008 International Workshop on Multi-Core Computing Systems (MuCoCoS ‘08)

On the Potential of NoC Virtualization for Multicore Chips [Requires IEEE Xplore login]
Jose Flich, Samuel Rodrigo, Jose Duato, Thomas Sødring, Åshild Grønstad Solheim, Tor Skeie, and Olav Lysne
2008 International Workshop on Multi-Core Computing Systems (MuCoCoS ‘08)

Petascale Computing for Large-Scale Graph Problems [Keynote Address Slides]
David Bader
2008 International Workshop on Multi-Core Computing Systems (MuCoCoS ‘08)

A Graph-Theoretic Analysis of the Human Protein-Interaction Network Using Multi-core Parallel Algorithms [Tech Report]
D.A. Bader and K. Madduri
Parallel Computing 2008

(When) Will CMPs hit the Power Wall? [Older TechReport]
Cor Meenderinck and Ben Juurlink
2nd Workshop on Highly Parallel Processing on a Chip (HPPC 2008) at Euro-Par 2008

Compile-time and Run-time Issues in an Auto-parallelisation system for the Cell BE Processor
Alastair Donaldson, Paul Keir, and Anton Lokhmotov
2nd Workshop on Highly Parallel Processing on a Chip (HPPC 2008) at Euro-Par 2008

Towards an Intelligent Environment for Programming Multi-core Computing Systems
S. Pllana, S. Benkner, E. Mehofer, L. Natvig, and F. Xhafa
2nd Workshop on Highly Parallel Processing on a Chip (HPPC 2008) at Euro-Par 2008

A unified runtime system for heterogeneous multicore architectures
Cédric Augonnet and Raymond Namyst
2nd Workshop on Highly Parallel Processing on a Chip (HPPC 2008) at Euro-Par 2008

The Roofline Model: A Pedagogical Tool for Program Analysis and Optimization [Slides]
S. Williams and D. Patterson
ParLab Summer Retreat, 2008

PERI: Auto-tuning Memory Intensive Kernels for Multicore
S. Williams, K. Datta, J. Carter, L. Oliker, J. Shalf, K. Yelick, and D. Bailey
SciDAC PI conference, Journal of Physics: Conference Series, 2008

A Third-Generation 65nm 16-Core 32-Thread Plus 32-Scout-Thread CMT SPARC(R) Processor
Marc Tremblay, Shailender Chaudhry
International Solid-State Circuits Conference (ISSCC) 2008

BSGP: Bulk-Synchronous GPU Programming
Qiming Hou, Kun Zhou, and Baining Guo
SIGGRAPH 2008

Larrabee: A Many-Core x86 Architecture for Visual Computing [Weblink from Intel]
Larry Seiler, Doug Carmean, Eric Sprangle, Tom Forsyth, et al.
SIGGRAPH 2008


A Mapping Framework for Guided Design Space Exploration of Heterogeneous MP-SoCs [IEEE Xplore]
Bastian Ristau, Torsten Limberg, and Gerhard Fettweis
In Proceedings of Design, Automation & Test in Europe (DATE 08). Munich, Germany. Mar ‘08

A general model of concurrency and its implementation as many-core dynamic RISC processors
T. Bernard, K. Bousias, L. Guang, C. R. Jesshope, M. Lankamp, M. W. van Tol and L. Zhang
International Symposium on Systems, Architectures, MOdeling and Simulation (SAMOS) 2008

Operating systems in silicon and the dynamic management of resources in many-core chips
C. R. Jesshope
Parallel Processing Letters (PPL), 18(2), pp. 257-274.

Transactional Memory
James Larus, Christos Kozyrakis
Communications of the ACM, Vol. 51, No. 7, pp 80–88, July 2008.

Spending Moore’s Dividend
James Larus
Microsoft Research Technical Report MSR-TR-2008-69, May 2008

Towards automatic proofs of lock-free algorithms
Loic Fejoz and Stephan Merz
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

Are Concurrent Programs That Are Easier to Write Also Easier to Check?
Kedar S. Namjoshi
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

Assembling Concurrent Programs Correctly from Data-Parallel Program Bricks
Kai Trojahner
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

The Relevance of New Data Structure Approaches for Dense Linear Algebra in the new Multi-Core/Many Core Environments
Fred G. Gustavson
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

An Abort-Aware Model of Transactional Programming
Kousha Etessami and Patrice Godefroid
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

Model Checking Transactional Memories
Rachid Guerraoui, Thomas A. Henzinger, Barbara Jobstmann, and Vasu Singh
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

SMT-based Symbolic Model Checking for Multi-Threaded Programs
Zijiang Yang and Karem Sakallah
Exploiting Concurrency Efficiently and Correctly — (EC)2, CAV 2008 Workshop

Intra-Disk Parallelism: An Idea Whose Time Has Come
S. Sankar, S. Gurumurthi, M.R. Stan
35th ACM International Symposium on Computer Architecture (ISCA), June 21-25, 2008

PEEP: Exploiting Predictability of Memory Dependences in SMT Processors
Samantika Subramaniam, Milos Prvulovic, Gabriel H. Loh
14th International Symposium on High-Performance Computer Architecture (HPCA), 2008

Dynamic Classification of Program Memory Behaviors in CMPs
Yuejian Xie, Gabriel H. Loh
2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI), June 22, 2008

Deconstructing the Inefficacy of Global Cache Replacement Policies
Rahul Garde, Samantika Subramaniam, Gabriel H. Loh
7th Workshop on Duplicating, Deconstructing, and Debunking (WDDD), June 22, 2008

3D-Stacked Memory Architectures for Multi-Core Processors
Gabriel H. Loh
35th ACM International Symposium on Computer Architecture (ISCA), June 21-25, 2008

Modeling Multigrain Parallelism on Heterogeneous Multicore Processors: A Case Study of the Cell BE
Filip Blagojevic, Xizhou Feng, Kirk W. Cameron, Dimitris Nikolopoulos
International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2008)

Automated Dynamic Analysis of CUDA Programs
Michael Boyer, Kevin Skadron, Westley Weimer
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

Synchronization Aware Conflict Resolution for Runtime Monitoring Using Transactional Memory
Chen Tian, Vijay Nagarajan, Rajiv Gupta
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

Simulation of Streaming Application on Multicore Systems
Saurabh Gayen, Mark Franklin, Eric Tyson, Roger Chamberlain
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

MATS: Multicore Adaptive Trace Selection
Jason Mars and Mary Lou Soffa
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

Numerical Algorithms with Tunable Parallelism
Aparna Chandramowlishwaran, Abhinav Kahru, Ketan Umare, Richard Vuduc
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

Why Should I Rewrite My Software When Dynamic Compilation Can Be Good Enough?
Nathan Clark
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

A Map Reduce Framework for Programming Graphics Processors
Bryan Catanzaro, Narayanan Sundaram and Kurt Keutzer
Third Workshop on Software Tools for MultiCore Systems (STMCS 2008)

A two-level Load/Store Queue based on Execution Locality
Miquel Pericas, Adrian Cristal, Ruben Gonzalez, Alex Veidenbaum, Daniel A. Jimenez and Mateo Valero
ISCA 2008

A Comprehensive Memory Modeling Tool and its Application to the Design and Analysis of Future Memory Hierarchies
Shyamkumar Thoziyoor, Jung Ho Ahn, Matteo Monchiero, Jay B. Brockman, and Norman P. Jouppi
ISCA 2008

Programming Multicore Systems Using Hierarchically Tiled Arrays [Slides]
Diego Andrade, James Brodman, Basilio B. Fraguela and David Padua
5th HiPEAC Industrial Workshop 2008

MAPS: An Integrated Framework for MPSoC Application Parallelization [Slides]
Weihua Sheng, Jeronimo Castrillon, Jianjiang Ceng, et al.
5th HiPEAC Industrial Workshop 2008

On-chip memories, the OS perspective [Slides]
Carlos Villavieja, Isaac Gelado, Alex Ramirez and Nacho Navarro
5th HiPEAC Industrial Workshop 2008

Memoizing Multi-Threaded Transactions
Lukasz Ziarek and Suresh Jagannathan
Declarative Aspects of Multicore Programming (DAMP) 2008

Toward a parallel implementation of Concurrent ML
John Reppy and Yinqi Xiao
Declarative Aspects of Multicore Programming (DAMP) 2008

Partial Vectorisation of Haskell Programs
Manuel Chakravarty, Roman Leshchinskiy, Simon Peyton Jones and Gabriele Keller
Declarative Aspects of Multicore Programming (DAMP) 2008

Intel 64 Architecture Memory Ordering [Whitepaper version]
Bratin Saha, Intel
Declarative Aspects of Multicore Programming (DAMP) 2008

PiPA: Pipelined Profiling and Analysis on Multi-core Systems
Qin Zhao, Ioana Cutcutache, Weng-Fai Wong
International Symposium on Code Generation and Optimization (CGO), 2008

Revisiting the Sequential Programming Model for the Multicore Era
Matthew J. Bridges, Neil Vachharajani, Yun Zhang, Thomas Jablin, and David I. August
IEEE Micro, January 2008

Adapting to Intermittent Faults in Multicore Systems
Philip M. Wells, Koushik Chakraborty, Gurindar S. Sohi
Architectural Support for Programming Languages and Operating Systems (ASPLOS ‘08)

Merge: A Programming Model for Heterogeneous Multi-core Systems
Michael D. Linderman, Jamison D. Collins, Hong Wang, Teresa H. Meng
Architectural Support for Programming Languages and Operating Systems (ASPLOS ‘08)

Streamware: Programming General-Purpose Multicore Processors Using Streams
Jayanth Gummaraju, Joel Coburn (Stanford), Yoshio Turner (HP Labs), Mendel Rosenblum
Architectural Support for Programming Languages and Operating Systems (ASPLOS ‘08)

Adaptive Set-Pinning: Managing Shared Caches in Chip Multiprocessors
Shekhar Srikantaiah, Mahmut Kandemir, Mary Jane Irwin
Architectural Support for Programming Languages and Operating Systems (ASPLOS ‘08)

Exploiting Access Semantics and Program Behavior to Reduce Snoop Power in Chip Multiprocessors
Chinnakrishnan Ballapuram, Ahmad Sharif, Hsien-Hsin S. Lee
Architectural Support for Programming Languages and Operating Systems (ASPLOS ‘08)

Amdahl’s Law in the Multicore Era [Keynote Speech] [version to appear in IEEE Computer 08]
Mark D. Hill, University of Wisconsin, Madison
International Symposium on High-Performance Computer Architecture (HPCA 08)

Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems
Jiang Lin, Qingda Lu, Xiaoning Ding, Zhao Zhang, Xiaodong Zhang, and P. Sadayappan
International Symposium on High-Performance Computer Architecture (HPCA 08)

Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories
Muthu Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J Ramanujam, Atanas Rountev, and P Sadayappan
Symposium on Principles and Practice of Parallel Programming (PPoPP 2008)

Scalable Packet Classification Using Interpreting — A Cross-platform Multi-core Solution
Haipeng Cheng, Zheng Chen, Bei Hua and Xinan Tang
Symposium on Principles and Practice of Parallel Programming (PPoPP 2008)

A Portable Runtime Interface For Multi-Level Memory Hierarchies
Mike Houston, Ji-Young Park, Manman Ren, et al.
Symposium on Principles and Practice of Parallel Programming (PPoPP 2008)

Application Optimization and Performance Evaluation of a Multithreaded GPU Using CUDA
Shane Ryoo, Christopher Rodrigues, Sara Baghsorkhi, Sam Stone, David Kirk and Wen-mei Hwu
Symposium on Principles and Practice of Parallel Programming (PPoPP 2008)

Cache-efficient Dynamic Programming Algorithms for Multicores
Rezaul Chowdhury and Vijaya Ramachandran
Symposium on Parallelism in Algorithms and Architectures (SPAA 2008)

Fundamental Parallel Algorithms for Private-Cache Chip Multiprocessors
Lars Arge, Michael T. Goodrich, Michael Nelson, and Nodari Sitchinava
Symposium on Parallelism in Algorithms and Architectures (SPAA 2008)

Utilizing Shared Data in Chip Multiprocessors with the Nahalal Architecture [Best Paper Award Winner]
Zvika Guz, Idit Keidar, Avinoam Kolodny, and Uri C. Weiser
Symposium on Parallelism in Algorithms and Architectures (SPAA 2008)

Optimal Speedup on a Low-Degree Multi-Core Parallel Architecture (LoPRAM)
Reza Dorrigiv, Alejandro Lopez-Ortiz, and Alejandro Salinger
Symposium on Parallelism in Algorithms and Architectures (SPAA 2008)

Extending Amdahl’s Law for Energy-Efficient Computing in Many-Core Era
Dong Hyuk Woo and Hsien-Hsin S. Lee.
To appear in IEEE Computer, 2008.

Intermediate Checkpointing with Conflicting Access Prediction in Transactional Memory Systems
M M Waliullah, Per Stenstrom
IPDPS 2008

DiCo-CMP: Efficient Cache Coherency in Tiled CMP Architectures
Alberto Ros, Manuel E. Acacio and José M. García
IPDPS 2008

HelperCoreDB: Exploiting Multicore Technology to Improve Database Performance
Kostas Papadopoulos, Kyriakos Stavrou, Pedro Trancoso
IPDPS 2008

Analysis of Double Buffering on two Different Multicore Architectures: Quad-core Opteron and the Cell-BE
Jose Sancho and Darren Kerbyson
IPDPS 2008

Scaling alltoall collective on multi-core systems [Slides]
Rahul Kumar, Amithrajith Mamidala, Dhabaleswar K. Panda
Workshop on Communication Architecture for Clusters 2008

A multithreaded communication engine for multicore architectures [Slides]
Francois Trahay, Elisabeth Brunet, Alexandre Denis, Raymond Namyst
Workshop on Communication Architecture for Clusters 2008

Scalable Directory Organization for Tiled CMP Architectures
Alberto Ros, Manuel E. Acacio and José M. García
Int’l Conference on Computer Design (CDES), Las Vegas (USA), July 2008

Efficient Architectural Design Space Exploration via Predictive Modeling
E. Ipek, S.A. McKee, K. Singh, R. Caruana, B.R. de Supinski, M. Schulz
ACM Transactions on Architecture and Code Optimization
