2010 2009 2008 2007 2006 2005 2004 2003 2002 2001
2000 1999 1998 1997 1996 Prior to 1995 Whitepapers
Papers listed here are either freely available on the web or obtained legally. Please respect the various copyright stipulations placed on these documents. If any author would like us to add or remove their paper, please contact us at info@multicoreinfo.com.
Multicore Papers 2010
HPCA 2010
Operating System Support for Overlapping-ISA Heterogeneous Multi-Core Architectures
Tong Li, Paul Brett, Rob Knauerhase, David Koufaty, et al.
High-Performance Computer Architecture, (HPCA-16) 2010
ATLAS: A Scalable and High Performance Scheduling Algorithm for Multiple Memory Controllers
Yoongu Kim, Dongsu Han, Onur Mutlu, Mor Harchol-Balter
High-Performance Computer Architecture, (HPCA-16) 2010
Understanding How Off-Chip Memory Bandwidth Partitioning in Chip Multiprocessors Affects System Performance
Fang Liu, Xiaowei Jiang, and Yan Solihin
High-Performance Computer Architecture, (HPCA-16) 2010
CHOP: Adaptive Filter-Based DRAM Caching for CMP Server Platforms
Xiaowei Jiang, Niti Madan, Li Zhao, Mike Upton, Ravishankar Iyer, Srihari Makineni, Donald Newell, Yan Solihin and Rajeev Balasubramonian
High-Performance Computer Architecture, (HPCA-16) 2010
LeadOut: Composing Low-Overhead Frequency-Enhancing Techniques for Single Thread Performance in Configurable Multicores
Brian Greskamp, R. Ulya Karpuzcu, Josep Torrellas
High-Performance Computer Architecture, (HPCA-16) 2010
LiteTM: Reducing Transactional State Overhead
Syed Ali Raza Jafri, Mithuna Thottethodi, T. N. Vijaykumar
High-Performance Computer Architecture, (HPCA-16) 2010
A Bandwidth-Aware Memory Subsystem Resource Management Using Non-Invasive Resource Profilers for Large CMP Systems
Dimitris Kaseridis, Jeffrey Stuecheli, Lizy K. John
High-Performance Computer Architecture, (HPCA-16) 2010
StimulusCache: Boosting Performance of Chip Multiprocessors with Excess Cache
Hyunjin Lee, Sangyeun Cho, and Bruce R. Childers
High-Performance Computer Architecture, (HPCA-16) 2010
ESP-NUCA: A Low-Cost Adaptive Non-Uniform Cache Architecture
Javier Merino, Valentin Puente, Jose-Angel Gregorio
High-Performance Computer Architecture, (HPCA-16) 2010
Towards Scalable, Energy-Efficient Bus-Based On-Chip Networks
Aniruddha N. Udipi, Naveen Muralimanohar, Rajeev Balasubramonian
High-Performance Computer Architecture, (HPCA-16) 2010
DMA Cache: Using On-Chip Storage to Architecturally Separate I/O Data from CPU Data for Improving I/O Performance
Dan Tang, Yungang Bao, Weiwu Hu, Mingyu Chen
High-Performance Computer Architecture, (HPCA-16) 2010
Graphite: A Distributed Parallel Simulator for Large-Scale Multicores
Jason Miller, Harshad Kasture, George Kurian, et al.
High-Performance Computer Architecture, (HPCA-16) 2010
Application Performance Modeling in a Virtualized Environment
Sajib Kundu, Raju Rangaswami, Kaushik Dutta, Ming Zhao
High-Performance Computer Architecture, (HPCA-16) 2010
COMIC++: A Software SVM System for Heterogeneous Multicore Accelerator Clusters
Jaejin Lee, Jun Lee, Sangmin Seo, Jungwon Kim, Seungkyun Kim
High-Performance Computer Architecture, (HPCA-16) 2010
BOLT: An Energy-Efficient Latency-Tolerant Processor
Andrew Hilton, Amir Roth
High-Performance Computer Architecture, (HPCA-16) 2010
An Optimized 3D-Stacked Memory Architecture by Exploiting Excessive, High-Density TSV Bandwidth
Dong Hyuk Woo, Nak Hee Seong, Dean L. Lewis, and Hsien-Hsin S. Lee
High-Performance Computer Architecture, (HPCA-16) 2010
PPoPP 2010
Structure-driven Optimizations for Amorphous Data-parallel Programs
Mario Mendez-Lojo, Donald Nguyen, Dimitrios Prountzos, Xin Sui, M. Amber Hassaan, Milind Kulkarni, Martin Burtscher and Keshav Pingali
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Debugging Programs that use Atomic Blocks and Transactional Memory
Ferad Zyulkyarov, Tim Harris, Osman S. Unsal, Adrián Cristal, Mateo Valero
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Gambit: Effective Unit Testing of Concurrency Libraries
Katherine Coons, Sebastian Burckhardt and Madanlal Musuvathi
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Featherweight X10: a Core Calculus for Async-Finish Parallelism
Jonathan Lee and Jens Palsberg
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Compiler Aided Selective Lock Assignment for Improving the Performance of Software Transactional Memory
Sandya Mannarswamy, Dhruva Chakrabarti, Kaushik Rajan and Sujoy Saraswati
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Is Transactional Programming Really Easier?
Christopher Rossbach, Owen Hofmann and Emmett Witchel
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Scheduling Support for Transactional Memory Contention Management
Walther Maldonado, Patrick Marlier, Pascal Felber, Julia Lawall, Gilles Muller, Adi Suissa, Danny Hendler and Alexandra Fedorova
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
An Adaptive Performance Modeling Tool for GPU Architectures
Sara Baghsorkhi, Matthieu Delahaye, Sanjay Patel, William Gropp and Wen-mei Hwu
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Model-driven Autotuning of Sparse Matrix-Vector Multiply on GPUs
Jee Choi, Amik Singh and Richard Vuduc
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Fast Tridiagonal Solvers on GPU
Yao Zhang, Jonathan Cohen and John Owens
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
CUDAlign: Using GPU to Accelerate the Comparison of Megabase Genomic Sequences
Edans Flávius de O. Sandes and Alba Cristina M. A. Melo
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Load Balancing on Speed
Steven Hofmeyr, Costin Iancu and Filip Blagojevic
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Scalable Communication Protocols for Dynamic Sparse Data Exchange
Torsten Hoefler, Christian Siebert and Andrew Lumsdaine
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
The LOFAR Correlator: Implementation and Performance Analysis
John W. Romein, P. Chris Broekema, Jan David Mol and Rob V. van Nieuwpoort
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Lazy Binary-Splitting: A Run-Time Adaptive Work-Stealing Scheduler
Alexandros Tzannes, George C. Caragea, Rajeev Barua and Uzi Vishkin
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Thread to Strand Binding of Parallel Network Applications in Massive Multi-Threaded Systems [Author website]
Petar Radojkovic, Vladimir Cakarevic, Javier Verdu, Alex Pajuelo, Francisco J. Cazorla, et al.
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Does Cache Sharing on Modern CMP Matter to the Performance of Contemporary Multithreaded Programs?
Eddy Z. Zhang, Yunlian Jiang and Xipeng Shen
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Improving Parallelism and Locality with Asynchronous Algorithms [Presentation Slides]
Lixia Liu and Zhiyuan Li
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Scaling LAPACK Panel Operations Using Parallel Cache Assignment
Anthony M. Castaldo and R. Clint Whaley
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Modeling Advanced Collective Communication Algorithms on Cell-based Systems
Qasim Ali, Samuel Midkiff and Vijay Pai
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Phantom: Predicting Performance of Parallel Applications on Large-Scale Parallel Machines Using a Single Node
Jidong Zhai, Wenguang Chen and Weimin Zheng
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
Input-Driven Dynamic Execution Behavior Prediction of Streaming Applications
Farhana Aleen, Monirul Sharif and Santosh Pande
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010)
hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications
François Broquedis, Jérôme Clet-Ortega, Stéphanie Moreaud, et al.
18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010)
Efficient Parallel Programming in Poly/ML and Isabelle/ML
David C. J. Matthews and Makarius Wenzel
ACM SIGPLAN Workshop on Declarative Aspects of Multicore Programming (DAMP 2010)
COMPASS: A Programmable Data Prefetcher Using Idle GPU Shaders
Dong Hyuk Woo and Hsien-Hsin S. Lee
Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2010

