Software Challenges in the Multicore Era (9:15 - 10:45 AM)
Software for High Performance Embedded Multiprocessors
Kenneth Mackenzie (Reservoir)
Chip multiprocessors (CMPs) are displacing application-specific integrated circuits (ASICs) in domains where designers must balance very high performance/power requirements against flexibility and programmer productivity requirements. Software development for such applications is challenging because its performance/power is compared against that of purpose-built ASICs. A key problem for the software is that such CMPs are built with relatively little memory per core in order to devote as much chip area as possible to computation. In this talk, I will describe experiences Reservoir Labs has had developing software and tools in such domains (e.g. radar, networking, physics modeling). In particular, Reservoir is developing R-Stream, a parallelizing C compiler that uses a "polyhedral" representation of programs and architectures, which allows it to explicitly model memory-space and other constraints as part of the mapping problem.
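R-Stream's internals are not shown here, but a minimal sketch can illustrate the kind of loop transformation a polyhedral mapper performs when fitting a computation into a small per-core memory: tiling an iteration space into blocks sized to a memory budget. The function names, the tile size, and the kernel itself are illustrative, not taken from R-Stream:

```c
#include <assert.h>

#define N    64
#define TILE 16  /* illustrative tile size, chosen to fit an assumed local-memory budget */

/* Reference version: a plain dense loop nest, y += A * x. */
static void matvec(const double a[N][N], const double x[N], double y[N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            y[i] += a[i][j] * x[j];
}

/* Tiled version: the same iteration space, partitioned into TILE x TILE
 * blocks. A polyhedral compiler derives such tilings automatically, with
 * tile sizes driven by the modeled memory constraints; here the tiling is
 * written by hand for illustration. The set of (i, j) iterations executed
 * is identical, only the order of the blocks changes. */
static void matvec_tiled(const double a[N][N], const double x[N], double y[N]) {
    for (int ii = 0; ii < N; ii += TILE)
        for (int jj = 0; jj < N; jj += TILE)
            for (int i = ii; i < ii + TILE; i++)
                for (int j = jj; j < jj + TILE; j++)
                    y[i] += a[i][j] * x[j];
}
```

Because each TILE x TILE block touches only a bounded working set of `a`, `x`, and `y`, the block can be staged into a core's local memory, which is the essence of mapping under an explicit space constraint.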
Ken Mackenzie is a Consulting Engineer at Reservoir Labs, Inc. Ken joined Reservoir Labs in 2003 after five years as an Assistant Professor in the College of Computing at Georgia Tech. He received a B.S. in Electrical Engineering and an M.S. and Ph.D. in Electrical Engineering and Computer Science from MIT in 1990, 1990 and 1998. As a graduate student, Ken worked as a member of the Alewife project, which built a novel, large-scale, distributed shared-memory multiprocessor. His Ph.D. thesis work was to develop a fast, protected messaging system for Fugu, a machine that extended Alewife with multitasking, virtual memory and an Exokernel operating system. At Georgia Tech, Ken was granted an NSF CAREER award in 1999 for a project in which he and his students developed a form of software caching using dynamic binary rewriting targeted at embedded systems. At Reservoir Labs, Ken has worked as a consultant on customer projects involving network processors and a custom supercomputer for a molecular dynamics application, as well as serving as PI for a DOE-funded, high-speed network intrusion detection project.
Doing real science on a Petaflop MPP
John Levesque (Cray)
This talk will discuss how several organizations have been able to perform "break-through" science on the Cray XT system. With the advent of the XT architecture, many researchers have scaled their applications to 100s of sustained Teraflops of computation and have been able to perform much more accurate simulations, improving understanding of the science being investigated.
Scaling to 10s of thousands of processors can be difficult; however, a surprisingly large number of applications have scaled without much work. This talk will discuss difficulties that needed to be addressed to enable these scientific break-throughs. Many of these results were achieved through a coordinated effort between the scientific researchers and experts from Cray's Supercomputing Center of Excellence, directed by the speaker, John Levesque.
John Levesque is the director of the Cray Supercomputing Center of Excellence at the Department of Energy's Oak Ridge National Laboratory (ORNL), home of the world's most powerful supercomputer for open (non-classified) scientific research. Levesque leads a team of engineers providing application and high performance computing expertise to researchers using ORNL systems. The Center of Excellence offers scientists and engineers from around the globe access to deep HPC application performance knowledge, allowing them to focus on their science or engineering challenge. Levesque also works closely with ORNL staff members to optimize application performance on ORNL's Cray X1E and XT3 supercomputers. Through this work, ORNL is running the Parallel Ocean Program (POP) at speeds up to 1.5 times faster than it runs on the Earth Simulator in Japan.
From 2001 to 2003, Levesque was a senior principal engineer responsible for the benchmarking initiatives for the Cray X1 system, playing a key role in helping CINECA, Italy's national supercomputer center, and the Department of Defense Modernization Program optimize their applications. Prior to Cray, Levesque was the director of the Advanced Computer Technology Center at IBM's Watson Research Center, where his team received the Scientific Achievement Award for contributing more than $200 million in new business sales in 2000.
With more than 35 years' experience in high performance computing, Levesque is a recognized expert in the optimization of FORTRAN programs and parallel vector system architectures. He has been involved in the Advanced Strategic Computing Initiative at the US Department of Energy and the National Academy of Sciences, and is a regular contributor to the Cray User Conference. Levesque has also contributed to numerous publications on the optimization of programs for advanced scientific computers, including a collection of papers entitled "Concurrency and Computations" in 2003 and a book entitled "Guidebook to FORTRAN on Supercomputers" published in 1989. Levesque has a Bachelor of Arts and Master of Arts in Mathematics from the University of New Mexico in Albuquerque, New Mexico.
Blue Gene System Software Design and Implementation
Robert W. Wisniewski (IBM)
Four years ago, Blue Gene made a significant impact by introducing an ultra-scalable computer with a focus on low power. Since then, BG/L has held the number 1 spot on the Top500 for 7 lists in a row. On the recent November 2007 list, the number 2 spot was taken by BG/P, the next line of Blue Gene computers. Blue Gene/P is designed to scale to several PetaFlops at 256 racks, with 256Ki nodes and 1Mi processor cores. There are unique challenges to designing software to run at this scale. In this talk I will describe the software system stack that runs on the Blue Gene/P machines and the motivation behind the design. I will focus on the strategies the team has used to get software to run in this ultrascale environment. I will also spend a little time describing some of the multi-core challenges we will face in the next generation.
Robert W. Wisniewski received his PhD from the University of Rochester. Prior to coming to IBM Research, he worked at SGI on operating system design and bring-up for their high-end Origin servers, as well as on real-time performance on parallel machines. He started at IBM Research working on the K42 project, a research effort aimed at designing, from the ground up, a scalable, customizable operating system for machines ranging from small parallel systems to the large-scale machines used in scientific computing. As part of the K42 effort, he made contributions to Linux including LTT (the Linux Trace Toolkit) and relayfs. He was involved in Phase I and II of the IBM PERCS DARPA HPCS project. For Phase II of PERCS he worked on CPO (Continuous Program Optimization), which is aimed at using vertically integrated performance data to automatically improve the performance of both applications and the underlying system. Following his work on PERCS, he contributed to the CSO (Commercial Scale Out) project in the area of performance understanding. He is currently a research scientist and manager of the Blue Gene Software Team at IBM Research. His research interests include scalable parallel systems, first-class system customization, performance monitoring, and using performance monitoring information for continuous program and system optimization.