Keynote Speaker 1: John Shalf, Lawrence Berkeley National Laboratory
Title: Exascale Interconnect Architecture: emerging developments and opportunities for PGAS
Date: September 17th
Until approximately 2004, a single model of parallel programming (bulk-synchronous execution over the Message Passing Interface, MPI 1.x, with two-sided messaging semantics) was usually sufficient for translating complex applications into reasonable parallel programs. However, the end of Dennard scaling in 2004 ended the historical exponential gains in per-processor clock rates and has given way to exponential increases in explicit parallelism for the foreseeable future. Furthermore, there is increasing evidence that merely doubling the number of existing cores per generation is insufficient to meet future performance and energy targets. Rather, HPC architectures must move towards even simpler throughput-oriented cores as the primary HPC workhorse. This is consistent with recent pre-exascale procurements such as APEX (selecting an Intel KNL lightweight core architecture) and CORAL (selecting a GPU+CPU architecture and another Intel lightweight core architecture). This poses challenges to our current message passing approach. Simpler lightweight processor cores struggle to process complex messaging protocol stacks and to handle the more complicated MPI 1.x tag-matching semantics. The traditional MPI+X model, where one MPI thread serves many threads, presents a first-order Amdahl serialization bottleneck when thread counts are growing exponentially (not to mention the challenge of message endpoint indeterminacy). The move to explicit parallelism for future performance increases puts more pressure on strong scaling and, consequently, on smaller message sizes. Furthermore, the reduction in the size of available high-bandwidth memory at node endpoints (typically low-capacity in-package memory) further forces codes towards finer-grained messaging, which puts more pressure on reducing the overhead of sending small messages. All of these trends run counter to traditional scaling laws for interconnect technologies and the messaging libraries that go with them.
This talk will walk through the coming architectural challenges and review emerging responses to these challenges from the MPI standards and PGAS communities. There is broad support for extending communication based on virtual memory addresses across multi-node systems, because it is architecturally simpler to implement. In a broad sense, PGAS is well positioned to play in this space, but there are many different choices available in terms of semantics for memory consistency. It is imperative that we settle on consistent semantics to guide the future development of hardware and the evolution of an efficient and performant messaging model for future computing systems.
John Shalf is Chief Technology Officer at NERSC. His background is in electrical engineering: he spent time in graduate school at Virginia Tech working on a C compiler for the SPLASH-2 FPGA-based computing system, and at Spatial Positioning Systems Inc. (now ArcSecond) he worked on embedded computer systems. John first got started in HPC at the National Center for Supercomputing Applications (NCSA) in 1994, where he provided software engineering support for a number of scientific applications groups. While working for the General Relativity Group at the Albert Einstein Institute in Potsdam, Germany, he helped develop the first implementation of the Cactus Computational Toolkit, which is used for numerical solutions to Einstein's equations for General Relativity and which enables modeling of black holes, neutron stars, and boson stars. He also developed the I/O infrastructure for Cactus, including FlexIO, a high-performance self-describing file format for storing Adaptive Mesh Refinement data. John joined Berkeley Lab in 2000 and has worked in the Visualization Group, on the RAGE robot, which won an R&D100 Award in 2001, and led international high performance networking teams to win two consecutive SciNET bandwidth challenges in 2001-2002. He is a member of the DOE Exascale Steering committee, and is a co-author of the landmark "View from Berkeley" paper as well as the DARPA Exascale Software Report. He currently leads the NERSC Advanced Technology Group (ATG), which leads projects in exascale technology research such as CoDEx (CoDesign for Exascale) and the LBNL Green Flash project (http://www.lbl.gov/cs/html/greenflash.html), which seeks to develop energy-efficient scientific computing systems using manycore and embedded technologies.
Keynote Speaker 2: Vivek Sarkar, Rice University
Title: The Role of Global Address/Name Spaces in Extreme Scale Computing and Analytics
Date: September 18th
It is widely recognized that radical changes are forthcoming in extreme scale systems for future scientific and commercial computing. In less than a decade, we will have exascale systems that contain billions of processor cores/accelerators, with performance driven by parallelism, and constrained by energy and data movement, while also being subject to frequent faults and failures. Unlike previous generations of hardware evolution, these extreme scale systems will have a profound impact on the software stack underlying future applications and algorithms. The software challenges are further compounded by the addition of new application domains that include, most notably, data-intensive computing and analytics, which represent new frontiers of innovation in both scientific and commercial computing.
The challenges across the entire software stack for extreme scale systems are driven by programmability, portability and performance requirements, and impose new constraints on programming models, languages, compilers, runtime systems, and system software. In this talk, we will focus on the role of global address spaces (as in PGAS systems) and global name spaces (as in Map-Reduce key-value pairs) in enabling future applications on future hardware. Examples will be drawn from recent research experiences in the Habanero Extreme Scale Software Research project at Rice University [1], including the Habanero-C++ and Habanero-Java programming models for scientific and commercial software respectively. Background material will also be drawn from the Open Community Runtime (OCR) system [2] being developed in the DOE X-Stack program, the DARPA Exascale Software Study report [3], and the DOE ASCAC study on Synergistic Challenges in Data-Intensive Science and Exascale Computing [4]. We would like to acknowledge the contributions of all participants in the Habanero project, the OCR project, and the DARPA and DOE studies.
[1] Habanero Extreme Scale Software Research project. http://habanero.rice.edu.
[2] Open Community Runtime project.
[3] DARPA Exascale Software Study report, September 2009. http://users.ece.gatech.edu/~mrichard/ExascaleComputingStudyReports/ECS_reports.htm.
[4] DOE report on Synergistic Challenges in Data-Intensive Science and Exascale Computing, March 2013.
Vivek Sarkar is Professor and Chair of Computer Science at Rice University. He conducts research in multiple aspects of parallel software including programming languages, program analysis, compiler optimizations and runtimes for parallel and high performance computer systems. He currently leads the Habanero Extreme Scale Software Research Laboratory at Rice University, and serves as Associate Director of the NSF Expeditions Center for Domain-Specific Computing and PI of the DARPA-funded Pliny project on "big code" analytics. Prior to joining Rice in July 2007, Vivek was Senior Manager of Programming Technologies at IBM Research. His responsibilities at IBM included leading IBM’s research efforts in programming models, tools, and productivity in the PERCS project during 2002-2007 as part of the DARPA High Productivity Computing Systems program. His prior research projects include the X10 programming language, the Jikes Research Virtual Machine for the Java language, the ASTI optimizer used in IBM’s XL Fortran product compilers, the PTRAN automatic parallelization system, and profile-directed partitioning and scheduling of Sisal programs. In 1997, he was on sabbatical as a visiting associate professor at MIT, where he was a founding member of the MIT Raw multicore project. Vivek became a member of the IBM Academy of Technology in 1995 and the E.D. Butcher Chair in Engineering at Rice University in 2007, and was inducted as an ACM Fellow in 2008. He holds a B.Tech. degree from the Indian Institute of Technology, Kanpur, an M.S. degree from the University of Wisconsin-Madison, and a Ph.D. from Stanford University. Vivek has been serving as a member of the US Department of Energy’s Advanced Scientific Computing Advisory Committee (ASCAC) since 2009.