Paul Schreier reports from the International Supercomputing Conference in Hamburg, Germany
One topic that has attracted considerable attention at ISC 2011 in Hamburg has been the role and future of GPUs in HPC. A session entitled 'Heterogeneous Systems & Their Challenges to HPC Systems' led off with a talk by Steve Scott, CTO at Cray. He pointed out two major issues: structure and programming. These issues have been echoed all week in other sessions and presentations. As for the first, he feels that structural problems related to bandwidth and synchronisation time are actually short-lived issues, because both types of cores are being integrated on one piece of silicon, as illustrated by Nvidia's Project Denver and AMD's Fusion. On the programming side, things he finds 'more worrisome' include the need to learn a new language and programming model, to maintain two code bases, and to tune for a complex architecture.
'We need a portable programming model… and a directive-based approach makes sense.' He noted that Cray is working on a compiler for GPUs using a directive approach, and he claims that hybrid code scales better than pure MPI code. Many of his opinions stem from the fact that Cray has just announced the XK6, which includes a GPU blade. In his long-term prognosis, Scott believes that 'we'll eventually stop talking about accelerated computing; rather, that's just the way it will be.' He further believes that structural issues will go away, and programming environments for multicore systems, including those with both GPUs and CPUs, will be much improved.
At an Nvidia meeting, Sumit Gupta claimed that the biggest hurdle to the adoption of GPUs is the 'misconception that GPU computing is hard.' He also listed what he considers the three myths: first, that you must port an entire application to a GPU; second, that it is really hard to accelerate an application; and third, that there is a PCI bottleneck. He then stated: 'I don't think of the GPU as an accelerator; I think of the CPU as a decelerator.' The company's roadmap includes increasing performance per watt (instead of performance alone), making parallel programming easier and running more of an application on a GPU.
Later at a 'GPU Debate', Nvidia senior fellow David Kirk used much of the same material in his introductory slides, adding that 'Cuda is the fastest growing parallel programming development environment in the history of computing' and that 'GPUs are "the next X86" and not "the next accelerator."' When asked to forecast when to expect an exascale system, he responded that he could see one this decade, but that it will include 75,000 GPUs in 600 racks and consume 20 MW. His opponent in the debate, Thomas Sterling of Louisiana State University, foresees a 'stunt' exascale system, perhaps in 2018 to 2020, that will consume 200 MW, plus or minus 50 per cent. A 'for real' machine with practical applications won't be here until 2022 to 2023 and will require 100 MW. He concluded that 'supercomputing cannot be limited by a widget dedicated to a specific task.'