As the HPC industry descended on Dallas, Texas for this year’s instalment of the US supercomputing conference, SC18, there was significant enthusiasm for the future of the industry, buoyed by the rising use of AI.
‘From our volunteers to our exhibitors to our students and attendees – SC18 was inspirational,’ said SC18 general chair Ralph McEldowney. ‘Whether it was in technical sessions or on the exhibit floor, SC18 inspired people with the best in research, technology, and information sharing.’
SC18 highlighted progress in HPC, the rise of AI and several new HPC technologies on display during the exhibition. AMD’s 2018 resurgence continued at SC18, with the company announcing new CPU and GPU products alongside a number of contract wins for large-scale HPC systems. The US strengthened its position in the Top500 after regaining the top spot in June this year. And finally, the Student Cluster Competition saw China’s Tsinghua University take the top spot in the overall category in Dallas.
Chinese students take top honours
The Student Cluster Competition (SCC) was first held in 2007 to provide high-performance computing experience to undergraduate and high school students. Today each student team comprises six students and at least one advisor. Hardware is donated by vendor partners, and the student teams design and build clusters, which need to be configured and optimised to run specific codes across the 48-hour competition.
This year Tsinghua University of China took the top spot overall. The Linpack crown went to Nanyang Technological University of Singapore with a score of 56.51 teraflops, while Tsinghua University also won the HPCG benchmark with a score of 1,985.97 gigaflops.
As the competition limits teams to 3,000 watts of power for their cluster, most opt for GPU technology. Tsinghua University, for example, used eight Nvidia V100 GPUs in its competition entry.
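As a rough illustration of the arithmetic behind that choice, the minimal Python sketch below works through a 3,000-watt budget. The TDP figures are nominal published values, and the two-CPU, eight-GPU layout is a hypothetical example rather than any team’s actual configuration:

# Back-of-the-envelope power budget for a student cluster entry.
# TDP figures are nominal; the node layout is a hypothetical example.
POWER_CAP_W = 3000        # competition power limit

V100_TDP_W = 300          # Nvidia V100 (SXM2) board power
CPU_TDP_W = 150           # assumed server CPU TDP

gpus, cpus = 8, 2
gpu_draw = gpus * V100_TDP_W                  # 2,400 W
cpu_draw = cpus * CPU_TDP_W                   # 300 W
headroom = POWER_CAP_W - gpu_draw - cpu_draw  # for memory, storage, network

print(f'GPUs: {gpu_draw} W, CPUs: {cpu_draw} W, headroom: {headroom} W')

# The appeal of GPUs is flops per watt: Nanyang's 56.51-teraflops
# Linpack run under the same 3,000 W cap works out to roughly:
print(f'{56.51e12 / POWER_CAP_W / 1e9:.1f} gigaflops per watt')

On those assumed numbers the accelerators consume the bulk of the budget, leaving only a few hundred watts for everything else in the chassis.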
In a blog post from Nvidia, Bu-sung Lee, team leader and faculty adviser at Nanyang Technological University, stated: ‘If you don’t have GPUs, best of luck. It’s essential.’
However, it was not all new technology at SC this year. There was also reflection on the past, as this year marks the 30th anniversary of the annual international conference for high performance computing, networking, storage and analysis, celebrating the contributions of the researchers, scientists and HPC users who have helped to advance the industry over the last 30 years.
New technology on the show floor
AMD has had a particularly strong 12 months, growing from plucky outsider to an emerging force within the HPC industry. The launch of the Epyc processors gave the company a good start, and alongside new products at SC18 it also announced several new contracts for AMD-based HPC systems.
These wins lay the groundwork for stronger adoption of AMD in the HPC ecosystem, as the new systems offer real benchmarking opportunities that give potential users an idea of how the hardware performs at scale in real production environments.
AMD announced that the US Department of Energy’s NERSC, Cray, Haas F1 Racing and the High-Performance Computing Center of the University of Stuttgart (HLRS) will all be using AMD Epyc processors in their upcoming systems. AMD also pushed its technologies into the cloud with announcements around the deployment of Epyc processors in Microsoft Azure.
‘It’s been a fantastic year in the supercomputing space as we further expanded the ecosystem for AMD Epyc processors while securing multiple wins that leverage the benefits AMD Epyc processors have on HPC workloads,’ said Mark Papermaster, senior vice president and chief technology officer at AMD. ‘As the HPC industry approaches exascale systems, we’re at the beginning of a new era of heterogeneous computing that requires a combination of CPU, GPU and software that only AMD can deliver. We’re excited to have fantastic customers leading the charge with our Radeon Instinct accelerators, AMD Epyc processors and the ROCm open software platform.’
Cray announced a new computing platform called Shasta, which is set to replace its XC50 systems. Shasta is an entirely new design aimed at exascale performance, data-centric workloads and an increased diversity of processor architectures. The National Energy Research Scientific Computing Center (NERSC) announced just ahead of SC18 that it has chosen a Cray Shasta supercomputer for its NERSC-9 system, named ‘Perlmutter’, to be delivered in 2020. The contract is valued at $146 million, one of the largest in Cray’s history, and will feature a 32-cabinet Shasta system.
Cray seems to acknowledge the shifting demands of HPC applications, as the new system is positioned to take advantage of the growing trend for a single system to handle converged modelling, simulation, AI and analytics workloads. The system allows users to mix and match processor architectures (x86, Arm, GPUs) in the same system, as well as choose system interconnects from Cray (Slingshot), Intel (Omni-Path) or Mellanox (InfiniBand).
‘Our scientists gather massive amounts of data from scientific instruments like telescopes and detectors that our supercomputers analyse every day,’ said Dr Sudip Dosanjh, director of the NERSC Center at Lawrence Berkeley National Laboratory. ‘The Shasta system’s ease of use and adaptability to modern workflows and applications will allow us to broaden access to supercomputing and enable a whole new pool of users. The ability to bring this data into the supercomputer will allow us to quickly and efficiently scale and reduce overall time to discovery.’
‘Cray is widely seen as one of only a few HPC vendors worldwide that is capable of aggressive technology innovation at the system architecture level,’ said Steve Conway, Hyperion Research senior vice president of research. ‘Cray’s Shasta architecture closely matches the wish list that leading HPC users have for the exascale era, but didn’t expect to be available this soon. This is truly a breakthrough achievement.’
Top500 shows US gathering pace at the top
The newest release of the Top500 list of the fastest supercomputers also took place at SC. While the US DOE’s Summit system at Oak Ridge National Laboratory (ORNL) remained in first place, there was significant movement in the top ten. Another DOE system, Sierra, at Lawrence Livermore National Laboratory (LLNL), took second place; however, the overall list showed that China is still edging out the US in the total number of systems that made the list.
Summit widened its lead as the number one system, improving its High Performance Linpack (HPL) performance from 122.3 to 143.5 Pflops since its debut in June 2018. Sierra also added to its HPL results from the previous list. The system increased its score from 71.6 to 94.6 Pflops, enough to take the number two position. Both are IBM-built supercomputers, powered by Power9 CPUs and Nvidia V100 GPUs.
Sierra’s ascendance pushed China’s Sunway TaihuLight supercomputer, installed at the National Supercomputing Center in Wuxi, into third place. Prior to last June, it had held the top position on the Top500 list for two years with its HPL performance of 93.0 petaflops. TaihuLight was developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC).
The share of Top500 installations in China continues to rise, with the country now claiming 227 systems (45 per cent of the total). The number of supercomputers that call the US home continues to decline, reaching an all-time low of 109 (22 per cent of the total). However, systems in the US are, on average, more powerful: they account for 38 per cent of the list’s aggregate performance, compared to 31 per cent for China.
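A quick calculation, using only the counts and shares quoted above, shows how large the per-system gap is. This is a rough Python sketch, not official Top500 methodology:

# Average per-system performance relative to the list-wide mean,
# derived from the system counts and performance shares quoted above.
us_systems, cn_systems, total = 109, 227, 500
us_perf_share, cn_perf_share = 0.38, 0.31   # share of aggregate performance

us_avg = us_perf_share / (us_systems / total)   # ~1.74x the list average
cn_avg = cn_perf_share / (cn_systems / total)   # ~0.68x the list average

print(f'Average US system: {us_avg:.2f}x the list-wide mean')
print(f'Average Chinese system: {cn_avg:.2f}x the list-wide mean')
print(f'US systems average roughly {us_avg / cn_avg:.1f}x the performance of Chinese ones')

In other words, on these figures the average US entry delivers roughly two and a half times the performance of the average Chinese entry.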