With the growth of AI and DL comes new opportunities for emerging applications, finds Robert Roe
As artificial intelligence (AI) and deep learning (DL) technologies mature, there are increasing numbers of applications available to scientists and researchers who are adopting these methodologies to increase research output.
In addition to these emerging applications, the accelerator technologies developed for AI and machine learning are now finding new uses in more traditional HPC and scientific computing. Nvidia has recently announced a collaboration with biopharmaceutical company AstraZeneca and the University of Florida’s academic health centre, UF Health, on new AI research projects using transformer neural networks.
Transformer-based neural network architectures – which have become available only in the last several years – allow researchers to leverage massive datasets using self-supervised training methods, avoiding the need for manually labelled examples during pre-training. These models, equally adept at learning the syntactic rules to describe chemistry as they are at learning the grammar of languages, are finding applications across research domains and modalities.
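To make the idea concrete, the sketch below pre-trains a tiny character-level transformer encoder on SMILES strings by masking tokens and asking the model to predict them back – self-supervision, with no manual labels. Everything in it (the three-molecule ‘dataset’, vocabulary and model sizes) is illustrative only, not any of the collaborators’ actual models.

```python
# Minimal sketch of self-supervised (masked-token) pre-training on SMILES
# strings. All sizes and data are toy values for illustration only.
import torch
import torch.nn as nn

smiles = ["CCO", "c1ccccc1", "CC(=O)O"]           # toy "dataset"
vocab = sorted({ch for s in smiles for ch in s})  # character-level vocabulary
stoi = {ch: i + 2 for i, ch in enumerate(vocab)}  # reserve 0 = PAD, 1 = MASK
PAD, MASK = 0, 1
V = len(vocab) + 2

def encode(s, max_len=16):
    ids = [stoi[ch] for ch in s][:max_len]
    return ids + [PAD] * (max_len - len(ids))

class TinyEncoder(nn.Module):
    def __init__(self, d=64, heads=4, layers=2):
        super().__init__()
        self.emb = nn.Embedding(V, d)
        block = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.enc = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(d, V)               # predicts the masked token

    def forward(self, x):
        return self.head(self.enc(self.emb(x)))

model = TinyEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.tensor([encode(s) for s in smiles])

for step in range(100):                           # self-supervised: no labels
    masked = x.clone()
    mask = (torch.rand(x.shape) < 0.15) & (x != PAD)
    if not mask.any():
        continue
    masked[mask] = MASK
    # loss is computed only at the masked positions
    loss = nn.functional.cross_entropy(model(masked)[mask], x[mask])
    opt.zero_grad(); loss.backward(); opt.step()
```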
Nvidia is collaborating with AstraZeneca on a transformer-based generative AI model for chemical structures used in drug discovery that will be among the very first projects to run on Cambridge-1, which is soon to go online as the UK’s largest supercomputer.
The model will be open-sourced, available to researchers and developers in the Nvidia NGC software catalogue, and deployable in the Nvidia Clara Discovery platform for computational drug discovery.
Separately, UF Health is harnessing Nvidia’s state-of-the-art Megatron framework and BioMegatron pre-trained model – available on NGC – to develop GatorTron, the largest clinical language model to date.
New NGC applications include AtacWorks, a deep learning model that identifies accessible regions of DNA, and MELD, a tool for inferring the structure of biomolecules from sparse, ambiguous or noisy data.
This is just one example highlighting the success of Nvidia’s drive to capture the AI and DL markets. So far the company has been incredibly successful, but there is mounting pressure from other accelerator technology providers. One such challenger is Graphcore, the UK-based company developing its own brand of general-purpose accelerators known as intelligence processing units (IPUs).
Graphcore released the second generation of its IPU products in 2020 and is quickly gaining pace, with promising benchmark results in both ML and scientific computing.
Graphcore’s website details several of these, notably in drug discovery and the life sciences, where the IPU has already been deployed for a range of applications.
In BERT-BASE training, the IPU achieved a 25 per cent faster training time at 20 per cent lower power, meaning the model trains faster at a lower cost. BERT-BASE inference benchmarked against an Nvidia V100 GPU showed that the IPU provides twice the throughput, making it possible to use BERT in an interactive setting where scalability is a priority. Graphcore also publishes results for EfficientNet-B0 and Markov Chain Monte Carlo (MCMC) simulations.
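As a point of reference, throughput figures of this kind are usually produced by counting samples processed over a timed window after a warm-up phase. The sketch below shows the general shape of such a measurement; it is our illustration with a stand-in model, not Graphcore’s benchmark harness.

```python
# Illustrative throughput measurement with a stand-in model: warm up first,
# then count samples processed over a fixed timing window.
import time
import torch

model = torch.nn.Linear(128, 2).eval()   # stand-in for a BERT-style network
batch = torch.randn(32, 128)

with torch.no_grad():
    for _ in range(10):                  # warm-up, excluded from the timing
        model(batch)
    t0, n = time.perf_counter(), 0
    while time.perf_counter() - t0 < 2.0:
        model(batch)
        n += batch.shape[0]
print(f"{n / (time.perf_counter() - t0):.0f} samples/sec")
# Running the identical script on different devices gives comparable numbers.
```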
Matt Fyles, SVP Software at Graphcore, explains that while the first generation proved the use case for the technology and got it into the hands of developers, the second generation provides a boost in performance and scalability. ‘The first generation of IPU products was really to prove the IPU as a computing technology that could be applied to problems in the machine learning (ML) space primarily, but also as a general programming platform,’ said Fyles.
‘We started out with a view that we are making a general-purpose compute accelerator. Our initial focus is on ML and that is where a lot of the new capabilities that we have added can be deployed. But IPU is also equally applicable to problems in scientific computing and HPC. And while that was secondary to our plans to begin with, it was more about building out the ecosystem to support both aspects,’ states Fyles.
Fyles notes that getting IPUs into data centres, and into the hands of scientists, was the primary goal for the first generation. This has allowed Graphcore to begin engaging with developers and building an ecosystem of applications.
‘A lot of it was about bringing IPU to the world and getting people to understand how it worked and how we could move forward with it,’ states Fyles.
The second generation of IPU technology drives performance and allows users to scale to much higher levels than was previously possible. Scalability is something Graphcore has spent a lot of time on, ensuring that both software and hardware can make use of the additional resources.
‘We have now got this scalable system that is built around our M2000 platform, which is a 1U server blade containing four IPUs supported by the Poplar software stack. We can build a POD16, as we call it, which is four of our M2000s connected together – but we can also scale it up a lot further, to 64 or 128 IPUs, and allow the software stack to program it in a similar way,’ added Fyles. But driving adoption is not just about hardware. Fyles stressed that the software and hardware have been designed from the ground up to ensure ease of use, scalability and performance: ‘The common theme is the software development kit and all the great work the team has done on it to deliver a mature product. That has continued across both generations and will continue in the next generation that follows.’
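Graphcore’s PopTorch library, part of the Poplar SDK, exposes this kind of scaling to Python developers. The sketch below shows the general shape – replicating a model across IPUs by raising a replication factor – but it is an illustration based on the SDK’s documented API: it needs IPU hardware and the Poplar SDK to run, and exact calls may vary between SDK versions.

```python
# Sketch of data-parallel scaling with Graphcore's PopTorch (Poplar SDK).
# Illustrative: requires IPU hardware; API details may differ by SDK version.
import torch
import poptorch

class WithLoss(torch.nn.Module):
    """PopTorch convention: the training forward returns (output, loss)."""
    def __init__(self, net):
        super().__init__()
        self.net = net
        self.loss_fn = torch.nn.CrossEntropyLoss()

    def forward(self, x, labels=None):
        out = self.net(x)
        if labels is None:
            return out
        return out, self.loss_fn(out, labels)

net = torch.nn.Sequential(
    torch.nn.Linear(128, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))

opts = poptorch.Options()
opts.replicationFactor(4)   # e.g. the four IPUs inside one M2000
opts.deviceIterations(16)   # each host step drives 16 on-device iterations

# Scaling towards a POD16 and beyond is, in large part, a bigger replication
# factor; the Poplar stack handles placement and gradient reduction.
training_model = poptorch.trainingModel(
    WithLoss(net), options=opts,
    optimizer=torch.optim.SGD(net.parameters(), lr=0.01))
```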
‘We had the opportunity to redesign the software stack alongside the hardware from the beginning, and to steer both away from where they came from in the HPC space,’ added Fyles. ‘That clean-slate approach to both software and hardware means we can potentially find performance where others cannot, and we have far less legacy software to maintain over time. We have built the software and hardware together in a way that allows us to realise performance very quickly.’
Fyles stressed that a huge amount of time and resource has been spent on the Graphcore developer portal, populating it with documentation, videos and worked examples that show how to use IPU products and get the most out of them.
‘We are trying to build an ecosystem of IPU developers that starts small and grows over time. We do that in a number of ways. One is relatively organic: interesting people come to us with interesting problems for the IPU to solve, and that is more of a traditional sales engagement,’ states Fyles. ‘Then there is the academic programme we have introduced recently, which works with key academic institutes around the world. The IPU is a platform for research and development that is different from anything they have now, but it is also easy to use, well supported and documented. People can take applications now and do things without asking for our help. There are a number of great projects I have seen – at the University of Bristol, for example.’
The University of Bristol’s High-Performance Computing group, led by Professor Simon McIntosh-Smith, have been investigating how Graphcore’s IPUs can be used at the convergence of AI and HPC compute for scientific applications in computational fluid dynamics, electromechanics and particle simulations.
‘Traditionally, that is something we would have had to give a lot of help with, and that is the good thing: people are starting to solve challenges and use the IPU themselves. We see a lot of great feedback, which helps to push the software forward,’ explains Fyles.
There are many different technologies available to AI and ML researchers, including GPUs, IPUs and FPGAs. However, Graphcore’s Fyles believes that only technologies designed around ease of use for software engineers can succeed. As such, technologies such as FPGAs will most likely never see wide adoption in the AI and ML space, because they demand hardware-design skills that most HPC and AI developers lack.
‘People have spent a lot of time trying to turn it [the FPGA] into a platform that software engineers can use, but the best platform for a software engineer is a processor that uses a standard software stack, with software libraries they are familiar with,’ states Fyles.
‘You can abstract a lot of it away but that is why it is hard to get wide adoption on such a platform because the people using the product are software engineers. Now it is all about Python and high-level libraries and ultimately people do not always care about the hardware underneath, they want their application to run and they want it to go fast,’ added Fyles. ‘There is still the low level, close to bare metal developers that have always existed in HPC but the wider audience is using Python and high-level machine learning frameworks. That is where a lot of the new developers that come out of university have been trained. They do not expect to go down to the hardware level.’
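That abstraction is easy to see in practice: in a few lines of framework code, nothing names the silicon, and the same script runs wherever the framework has a backend. The snippet below is a generic illustration of the point, not tied to any one vendor.

```python
# The same high-level code runs on whatever device the framework supports;
# only the device string changes (or a vendor plug-in swaps in underneath).
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(256, 10).to(device)
x = torch.randn(8, 256, device=device)
y = model(x)                       # identical call on every backend
print(y.shape, "computed on", device)
```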
Case Study: The European Organisation for Nuclear Research delves into particle physics with Gigabyte servers
The European Organisation for Nuclear Research (CERN) uses Gigabyte's high-density GPU Servers outfitted with 2nd Gen AMD EPYC processors. Their purpose: to crunch the massive amount of data produced by subatomic particle experiments conducted with the Large Hadron Collider (LHC).
The impressive processing power of the GPU Servers’ multi-core design has propelled the study of high energy physics to new heights.
CERN’s biggest and most energy-intensive project is its particle accelerator, the LHC. In order to detect the subatomic particle known as the beauty (or bottom) quark more quickly, CERN decided to invest in additional computing equipment to analyse the massive quantities of raw data produced by the LHC.
When CERN looked for ways to expand its data-processing equipment, the top priority was high-performance computing (HPC) capability. It specifically wanted servers equipped with 2nd Gen AMD EPYC™ processors and multiple graphics accelerators, with support for PCIe Gen 4.0 computing cards. Gigabyte was the only company with a solution to match this demand: CERN selected the Gigabyte G482-Z51, a model that supports up to eight PCIe Gen 4.0 GPGPU cards in a 4U chassis.
Gigabyte has plenty of experience in HPC applications. When AMD began offering PCIe Gen 4.0 technology on its x86 platforms, Gigabyte responded immediately, leveraging its technological know-how to design the first GPU Servers capable of supporting PCIe Gen 4.0. Gigabyte optimised the server’s integrated hardware design, from the electronic components and PCB to the high-performance power delivery system.
Signal integrity was maximised by minimising signal loss in high-speed transmissions between CPU and GPU, and between GPUs. The result is a GPU Server that features lower latency, higher bandwidth and unsurpassed reliability.
To process huge quantities of data effectively with HPC technology, more than the combined computing power of CPU and GPU comes into play. High-speed transmission is crucial, whether for computing and storing data across multiple server clusters, or for the accelerated processing and communication of data between devices linked over the network. The Gigabyte G482-Z51 overcomes this challenge with its PCIe Gen 4.0 interface, which supports high-performance network cards. The increased bandwidth enables high-speed data transmission, which in turn enhances the performance of the entire HPC system, making it possible to process the 40 terabytes of raw data that the particle accelerator generates every second.
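For a sense of scale, PCIe Gen 4.0 doubles the per-lane signalling rate of Gen 3.0, so an x16 link roughly doubles its one-direction bandwidth. The back-of-envelope calculation below uses the standard signalling rates; achieved throughput in practice is somewhat lower.

```python
# Back-of-envelope PCIe bandwidth: rate (GT/s) x lanes x 128b/130b encoding,
# divided by 8 bits per byte, gives one-direction bandwidth in GB/s.
def pcie_bandwidth_gbs(gt_per_s, lanes=16, encoding=128 / 130):
    return gt_per_s * lanes * encoding / 8

print(f"Gen 3.0 x16: {pcie_bandwidth_gbs(8):.1f} GB/s")   # ~15.8 GB/s
print(f"Gen 4.0 x16: {pcie_bandwidth_gbs(16):.1f} GB/s")  # ~31.5 GB/s
```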
Custom-designed servers provide CERN with cutting-edge computing power
In order to quickly analyse the massive quantities of data generated by experiments conducted with the LHC, CERN independently developed its own powerful computing cards to handle the calculations, and paired them with graphics cards designed for image processing. Together, these specialised tools deliver cutting-edge computing power.
Gigabyte customised the G482-Z51 to meet the client’s specific requirements, which included specially designed expansion slots, minute adjustments to the BIOS, and an advanced heat dissipation solution.
Specially designed expansion slots and minute adjustments to the BIOS: to accommodate CERN’s self-developed computing cards, Gigabyte customised the expansion slots of the G482-Z51. Gigabyte also ran data simulations and made minute adjustments to the BIOS to better link all the computing cards to the motherboard, so that each card could achieve the maximum PCIe Gen 4.0 speed.
An advanced heat dissipation solution: CERN had special specifications for just about everything, from the power supply to the network interface cards to the arrangement of the eight GPGPU cards. The heat output of these interwoven I/O devices varies considerably. Gigabyte leveraged its expertise in heat dissipation design and system integration to channel airflow inside the servers, so that excessive heat would not become a problem.
Gigabyte Worked Closely with AMD to Expand the Horizons of HPC Applications
One of the most noteworthy advantages of the AMD CPU is its multi-core design. In the push to solidify the AMD EPYC processor's position in the server market, Gigabyte has an important part to play. By creating an AMD EPYC™ Server that showcases top performance, system stability, and steadfast quality, Gigabyte was able to satisfy CERN's need for a solution capable of analysing large amounts of data and completing HPC workloads.
Gigabyte's responsive customisation services and in-depth experience in research and development were just what the client needed to meet their specific requirements. By pushing computing power to the limit with state-of-the-art technological prowess, Gigabyte has taken an impressive stride forward in the application of HPC solutions to academic research and scientific discovery.
To learn more about Gigabyte server solutions, please visit https://www.Gigabyte.com/Enterprise
For direct contact, please send an email to server.grp@Gigabyte.com