At the GPU Technology Conference 2014, Nvidia announced that it plans to integrate a high-speed interconnect, called Nvidia NVLink, into its future GPUs, enabling GPUs and CPUs to share data five to 12 times faster than they can today.
This will eliminate a longstanding bottleneck between CPUs and GPUs and represents a new step on the road to next-generation exascale computing through heterogeneous architectures.
Nvidia will build NVLink technology into its Pascal GPU architecture, which is expected to be introduced in 2016 as the successor to this year’s Maxwell compute architecture. The new interconnect was co-developed with IBM, which will incorporate it in future versions of IBM’s Power CPUs.
With NVLink technology tightly coupling IBM Power CPUs with GPUs, Power data centre systems will be able to use GPU acceleration for a diverse set of applications, such as high-performance computing, data analytics, and machine learning.
Today’s GPUs are connected to x86-based CPUs through the PCI Express (PCIe) interface, which limits the GPU’s ability to access the CPU memory system and can be five times slower than typical CPU memory systems. PCIe is an even greater bottleneck between the GPU and IBM Power CPUs, which have more bandwidth than x86 CPUs. Because the NVLink interface will match the bandwidth of typical CPU memory systems, it will enable GPUs to access CPU memory at its full bandwidth.
Accelerated computing applications typically move data from the network or disk storage to CPU memory and then copy the data to GPU memory before the GPU can process it. With NVLink, data moves between CPU memory and GPU memory at much higher speeds, which shortens the copy step and so speeds up GPU-accelerated applications.
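To give a rough sense of what the quoted range could mean for the copy step, the short Python sketch below models transfer time as data size divided by link bandwidth. Only the five-to-12-times range comes from the announcement. The baseline bandwidth and dataset size are hypothetical round numbers chosen for illustration, and the function name `transfer_seconds` is likewise an assumption, not part of any Nvidia or IBM software.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Return the time needed to move size_gb over a link of the given bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


# Hypothetical figures for illustration only; neither is from the announcement.
ASSUMED_BASELINE_GB_PER_S = 10.0  # assumed effective CPU-to-GPU bandwidth today
DATASET_GB = 60.0                 # assumed amount of data copied to the GPU

# Speed-up range quoted in the announcement: five to 12 times faster.
SPEEDUP_LOW = 5
SPEEDUP_HIGH = 12

baseline = transfer_seconds(DATASET_GB, ASSUMED_BASELINE_GB_PER_S)
low_end = transfer_seconds(DATASET_GB, ASSUMED_BASELINE_GB_PER_S * SPEEDUP_LOW)
high_end = transfer_seconds(DATASET_GB, ASSUMED_BASELINE_GB_PER_S * SPEEDUP_HIGH)

print(f"Baseline copy:   {baseline:.2f} s")
print(f"At 5x speed-up:  {low_end:.2f} s")
print(f"At 12x speed-up: {high_end:.2f} s")
```

The model ignores latency, protocol overhead, and any overlap between copying and computation, so it only bounds the time spent on the copy itself, not the overall application speed-up.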
‘NVLink enables fast data exchange between CPU and GPU, thereby improving data throughput through the computing system and overcoming a key bottleneck for accelerated computing today,’ said Bradley McCredie, vice president and IBM Fellow at IBM. ‘NVLink makes it easier for developers to modify high-performance and data analytics applications to take advantage of accelerated CPU-GPU systems. We think this technology represents another significant contribution to our OpenPower ecosystem.’