VMware and NVIDIA have announced the expansion of their strategic partnership to ready the hundreds of thousands of enterprises that run on VMware’s cloud infrastructure for the era of generative AI.
VMware Private AI Foundation with NVIDIA will enable enterprises to customize models and run generative AI applications, including intelligent chatbots, assistants, search and summarization. The platform will be a fully integrated solution featuring generative AI software and accelerated computing from NVIDIA, built on VMware Cloud Foundation and optimized for AI.
“Generative AI and multi-cloud are the perfect match,” said Raghu Raghuram, CEO, VMware. “Customer data is everywhere — in their data centres, at the edge, and in their clouds. Together with NVIDIA, we’ll empower enterprises to run their generative AI workloads adjacent to their data with confidence while addressing their corporate data privacy, security and control concerns.”
“Enterprises everywhere are racing to integrate generative AI into their businesses,” said Jensen Huang, founder and CEO, NVIDIA. “Our expanded collaboration with VMware will offer hundreds of thousands of customers — across financial services, healthcare, manufacturing and more — the full-stack software and computing they need to unlock the potential of generative AI using custom applications built with their own data.”
To achieve business benefits faster, enterprises are seeking to streamline development, testing and deployment of generative AI applications. McKinsey estimates that generative AI could add up to $4.4 trillion annually to the global economy.
VMware Private AI Foundation with NVIDIA will enable enterprises to harness this capability to customize large language models, produce more secure and private models for internal use, offer generative AI as a service to their users, and run inference workloads more securely at scale.
The platform is expected to include integrated AI tools to empower enterprises to run proven models trained on their private data in a cost-efficient manner.
The platform will feature NVIDIA NeMo, an end-to-end, cloud-native framework included in NVIDIA AI Enterprise — the operating system of the NVIDIA AI platform — that allows enterprises to build, customize and deploy generative AI models virtually anywhere. NeMo combines customization frameworks, guardrail toolkits, data curation tools and pretrained models to offer enterprises an easy, cost-effective and fast way to adopt generative AI.
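To illustrate the kind of workflow the NeMo guardrail toolkit targets, the sketch below wraps a chat model with NeMo Guardrails, the open-source component of the NeMo family. It is a minimal sketch, not the platform's documented integration: the configuration directory and the prompt are hypothetical, and the announcement does not specify how guardrails are wired into VMware Private AI Foundation.

```python
# Minimal sketch: putting NeMo Guardrails in front of a chat model.
# "./guardrails_config" is a hypothetical directory holding the YAML/Colang
# files that define which topics the assistant may discuss and how it responds.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./guardrails_config")
rails = LLMRails(config)

# Queries go through the guardrailed interface rather than the raw model.
response = rails.generate(messages=[
    {"role": "user", "content": "Summarize last quarter's support tickets."}
])
print(response["content"])
```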
For deploying generative AI in production, NeMo uses TensorRT for Large Language Models (TRT-LLM), which accelerates and optimizes inference performance on the latest LLMs on NVIDIA GPUs. With NeMo, VMware Private AI Foundation with NVIDIA will enable enterprises to pull in their own data to build and run custom generative AI models on VMware’s hybrid cloud infrastructure.
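On the inference side, a rough sketch of running a customized model through the TensorRT-LLM Python API is shown below. The checkpoint path and prompt are placeholders, and treating this API surface as what the platform exposes is an assumption rather than something stated in the announcement.

```python
# Rough sketch: serving a custom model with the TensorRT-LLM Python LLM API.
# The checkpoint path is a placeholder; in practice it would point to a model
# customized with NeMo on the enterprise's own data.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="./checkpoints/custom-llm")  # hypothetical local path

params = SamplingParams(max_tokens=128, temperature=0.2)
outputs = llm.generate(
    ["Draft a response to the attached customer inquiry."],
    params,
)
for output in outputs:
    print(output.outputs[0].text)
```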