Igorfit
2026-05-02
Technology

NVIDIA and Google Cloud Unveil Next-Gen AI Infrastructure Aimed at Agentic and Physical AI

NVIDIA and Google Cloud unveil A5X instances built on NVIDIA Vera Rubin NVL72 systems, promising up to 10x lower inference cost per token and 10x higher token throughput per megawatt than the previous generation, and targeting agentic and physical AI workloads.

Breaking: Google Cloud Announces NVIDIA Vera Rubin-Powered A5X Instances

At Google Cloud Next in Las Vegas, the collaboration between NVIDIA and Google Cloud reached a new milestone with the announcement of the A5X bare-metal instance, powered by NVIDIA Vera Rubin NVL72 rack-scale systems. This new infrastructure delivers up to 10x lower inference cost per token and 10x higher token throughput per megawatt compared to the previous generation.

Source: blogs.nvidia.com

The A5X instances use NVIDIA ConnectX-9 SuperNICs combined with next-generation Google Virgo networking, enabling clusters to scale to 80,000 NVIDIA Rubin GPUs at a single site and up to 960,000 GPUs across multiple sites. This allows customers to run their largest AI workloads on NVIDIA-optimized infrastructure.
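To put those cluster sizes in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes each NVL72 rack holds 72 GPUs (implied by the product name, but not stated for Vera Rubin in the announcement) and that clusters are built from whole racks. The function name `racks_needed` is purely illustrative.

```python
import math

# Figures quoted in the announcement.
SINGLE_SITE_GPUS = 80_000
MULTISITE_GPUS = 960_000

# Assumption: one NVL72 rack holds 72 GPUs.
GPUS_PER_RACK = 72


def racks_needed(gpus: int, gpus_per_rack: int = GPUS_PER_RACK) -> int:
    """Smallest whole number of racks that can hold `gpus` GPUs."""
    return math.ceil(gpus / gpus_per_rack)


single_site_racks = racks_needed(SINGLE_SITE_GPUS)
multisite_racks = racks_needed(MULTISITE_GPUS)

# How many maximum-size single-site clusters equal the multisite ceiling.
site_equivalents = MULTISITE_GPUS / SINGLE_SITE_GPUS

print(f"Single-site cluster: about {single_site_racks:,} racks")
print(f"Multisite cluster:   about {multisite_racks:,} racks")
print(f"Multisite ceiling equals {site_equivalents:.0f} full single-site clusters")
```

Under these assumptions, the single-site limit works out to a little over 1,100 racks, and the multisite limit equals twelve full single-site clusters. The announcement does not say how many sites a multisite cluster actually spans, so the factor of twelve is only a ratio of the two quoted ceilings.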

“At Google Cloud, we believe the next decade of AI will be shaped by customers’ ability to run their most demanding workloads on a truly integrated, AI‑optimized infrastructure stack,” said Mark Lohmeyer, vice president and general manager of AI and computing infrastructure at Google Cloud. “By combining Google Cloud’s scalable infrastructure and managed AI services with NVIDIA’s industry‑leading platforms, systems and software, we’re giving customers flexibility to train, tune and serve everything from frontier and open models to agentic and physical AI workloads — while optimizing for performance, cost and sustainability.”

Background

NVIDIA and Google Cloud have collaborated for over a decade, co-engineering a full-stack AI platform that spans every technology layer from performance-optimized libraries to enterprise-grade cloud services. This foundation has enabled developers, startups, and enterprises to push agentic and physical AI from the lab into production.

The partnership’s latest advancements expand the Google Cloud AI Hypercomputer for AI factories, powering the next frontier of agentic and physical AI. These include a preview of Google Gemini on Google Distributed Cloud running on NVIDIA Blackwell and Blackwell Ultra GPUs, confidential VMs with NVIDIA Blackwell GPUs, and agentic AI on Gemini Enterprise Agent Platform with NVIDIA Nemotron open models and the NVIDIA NeMo framework.

What This Means

With the A5X instances and the broader NVIDIA Blackwell portfolio, which ranges from A4 VMs with HGX B200 to rack-scale A4X and A4X Max systems, customers can right-size their acceleration capacity. Options span multiple interconnected NVL72 racks scaling to tens of thousands of Blackwell GPUs, a single rack of 72 GPUs connected by fifth-generation NVIDIA NVLink, and fractional allocations as small as one-eighth of a GPU.

This infrastructure is critical for running advanced AI agents that manage complex workflows, and for physical AI like robots and digital twins on factory floors. The combination of Google Cloud’s scalable services and NVIDIA’s hardware and software stack gives enterprises the tools to optimize for performance, cost, and sustainability while deploying next-generation AI.

Key Announcements at Google Cloud Next

  • A5X Instances: Powered by NVIDIA Vera Rubin NVL72, delivering up to 10x lower inference cost per token and 10x higher token throughput per megawatt than the previous generation.
  • Gemini Preview: Google Gemini on Google Distributed Cloud with NVIDIA Blackwell and Blackwell Ultra GPUs.
  • Confidential VMs: With NVIDIA Blackwell GPUs for secure AI workloads.
  • Agentic AI: Integration with Gemini Enterprise Agent Platform using NVIDIA Nemotron and NeMo.

More details on the infrastructure evolution from NVIDIA Blackwell to Vera Rubin are available in the announcement made at Google Cloud Next.