© 2026 株式会社TIMEWELL All rights reserved.


Google Cloud Next Recap — TPU, Gemini, and Agent Advances Ushering in a New AI Era

2026-01-21 | 濱本 隆太
Tags: DX, WARP, AI, Cloud, Data Analysis


Google Cloud Next Recap

Google Cloud Next brought a wave of innovative announcements shaping the future of cloud computing and AI. Three areas stand out: infrastructure enhancements that dramatically accelerate AI training and inference; expanded capabilities and reach for the "Gemini" large language model; and a next-generation AI agent strategy designed to embed AI across every business process. Google made clear its direction — combining its planet-scale network infrastructure, cutting-edge AI chips, and sophisticated software platforms to help enterprises solve real challenges and create new value.

This article covers the major updates from Google Cloud Next — particularly Cloud WAN network enhancements, the seventh-generation TPU "Ironwood" and its remarkable performance, Gemini's evolution, and new features for AI agent development and deployment centered on Vertex AI — with a focus on what these developments mean for business professionals.

Infrastructure Revolution — Cloud WAN and the Seventh-Generation TPU Ironwood

Among the Google Cloud Next announcements, the evolution of enterprise infrastructure deserves particular attention. Leading the way is Cloud WAN (Cloud Wide Area Network), through which Google is opening its vast global private network to enterprise customers. This is more than a new network service: it means enterprises can now directly benefit from the planet-scale network infrastructure Google has built for its own services. Cloud WAN is designed to optimize application performance and reportedly delivers over 40% faster performance than conventional network solutions, along with a claimed TCO reduction of up to 40%, an attractive proposition for cost-conscious organizations. Global companies such as financial services firm Citadel Securities and food giant Nestlé are already using Cloud WAN to build faster, more reliable solutions, which speaks to its practical effectiveness. The service is expected to become available to all Google Cloud customers later in 2025.

The announcement that could be most consequential for the future of AI development is Ironwood — Google's seventh-generation TPU (Tensor Processing Unit). TPUs are application-specific integrated circuits (ASICs) Google developed to accelerate machine learning workloads, particularly neural network computation, and their evolution is tightly linked to advances in AI model performance. Ironwood, scheduled for release later this year, achieves performance that is genuinely extraordinary: a 3,600-fold improvement compared to Google's first generally available TPU. This suggests AI model training and large-scale inference can be dramatically faster, making it possible to develop models of scales and complexity that were previously out of reach. Google has described Ironwood as "the most powerful chip we've ever built, one that will unlock the next frontier of AI models."

The performance figures are even more striking in concrete terms. A single "pod" contains over 9,000 chips and delivers 42.5 ExaFLOPS of compute. To put that in context: the #1-ranked system on the list of the world's most powerful supercomputers delivers approximately 1.7 ExaFLOPS. A single Ironwood pod has more than 24 times that compute capacity. This is designed to meet the exponentially growing demands of next-generation thinking models like Gemini 2.5.
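
As a quick sanity check on the figures above (simple arithmetic only; the two systems are typically benchmarked at different numeric precisions, so treat this as an order-of-magnitude comparison rather than a like-for-like one):

```python
# Compare the quoted per-pod Ironwood compute with the quoted figure
# for the top-ranked supercomputer.
ironwood_pod_exaflops = 42.5
top_supercomputer_exaflops = 1.7

ratio = ironwood_pod_exaflops / top_supercomputer_exaflops
print(f"One Ironwood pod is roughly {ratio:.0f}x the top system's compute")
```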

Equally noteworthy: this massive performance improvement is accompanied by a dramatic gain in energy efficiency — 29 times the efficiency of the original TPU. This reflects Google's genuine commitment to addressing the challenge of rising power consumption that comes with higher performance, and carries significant weight on both sustainability and cost grounds. High performance with a lower environmental footprint — exactly what next-generation computing infrastructure needs to deliver.

Cloud WAN's high-speed, efficient network combined with Ironwood TPU's extraordinary compute and energy efficiency together form a powerful foundation for enterprises to accelerate their businesses with AI.


AI Development and Inference — Comprehensive Software and Platform Enhancements

Google Cloud Next also brought important announcements about software and platforms designed to maximize the value of this powerful hardware infrastructure — streamlining the AI development, deployment, and operations lifecycle so enterprises can adopt and deploy generative AI more easily and cost-effectively.

In AI inference, new capabilities were announced for Google Kubernetes Engine (GKE). GKE automates the deployment, scaling, and management of containerized applications; adding Gen AI-specific scaling and load-balancing features will enable reductions in serving costs of up to 30%, tail latency reductions of up to 60%, and throughput improvements of up to 40%. These are direct improvements to user experience and operational cost.
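
To get a feel for what those "up to" percentages could mean in combination, here is a rough illustration with hypothetical baseline numbers (the announced figures are maximums and will not necessarily compose this way on a real workload):

```python
# Hypothetical serving workload: combine the quoted cost and throughput
# gains to estimate the change in cost per served request.
baseline_cost_per_hour = 100.0       # dollars (illustrative)
baseline_requests_per_hour = 1000.0  # illustrative

new_cost_per_hour = baseline_cost_per_hour * (1 - 0.30)          # up to 30% cheaper
new_requests_per_hour = baseline_requests_per_hour * (1 + 0.40)  # up to 40% more throughput

before = baseline_cost_per_hour / baseline_requests_per_hour
after = new_cost_per_hour / new_requests_per_hour
print(f"cost per request: {before:.3f} -> {after:.3f} ({after / before:.0%} of baseline)")
```

Under these assumptions, cost per served request is halved; real gains depend on the workload actually hitting both improvements at once.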

Google DeepMind's distributed machine learning runtime "Pathways" — which has powered Google's own large-scale models like Gemini — is now being made available to cloud customers for the first time. Pathways efficiently executes complex inference tasks across multiple hosts (servers and accelerators) with dynamic scaling. This makes it possible to scale model serving across hundreds of accelerators (TPUs and GPUs), achieving an optimal balance between the often-competing goals of batch processing efficiency and low latency. This is a significant step forward in addressing the performance and cost challenges of running large AI models in production.

Google also announced work to make the open-source inference library vLLM available on TPUs. vLLM is widely used to accelerate and optimize LLM inference on GPUs. Customers who have GPU-optimized workloads using PyTorch and vLLM can now run them on TPUs easily and cost-effectively — increasing hardware flexibility and allowing them to benefit from TPUs while preserving existing investments.

These hardware (TPU Ironwood) and software (enhanced GKE, Pathways, vLLM on TPUs) enhancements are integrated under the "AI Hypercomputer" concept — delivering more intelligence (useful AI output) at consistently lower costs. As one data point: Gemini 3 Pro running on AI Hypercomputer reportedly delivers 24x the intelligence per dollar compared to competing GPT-5.2 models, and 5x more than DeepSeek-V2. This suggests the potential for a dramatic improvement in AI deployment cost-effectiveness.

Gemini's reach is also expanding. Gemini is now available to run locally in Google Distributed Cloud (GDC) environments — including both internet-isolated air-gapped environments and connected environments. For government agencies and enterprises that handle highly sensitive data, being able to use Gemini on-premises without data leaving the environment is a major advantage. This announcement includes support for NVIDIA's Confidential Computing technology, the latest Blackwell systems (DGX B200, HGX B200), and a Dell partnership. Gemini is also becoming available on the Google Distributed Cloud Air Gap product, which is already certified for U.S. government Secret and Top Secret missions — enabling AI deployment in environments with the highest security and compliance requirements.

Google Workspace — used in day-to-day work — is also getting new Gemini-powered features. Google Sheets gains a "Help me analyze" capability that guides users through completing expert-level analysis on their data. Google Docs gets an "Audio overviews" feature that generates a high-quality audio version of document content, enabling a new interaction mode through listening. "Google Workspace Flows" automates time-consuming repetitive tasks and supports more context-driven decision-making, contributing to productivity.

Additionally, "Lyria" — which generates 30-second music clips from text prompts — will be available on Google Cloud. This is a first among hyperscalers, opening new possibilities for content creation.

Open-source model support is expanding: Meta's Llama 4 is now generally available on Vertex AI, and AI21 Labs' open model portfolio is also accessible. Vertex AI's data connectivity capabilities are enhanced — able to connect to any data source, any vector database, any cloud — with a new ability to build agents directly on data stored in existing NetApp storage without copying the data. Connections to major enterprise applications including Oracle, SAP, ServiceNow, and Workday are also supported. On grounding — improving response reliability — Google describes its approach as "the most comprehensive in the market," combining Google Search, proprietary enterprise data, Google Maps, and third-party sources. These software and platform enhancements reflect a strong commitment to lowering the barriers to AI development so more enterprises can benefit.

Next-Generation AI Agent Strategy — From Development to Enterprise-Wide Deployment

One of the most significant areas at Google Cloud Next was the comprehensive and forward-looking strategy for AI agents. Beyond just providing AI models, a wide range of new tools and frameworks were introduced to accelerate the development and use of agents across entire enterprises — agents that execute concrete tasks, collaborate with humans, and coordinate with other agents to solve complex problems.

Following last year's introduction of Vertex AI Agent Builder, the new "Agent Development Kit (ADK)" was announced — a new open-source framework that simplifies building sophisticated multi-agent systems (systems where multiple AI agents work together). ADK makes it easier for developers to build advanced Gemini-powered agents, equip them with specific tools, and execute complex multi-step tasks involving reasoning and thinking. Agents can discover other agents, learn their capabilities, and work together — while developers maintain precise control over the process.

ADK supports the "Model Context Protocol" — a unified approach for AI models to access and interact with various data sources and tools. Rather than building custom integrations for each data source or tool separately, development efficiency improves substantially.
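
The idea can be sketched in a few lines (a conceptual illustration only, with invented `ToolServer`, `SqlSource`, and `DocSource` names; the actual Model Context Protocol defines its own message formats and transports): the agent talks to one shared tool interface, so adding a data source means registering another server rather than writing a bespoke integration.

```python
from typing import Protocol

class ToolServer(Protocol):
    """Shared interface every data source exposes to the model."""
    name: str
    def call(self, query: str) -> str: ...

class SqlSource:
    name = "orders-db"
    def call(self, query: str) -> str:
        return f"rows matching {query!r}"

class DocSource:
    name = "wiki"
    def call(self, query: str) -> str:
        return f"documents about {query!r}"

def agent_answer(query: str, tools: list[ToolServer]) -> list[str]:
    # The agent loops over tools through the one interface; agent code
    # never changes when a new source is added.
    return [f"{t.name}: {t.call(query)}" for t in tools]

results = agent_answer("Q3 revenue", [SqlSource(), DocSource()])
```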

An "Agent-to-Agent Protocol" was also introduced to facilitate inter-agent coordination — enabling agents built on different underlying AI models or development frameworks to communicate and work together. Google is partnering with providers of other major agent frameworks like LangGraph and CrewAI to support this protocol, working toward an open multi-agent ecosystem.
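
To make the interoperability claim concrete, here is a toy sketch (invented message envelope and agent classes; the real protocol specifies its own schemas and transport): two agents that could come from different frameworks cooperate because they agree only on a message shape and advertise their capabilities.

```python
import json

def envelope(sender: str, task: str) -> str:
    # Framework-neutral message: plain JSON that can cross process or
    # network boundaries.
    return json.dumps({"sender": sender, "task": task})

class ResearchAgent:  # imagine: built with framework A
    capabilities = {"find_sources"}
    def handle(self, message: str) -> str:
        return f"sources on {json.loads(message)['task']!r}"

class WriterAgent:    # imagine: built with framework B
    capabilities = {"draft_report"}
    def handle(self, message: str) -> str:
        return f"draft covering {json.loads(message)['task']!r}"

def route(agents, capability: str, task: str) -> str:
    # Capability discovery: dispatch to whichever agent advertises the skill.
    agent = next(a for a in agents if capability in a.capabilities)
    return agent.handle(envelope("orchestrator", task))

out = route([ResearchAgent(), WriterAgent()], "draft_report", "TPU trends")
```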

Google Agentspace — Enabling Enterprise-Wide Agent Adoption

Beyond the development environment, it is equally important to give every employee in an organization an easy way to use AI agents. "Google Agentspace" addresses this: a new workspace where employees can search and consolidate information across the organization, interact with AI agents, and have those agents take specific actions in enterprise applications on their behalf. Google Agentspace integrates Google-quality enterprise search, conversational AI (chat), Gemini, and third-party agents, with a broad toolset including dedicated connectors for searching documents, databases, and SaaS applications and executing transactions. Enterprise-grade security and compliance controls to protect company data and intellectual property are built in, so even employees without specialized knowledge can benefit from AI agents in their daily work.

Domain-Specific Agents — Customer Engagement, Data Analysis, Software Development

Beyond the general-purpose agent development and deployment platform, domain-specific agent solutions were also introduced.

In customer engagement, a next-generation suite was announced. Features include more human-sounding voice, the ability to understand emotion and adapt during conversations, streaming video support that enables real-time situational interpretation through customers' devices, guided custom agent building through a no-code interface, and the ability to execute specific tasks — product search, cart addition, checkout — through API calls. Customer experience across call centers and customer support is expected to improve substantially.

For data teams, specialized agents were announced by role:

  • Data Engineering teams: Agents that support every aspect of the data engineering lifecycle — data catalog automation, metadata generation, data quality maintenance, and data pipeline generation.
  • Data Science teams: A comprehensive coding partner AI agent that works within data science notebooks to accelerate every workflow step from data loading and feature engineering through predictive modeling.
  • Data Analysts and Business Users: A conversational analytics agent that executes powerful and reliable analyses through natural language dialogue — embeddable directly in enterprise web and mobile applications.

Code Assist Agents were announced to support the full software development lifecycle (SDLC), from modernizing existing code to supporting tasks across the development process. Developers can interact with the agents through a kanban board, checking what Code Assist is working on in real time and giving instructions. Integrations with many partners, including Atlassian, Sentry, and Snyk, are available, with more planned.

These announcements make clear that Google Cloud intends to move AI agents from a technical concept to a practical tool available to every department and every employee across the enterprise. ADK simplifies development; Agentspace enables company-wide deployment; domain-specific agents accelerate practical application. AI agents are set to become indispensable for improving enterprise productivity, accelerating innovation, and strengthening competitive position.

Summary

The announcements from Google Cloud Next clearly illustrate how AI is redefining every dimension of enterprise computing. Infrastructure innovations — Cloud WAN opening planet-scale network infrastructure to enterprises, and the seventh-generation TPU Ironwood taking AI compute to a new level — provide the foundation for developing and deploying more powerful and large-scale AI models.

At the software and platform layer, enhanced GKE inference capabilities, Pathways availability, and vLLM on TPUs improve the efficiency and performance of AI workloads, delivering cost-effective AI use under the AI Hypercomputer concept. Gemini's expansion into Google Distributed Cloud and Workspace means that organizations in high-security environments and in everyday work settings can benefit from leading-edge AI. Lyria, open model support, and Vertex AI's data connectivity and grounding enhancements improve AI development flexibility and reliability.

Most significant is the comprehensive AI agent strategy. ADK simplifies development; open protocols enable inter-agent coordination; Google Agentspace enables deployment across the entire workforce. Together, these transform AI from an analytics tool into an active partner that executes tasks and collaborates with humans. Domain-specific agents for customer engagement, data analysis, and software development will accelerate practical adoption further.

These announcements demonstrate that Google Cloud is not just an infrastructure provider — it's a strategic partner supporting enterprise AI transformation end-to-end. By leveraging these new tools and services, enterprises can accelerate data-driven decision-making, automate and streamline business processes, create new customer experiences, and drive faster innovation. The vision laid out at Google Cloud Next makes a strong impression: the era where AI sits at the heart of business is close at hand.

Reference: https://www.youtube.com/watch?v=2OpHbyN4vEM




TIMEWELL's AI Adoption Support

TIMEWELL is a professional team supporting business transformation in the AI agent era.

Services

  • ZEROCK: A high-security AI agent running on domestic Japanese servers
  • TIMEWELL BASE: An AI-native event management platform
  • WARP: An AI capability development program

In 2026, AI has moved from something you "use" to something you "work alongside." Let's think through your AI strategy together.

Schedule a free consultation →

Related Articles

  • The Future of Software Development Powered by AI — The New Development Revolution Shaped by Anthropic and Cursor
  • Google NotebookLM's Audio Overview Now in Japanese — New Features and How to Use Them
  • Google AI Studio — What It Can Do for Your Business

