
KubeCon Europe 2026: AI agents go to production

31.3.2026 | 11 minutes reading time

tl;dr: At KubeCon Europe 2026, one theme stood out: this is the year AI agents move from prototypes to production. This article covers what that means: giving agents verifiable identities, routing inference traffic with the new Gateway API Inference Extension, governing traffic through AI gateways, and building and managing agents with frameworks.


From 23 to 26 March 2026, KubeCon + CloudNativeCon Europe 2026 took place in Amsterdam, Netherlands. In this article, I share my learnings from this year's conference. It was my first time at KubeCon, so I can't compare it to previous years.

It's a huge event: tons of companies, thousands of people, and hundreds of talks. As with every conference, interesting talks inevitably run in parallel, and you have to decide which one to attend. Luckily, the talks were recorded, so you can look them up afterwards.

The great thing is that maintainers, companies, and engineers all come together in one place. It's great to reconnect with people from the community you've met or worked with before, and to get to know the maintainers of the tools you use and run.

AI evolves so rapidly that by the time speakers submit their proposals and the conference takes place, the technology has already moved on. That makes it genuinely hard to deliver up-to-date talks.

Takeaway: Getting agents to production

The general theme regarding AI was: In 2025, many AI projects, initiatives, and experiments were kicked off. Now, 2026 is becoming the year of ROI. Investors are expecting tangible results. This means prototypes must move into production. They suddenly face much stricter requirements for security, compliance, and observability.

Use k8s to run AI agents

I used to assume that companies would run AI agents in tightly sandboxed environments like micro-VMs. LLMs are powerful, unpredictable, and easy to trick. That felt like a clear security risk. What I’ve learned instead is that many teams simply run these agents as containers in their existing Kubernetes clusters. Most of these agents serve internal use cases. This lets teams reuse the infrastructure and tooling they already know. But it also moves the security challenges into the k8s environment itself.

Agent identities in k8s with SPIFFE/SPIRE

When agents run on personal machines, it seems natural that they impersonate the user. But when an agent runs within the cluster, how do we identify it? It often serves multiple users and exhibits human-like interaction capabilities.

Several talks covered this. One of them was When an Agent Acts on Your Behalf, Who Holds the Keys? - Mariusz Sabath & Maia Iyer, IBM Research. In this talk, the speakers presented kagenti, which cryptographically binds an AI agent's identity to the user's delegated identity.

All of these talks based agent identity on SPIFFE/SPIRE (SPIFFE is the standard, SPIRE the implementation). They showed how SPIRE's workload attestation can be extended to create a verifiable agent identity. Keycloak, acting as an OAuth 2.0 server, then manages the delegated user identity while preserving context across long, nested transactions.
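The pattern the talks described can be sketched roughly like this. All names here are hypothetical: real implementations issue SPIRE SVIDs and use OAuth 2.0 token exchange rather than plain objects, but the core idea is that an agent carries its own SPIFFE ID while the delegated user identity and the delegation chain are preserved across nested calls.

```python
# Conceptual sketch (not SPIRE's or kagenti's actual API): the agent's
# workload identity and the user's delegated identity travel together.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentIdentity:
    trust_domain: str   # the cluster's SPIFFE trust domain
    workload_path: str  # assigned by SPIRE workload attestation

    @property
    def spiffe_id(self) -> str:
        return f"spiffe://{self.trust_domain}{self.workload_path}"

@dataclass(frozen=True)
class DelegatedRequest:
    agent: AgentIdentity
    user_subject: str         # OAuth 2.0 subject of the delegating user
    delegation_chain: tuple   # preserved across nested transactions

    def on_behalf_of(self, next_agent: "AgentIdentity") -> "DelegatedRequest":
        """A downstream hop keeps the user context and extends the chain."""
        return DelegatedRequest(
            agent=next_agent,
            user_subject=self.user_subject,
            delegation_chain=self.delegation_chain + (self.agent.spiffe_id,),
        )

agent = AgentIdentity("example.org", "/ns/agents/sa/report-agent")
req = DelegatedRequest(agent, user_subject="alice", delegation_chain=())
hop = req.on_behalf_of(AgentIdentity("example.org", "/ns/agents/sa/sql-agent"))
print(hop.user_subject)      # alice — the user context survives the nested call
print(hop.delegation_chain)  # records which agent acted on the user's behalf
```

The point of the chain is auditability: at any hop you can answer both "which workload is making this call?" and "which user originally delegated it?".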

New traffic patterns demand new infrastructure

AI chats and agent workloads have introduced a very different traffic pattern compared to traditional web services. In classic APIs, each endpoint typically receives requests of similar size. With chatbots and agents, however, every new message is sent along with the entire conversation history as context. As chats grow longer, the request body grows too: request sizes become highly variable and increasingly large, all hitting the same endpoint. This shift puts new pressure on ingress, routing, and load balancing for inference workloads.
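A toy calculation (illustrative only) makes the growth concrete: because each turn resends the full history, the payload for turn n includes every previous message.

```python
# Each new chat turn resends the whole conversation, so the request body
# grows monotonically with the length of the chat.
def request_sizes(messages: list[str]) -> list[int]:
    sizes, history = [], ""
    for msg in messages:
        history += msg              # full conversation context so far
        sizes.append(len(history))  # bytes sent for this turn
    return sizes

turns = ["Hi. ", "Summarise this doc. ", "Now translate the summary. "]
print(request_sizes(turns))  # [4, 24, 51] — each request carries all prior turns
```

A classic least-connections load balancer treats all these requests as equal, even though the last one may cost an order of magnitude more compute than the first, which is exactly why inference-aware routing emerged.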

A whole wave of new projects emerged to address these challenges. They operate at different layers of the stack.

Inference routing: Gateway API Inference Extension

To address these inference-specific routing challenges in Kubernetes, the Gateway API Inference Extension emerged: an official Kubernetes SIG project that optimises routing and load balancing for self-hosted generative AI models. It is not a gateway itself, but a spec that gateway implementations adopt. Envoy AI Gateway, Agentgateway, and Istio all implement it.

llm-d builds on top of this standard, using the Inference Extension as its control-plane API for intelligent scheduling, with the aim of improving throughput and latency for real-world LLM traffic. The talk Cloud Native Theater | Istio Day: Running State of the Art Inference with Istio and LLM-D - Jackie Maertens, Microsoft and Nili Guy, IBM covered inference routing with Istio and llm-d in detail.
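The core idea behind inference-aware scheduling can be sketched in a few lines (names and scoring are hypothetical, not the extension's actual algorithm): instead of round-robin, the router scores each model replica by signals like queue depth and whether the prompt's KV-cache prefix is likely already warm there.

```python
# Much-simplified sketch of an inference-aware endpoint picker: prefer a
# replica with a warm KV cache for this session, penalise long queues.
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    queue_depth: int                     # pending requests on this pod
    cached_prefixes: set = field(default_factory=set)  # warm KV-cache prefixes

def pick_endpoint(replicas: list[Replica], prompt_prefix: str) -> str:
    def score(r: Replica) -> float:
        cache_bonus = 10 if prompt_prefix in r.cached_prefixes else 0
        return cache_bonus - r.queue_depth
    return max(replicas, key=score).name

replicas = [
    Replica("pod-a", queue_depth=2, cached_prefixes={"session-42"}),
    Replica("pod-b", queue_depth=0),
]
# pod-a wins despite its longer queue: reusing the warm cache avoids
# recomputing the whole conversation prefix.
print(pick_endpoint(replicas, "session-42"))  # pod-a
```

Routing a long conversation back to the pod that already holds its KV cache can save far more latency than picking the emptiest queue, which is the intuition these projects build on.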

AI Gateways: proxying LLM, MCP, and agent traffic

One layer up from inference routing sit the AI gateways. These proxy and govern the traffic between applications/agents and LLMs, between agents and tools (MCP), and between agents themselves (A2A). They handle concerns like authentication, rate limiting, observability, and policy enforcement.

Several projects occupy this space, with overlapping but distinct scopes:

Envoy AI Gateway extends Envoy Gateway to manage traffic to generative AI services. It supports routing to LLM providers through a unified interface. It provides token-aware rate limiting, provider failover, and an MCP gateway with stateless session management. It implements the Gateway API Inference Extension for self-hosted model routing. It is Kubernetes-native, built on Envoy Proxy, and maintained by the Envoy/CNCF community.

Agentgateway is a Rust-based AI-native proxy that covers the broadest scope. It handles LLM traffic, MCP (tool federation with stdio/HTTP/SSE transports), and the A2A (agent-to-agent) protocol. This makes it the only gateway of the three that supports all three traffic types. It also provides built-in guardrails (regex filtering, moderation APIs), a Cedar/CEL policy engine, and tool poisoning protection. It is a Linux Foundation project (originated at Solo.io) and runs both standalone and on Kubernetes.

Kuadrant is a CNCF Sandbox project that takes a different approach: it is a policy layer on top of existing gateways, not a standalone gateway. It attaches auth, rate limiting, DNS, and TLS policies to Gateway API resources. Its mcp-gateway sub-project specifically handles MCP server aggregation and tool-level authorisation (e.g., via Keycloak groups).
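One concern these gateways share is token-aware rate limiting, which differs from classic request counting: the budget is consumed by the number of LLM tokens a call uses, so one huge completion can exhaust a window that many small calls would not. A minimal sketch of the idea (not any gateway's actual API):

```python
# Token-aware rate limiting: the limit is a token budget per time window,
# not a request count. Estimated usage is charged before forwarding.
class TokenBudget:
    def __init__(self, tokens_per_window: int):
        self.limit = tokens_per_window
        self.used = 0

    def allow(self, estimated_tokens: int) -> bool:
        if self.used + estimated_tokens > self.limit:
            return False        # reject or queue instead of forwarding
        self.used += estimated_tokens
        return True

budget = TokenBudget(tokens_per_window=1000)
print(budget.allow(300))   # True
print(budget.allow(800))   # False — would exceed the 1000-token window
print(budget.allow(500))   # True  — 300 + 500 still fits the budget
```

In a real gateway the estimate would come from tokenising the request (and reconciling against the provider's reported usage afterwards), and the budget would reset per window and per tenant.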

Several talks at the conference explored this gateway and policy space in more depth.

Building and managing agents on Kubernetes

Beyond the networking and gateway layer, another set of tools focuses on the agent lifecycle itself: building agents, deploying them, discovering tools, and orchestrating workflows.

Agent frameworks

kagent is a CNCF Sandbox project (created at Solo.io) that provides a Kubernetes-native framework for building AI agents. Agents, tools, and model configurations are all defined as Kubernetes CRDs. It ships with pre-built MCP tool servers for cloud-native operations (Kubernetes, Istio, Helm, Argo, Prometheus, Grafana, Cilium). Its engine is built on Google ADK. Note: this is a different project from kagenti (IBM), which focuses on securing and operating agents rather than building them.

Dapr Agents takes a different approach, focusing on resilient agent workflows. Built on the CNCF Dapr runtime, it uses a virtual actor model where each agent is a lightweight, stateful actor with scale-to-zero support. Its key differentiator is durable execution: agent workflows are guaranteed to complete through automatic retries and state recovery after crashes. Dapr provides built-in pub/sub, state management, and service invocation out of the box.
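The durable-execution idea can be illustrated with a tiny sketch (this is not Dapr's API, just the underlying pattern): each completed step's result is persisted, so after a crash the workflow replays from the store and only re-runs the steps that never finished.

```python
# Durable execution in miniature: completed steps are checkpointed, so a
# replay after a crash skips them instead of repeating their side effects.
def run_workflow(steps, store: dict) -> dict:
    for name, fn in steps:
        if name in store:        # already completed before the crash
            continue
        store[name] = fn()       # persist the result durably
    return store

calls = []
def extract():  calls.append("extract");  return "raw text"
def classify(): calls.append("classify"); return "invoice"

store = {"extract": "raw text"}  # state recovered from storage after a crash
run_workflow([("extract", extract), ("classify", classify)], store)
print(calls)  # only 'classify' ran — the completed step was not re-executed
```

Combined with automatic retries, this is what lets an agent workflow survive pod restarts without repeating expensive LLM calls or duplicating side effects.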

Both frameworks were featured in talks that showed them in action:

  • Day-2 Ready: Bringing agentic pilots to production - Idit Levine & Keith Babo, Solo.io: Solo.io presented their vision for taking agents from prototype to production. They introduced kagent as the framework for defining and running agents on Kubernetes, and agentregistry (covered below) as the missing discovery layer. The central argument was that production-readiness for agents requires the same declarative, GitOps-friendly patterns that platform teams already use for the rest of their infrastructure.
  • Orchestrating Document Data Extraction with Dapr Agents - Fabian Steinbach, Software Architect, ZEISS: ZEISS Vision Care shared how they built a document data extraction pipeline using Dapr Agents. Their challenge was processing highly variable, multi-lingual documents with different layouts and handwriting. By composing deterministic workflows that blend specialized OCR, LLM calls, and standard code, they went from concept to production in two months. The talk demonstrated a key advantage of Dapr Agents: because workflows are durable and vendor-neutral, the team could swap AI providers without rearchitecting the system.

Agent registries

agentregistry addresses a practical problem: MCP servers and AI tools are scattered across npm, PyPI, Docker Hub, GitHub repos, and random URLs. Nobody knows which ones are trustworthy, which versions work, or how to get them running. agentregistry is an open-source platform that gives you one place to find, manage, and run MCP servers, AI agents, and skills. You import or publish artifacts once, and then anyone on your team can discover them, deploy them with one command, and have their IDE automatically configured to use them.

The broader AI landscape

Beyond agents and gateways, a few other AI-related talks stood out:

  • MCP in 2026: Context is All You Need - David Soria Parra, Anthropic: David Soria Parra, one of the creators of MCP at Anthropic, gave an update on where the Model Context Protocol is heading. The key takeaway was that MCP is moving beyond simple tool calling toward richer context exchange between agents and their environment, making the protocol a foundational building block for agentic systems rather than just a tool integration spec.
  • The New AI Coding Fabric - Patrick Debois, Tessl: Patrick Debois drew a parallel between the DevOps movement and today's autonomous coding agents. Just as DevOps introduced patterns for continuous integration and delivery, he argues we need a similar "fabric" of practices, pipelines, and feedback loops for AI-driven software development. The talk outlined how teams can apply lessons from DevOps to govern, test, and trust code produced by coding agents at scale.
  • Is AIOps the Future of Operations? Real Use Cases From the Trenches - Iveri Prangishvili & Danilo Banjac, Adobe: Adobe shared real-world examples of using agentic workflows to automate on-call tasks, diagnose incidents, and drive continuous improvement. The practical takeaway was that AIOps is not about replacing operators but about reducing manual toil: agents handle the repetitive triage and data gathering so that humans can focus on the decisions that matter.

Surprise highlight talk: Virtual Power Plant

One of the standout sessions for me was Virtual Power Plants (VPP): How They Work and What They Are - LeRenzo Malcom & Mario Flores, Enpal.

They started by explaining the power grid as a single giant machine that must remain at a stable 50 Hz frequency, with supply and demand matched every second. The European grid, the largest in the world, has essentially never been shut down. Yet events like the record-cold winter in Texas showed how close a grid can come to a total blackout; operators avoided one by only a few minutes.

From there, they showed how renewables disrupt this fragile balance. Solar power can flood the grid with energy during the day and then vanish in the evening. During the day, it pushes prices so low that providers effectively pay people to use electricity. This creates the famous duck curve: low net demand at midday and a steep ramp-up later.
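The duck curve falls out of simple arithmetic: net load is demand minus solar generation. With illustrative numbers (not from the talk):

```python
# Net load = demand minus solar, at a few points across the day.
demand = [30, 35, 40, 38, 45, 50]   # GW at 06:00, 09:00, 12:00, 15:00, 18:00, 21:00
solar  = [ 0, 10, 25, 20,  5,  0]   # GW of solar generation at the same hours
net    = [d - s for d, s in zip(demand, solar)]
print(net)  # [30, 25, 15, 18, 40, 50] — the midday belly and the evening ramp
```

The midday dip is the duck's belly; the jump from 18 GW to 50 GW as the sun sets is the steep ramp that conventional plants (or a VPP's stored energy) must cover.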

Virtual power plants are their answer to this problem. A VPP connects many small assets like rooftop solar, home batteries, and electric cars. It controls them via software, so they act as one flexible power plant. It can shift when homes and cars use or store energy, soaking up excess solar when it is cheap and feeding power back when demand is high. That makes the grid more stable, helps integrate more renewables, and gives households a way to save money or even earn by offering their flexibility.

If you want to dive deeper, their website and the talk recording are well worth a look.

SWAG

As always with events, there is SWAG (stuff we all get).

kubecon-swag.jpg

These are the highlights I found:

  • JetBrains bag featuring a map of Amsterdam, designed specifically for this conference and paired with the Dutch tulip tradition.
  • Solo.io van Gogh shirt featuring the k8s mascots: an image blending van Gogh's art, closely associated with Amsterdam, with the k8s brand.
  • ASML Wafer Waffles: ASML merged stroopwafels (thin caramel-filled waffle cookies that are a classic local treat) with their wafer lithography systems.

Besides the highlights, as always, you could get shirts, socks, bags, stickers, coffee, muffins and more at each booth. Other marketing themes this year included raffles for LEGO sets, vacations, and gaming consoles.


That's it. If you went to KubeCon, I hope you had fun.

Thank you for reading.
