
KubeCon Europe 2026: AI agents go to production

31.3.2026 | 11 minutes reading time

tl;dr: At KubeCon Europe 2026, one theme stood out: this is the year AI agents move from prototypes to production. This article covers what that means: giving agents verifiable identities, routing inference traffic with the new Gateway API Inference Extension, governing traffic through AI gateways, and building and managing agents with frameworks.


From 23 to 26 March 2026, KubeCon + CloudNativeCon Europe 2026 took place in Amsterdam, Netherlands. In this article, I share my learnings from this year's conference. It was my first time at KubeCon, so I can't compare it to previous years.

It's a huge event: tons of companies, thousands of people, and hundreds of talks. As with every conference, interesting talks inevitably run in parallel, and you have to decide which one to attend. Luckily, the talks were recorded, so you can look them up afterwards.

The great thing is that maintainers, companies, and engineers all come together in one place. It's great to reconnect with people from the community you've met or worked with before, and to get to know the maintainers of the tools you use and run.

AI evolves so rapidly that by the time speakers submit their proposals and the conference takes place, the technology has already moved on. That makes it genuinely hard to deliver up-to-date talks.

Takeaway: Getting agents to production

The general theme regarding AI was: In 2025, many AI projects, initiatives, and experiments were kicked off. Now, 2026 is becoming the year of ROI. Investors are expecting tangible results. This means prototypes must move into production. They suddenly face much stricter requirements for security, compliance, and observability.

Use k8s to run AI agents

I used to assume that companies would run AI agents in tightly sandboxed environments like micro-VMs. LLMs are powerful, unpredictable, and easy to trick. That felt like a clear security risk. What I’ve learned instead is that many teams simply run these agents as containers in their existing Kubernetes clusters. Most of these agents serve internal use cases. This lets teams reuse the infrastructure and tooling they already know. But it also moves the security challenges into the k8s environment itself.

Agent identities in k8s with SPIFFE/SPIRE

When agents run on personal machines, it seems natural that they impersonate the user. But when an agent runs within the cluster, how do we identify it? It often serves multiple users and exhibits human-like interaction capabilities.

Several talks covered this. One of them was When an Agent Acts on Your Behalf, Who Holds the Keys? - Mariusz Sabath & Maia Iyer, IBM Research. In this talk, the speakers presented kagenti, which cryptographically binds an AI agent's identity to the user's delegated identity.

All of these talks based agent identity on SPIFFE/SPIRE (SPIFFE is the standard, SPIRE the implementation). They showed how SPIRE's workload attestation can be extended to create a verifiable agent identity. Keycloak, acting as an OAuth 2.0 server, then manages the delegated user identity while preserving context across long, nested transactions.
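The pattern the talks described can be sketched roughly like this. All names here are hypothetical: real implementations issue SPIRE SVIDs and use OAuth 2.0 token exchange rather than plain objects, but the core idea is that an agent carries its own SPIFFE ID while the delegated user identity and the delegation chain are preserved across nested calls.

```python
# Conceptual sketch (not SPIRE's or kagenti's actual API): the agent's
# workload identity and the user's delegated identity travel together.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentIdentity:
    trust_domain: str   # the cluster's SPIFFE trust domain
    workload_path: str  # assigned by SPIRE workload attestation

    @property
    def spiffe_id(self) -> str:
        return f"spiffe://{self.trust_domain}{self.workload_path}"

@dataclass(frozen=True)
class DelegatedRequest:
    agent: AgentIdentity
    user_subject: str         # OAuth 2.0 subject of the delegating user
    delegation_chain: tuple   # preserved across nested transactions

    def on_behalf_of(self, next_agent: "AgentIdentity") -> "DelegatedRequest":
        """A downstream hop keeps the user context and extends the chain."""
        return DelegatedRequest(
            agent=next_agent,
            user_subject=self.user_subject,
            delegation_chain=self.delegation_chain + (self.agent.spiffe_id,),
        )

agent = AgentIdentity("example.org", "/ns/agents/sa/report-agent")
req = DelegatedRequest(agent, user_subject="alice", delegation_chain=())
hop = req.on_behalf_of(AgentIdentity("example.org", "/ns/agents/sa/sql-agent"))
print(hop.user_subject)      # alice — the user context survives the nested call
print(hop.delegation_chain)  # records which agent acted on the user's behalf
```

The point of the chain is auditability: at any hop you can answer both "which workload is making this call?" and "which user originally delegated it?".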

New traffic patterns demand new infrastructure

AI chats and agent workloads have introduced a very different traffic pattern compared to traditional web services. In classic APIs, each endpoint typically receives requests of similar size. With chatbots and agents, however, every new message is sent along with the entire conversation history as context. As chats grow longer, the request body grows too: request sizes become highly variable and increasingly large, all hitting the same endpoint. This shift puts new pressure on ingress, routing, and load balancing for inference workloads.
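A toy calculation (illustrative only) makes the growth concrete: because each turn resends the full history, the payload for turn n includes every previous message.

```python
# Each new chat turn resends the whole conversation, so the request body
# grows monotonically with the length of the chat.
def request_sizes(messages: list[str]) -> list[int]:
    sizes, history = [], ""
    for msg in messages:
        history += msg              # full conversation context so far
        sizes.append(len(history))  # bytes sent for this turn
    return sizes

turns = ["Hi. ", "Summarise this doc. ", "Now translate the summary. "]
print(request_sizes(turns))  # [4, 24, 51] — each request carries all prior turns
```

A classic least-connections load balancer treats all these requests as equal, even though the last one may cost an order of magnitude more compute than the first, which is exactly why inference-aware routing emerged.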

A whole wave of new projects emerged to address these challenges. They operate at different layers of the stack.

Inference routing: Gateway API Inference Extension

To address these inference-specific routing challenges in Kubernetes, the Gateway API Inference Extension emerged: an official Kubernetes SIG project that optimises routing and load balancing for self-hosted generative AI models. It is not a gateway itself, but a spec that gateway implementations adopt. Envoy AI Gateway, Agentgateway, and Istio all implement it.

llm-d builds on top of this standard, using the Inference Extension as its control-plane API for intelligent scheduling, with the aim of improving throughput and latency for real-world LLM traffic. The talk Cloud Native Theater | Istio Day: Running State of the Art Inference with Istio and LLM-D - Jackie Maertens, Microsoft and Nili Guy, IBM covered inference routing with Istio and llm-d in detail.
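The core idea behind inference-aware scheduling can be sketched in a few lines (names and scoring are hypothetical, not the extension's actual algorithm): instead of round-robin, the router scores each model replica by signals like queue depth and whether the prompt's KV-cache prefix is likely already warm there.

```python
# Much-simplified sketch of an inference-aware endpoint picker: prefer a
# replica with a warm KV cache for this session, penalise long queues.
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    queue_depth: int                     # pending requests on this pod
    cached_prefixes: set = field(default_factory=set)  # warm KV-cache prefixes

def pick_endpoint(replicas: list[Replica], prompt_prefix: str) -> str:
    def score(r: Replica) -> float:
        cache_bonus = 10 if prompt_prefix in r.cached_prefixes else 0
        return cache_bonus - r.queue_depth
    return max(replicas, key=score).name

replicas = [
    Replica("pod-a", queue_depth=2, cached_prefixes={"session-42"}),
    Replica("pod-b", queue_depth=0),
]
# pod-a wins despite its longer queue: reusing the warm cache avoids
# recomputing the whole conversation prefix.
print(pick_endpoint(replicas, "session-42"))  # pod-a
```

Routing a long conversation back to the pod that already holds its KV cache can save far more latency than picking the emptiest queue, which is the intuition these projects build on.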

AI Gateways: proxying LLM, MCP, and agent traffic

One layer up from inference routing sit the AI gateways. These proxy and govern the traffic between applications/agents and LLMs, between agents and tools (MCP), and between agents themselves (A2A). They handle concerns like authentication, rate limiting, observability, and policy enforcement.

Several projects occupy this space, with overlapping but distinct scopes:

Envoy AI Gateway extends Envoy Gateway to manage traffic to generative AI services. It supports routing to LLM providers through a unified interface. It provides token-aware rate limiting, provider failover, and an MCP gateway with stateless session management. It implements the Gateway API Inference Extension for self-hosted model routing. It is Kubernetes-native, built on Envoy Proxy, and maintained by the Envoy/CNCF community.

Agentgateway is a Rust-based AI-native proxy that covers the broadest scope. It handles LLM traffic, MCP (tool federation with stdio/HTTP/SSE transports), and the A2A (agent-to-agent) protocol. This makes it the only gateway of the three that supports all three traffic types. It also provides built-in guardrails (regex filtering, moderation APIs), a Cedar/CEL policy engine, and tool poisoning protection. It is a Linux Foundation project (originated at Solo.io) and runs both standalone and on Kubernetes.

Kuadrant is a CNCF Sandbox project that takes a different approach: it is a policy layer on top of existing gateways, not a standalone gateway. It attaches auth, rate limiting, DNS, and TLS policies to Gateway API resources. Its mcp-gateway sub-project specifically handles MCP server aggregation and tool-level authorisation (e.g., via Keycloak groups).
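One concern these gateways share is token-aware rate limiting, which differs from classic request counting: the budget is consumed by the number of LLM tokens a call uses, so one huge completion can exhaust a window that many small calls would not. A minimal sketch of the idea (not any gateway's actual API):

```python
# Token-aware rate limiting: the limit is a token budget per time window,
# not a request count. Estimated usage is charged before forwarding.
class TokenBudget:
    def __init__(self, tokens_per_window: int):
        self.limit = tokens_per_window
        self.used = 0

    def allow(self, estimated_tokens: int) -> bool:
        if self.used + estimated_tokens > self.limit:
            return False        # reject or queue instead of forwarding
        self.used += estimated_tokens
        return True

budget = TokenBudget(tokens_per_window=1000)
print(budget.allow(300))   # True
print(budget.allow(800))   # False — would exceed the 1000-token window
print(budget.allow(500))   # True  — 300 + 500 still fits the budget
```

In a real gateway the estimate would come from tokenising the request (and reconciling against the provider's reported usage afterwards), and the budget would reset per window and per tenant.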

Several talks at the conference explored this gateway and policy space in more depth.

Building and managing agents on Kubernetes

Beyond the networking and gateway layer, another set of tools focuses on the agent lifecycle itself: building agents, deploying them, discovering tools, and orchestrating workflows.

Agent frameworks

kagent is a CNCF Sandbox project (created at Solo.io) that provides a Kubernetes-native framework for building AI agents. Agents, tools, and model configurations are all defined as Kubernetes CRDs. It ships with pre-built MCP tool servers for cloud-native operations (Kubernetes, Istio, Helm, Argo, Prometheus, Grafana, Cilium). Its engine is built on Google ADK. Note: this is a different project from kagenti (IBM), which focuses on securing and operating agents rather than building them.

Dapr Agents takes a different approach, focusing on resilient agent workflows. Built on the CNCF Dapr runtime, it uses a virtual actor model where each agent is a lightweight, stateful actor with scale-to-zero support. Its key differentiator is durable execution: agent workflows are guaranteed to complete through automatic retries and state recovery after crashes. Dapr provides built-in pub/sub, state management, and service invocation out of the box.
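The durable-execution idea can be illustrated with a tiny sketch (this is not Dapr's API, just the underlying pattern): each completed step's result is persisted, so after a crash the workflow replays from the store and only re-runs the steps that never finished.

```python
# Durable execution in miniature: completed steps are checkpointed, so a
# replay after a crash skips them instead of repeating their side effects.
def run_workflow(steps, store: dict) -> dict:
    for name, fn in steps:
        if name in store:        # already completed before the crash
            continue
        store[name] = fn()       # persist the result durably
    return store

calls = []
def extract():  calls.append("extract");  return "raw text"
def classify(): calls.append("classify"); return "invoice"

store = {"extract": "raw text"}  # state recovered from storage after a crash
run_workflow([("extract", extract), ("classify", classify)], store)
print(calls)  # only 'classify' ran — the completed step was not re-executed
```

Combined with automatic retries, this is what lets an agent workflow survive pod restarts without repeating expensive LLM calls or duplicating side effects.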

Both frameworks were featured in talks that showed them in action:

  • Day-2 Ready: Bringing agentic pilots to production - Idit Levine & Keith Babo, Solo.io: Solo.io presented their vision for taking agents from prototype to production. They introduced kagent as the framework for defining and running agents on Kubernetes, and agentregistry (covered below) as the missing discovery layer. The central argument was that production-readiness for agents requires the same declarative, GitOps-friendly patterns that platform teams already use for the rest of their infrastructure.
  • Orchestrating Document Data Extraction with Dapr Agents - Fabian Steinbach, Software Architect, ZEISS: ZEISS Vision Care shared how they built a document data extraction pipeline using Dapr Agents. Their challenge was processing highly variable, multi-lingual documents with different layouts and handwriting. By composing deterministic workflows that blend specialized OCR, LLM calls, and standard code, they went from concept to production in two months. The talk demonstrated a key advantage of Dapr Agents: because workflows are durable and vendor-neutral, the team could swap AI providers without rearchitecting the system.

Agent registries

agentregistry addresses a practical problem: MCP servers and AI tools are scattered across npm, PyPI, Docker Hub, GitHub repos, and random URLs. Nobody knows which ones are trustworthy, which versions work, or how to get them running. agentregistry is an open-source platform that gives you one place to find, manage, and run MCP servers, AI agents, and skills. You import or publish artifacts once, and then anyone on your team can discover them, deploy them with one command, and have their IDE automatically configured to use them.

The broader AI landscape

Beyond agents and gateways, a few other AI-related talks stood out:

  • MCP in 2026: Context is All You Need - David Soria Parra, Anthropic: David Soria Parra, one of the creators of MCP at Anthropic, gave an update on where the Model Context Protocol is heading. The key takeaway was that MCP is moving beyond simple tool calling toward richer context exchange between agents and their environment, making the protocol a foundational building block for agentic systems rather than just a tool integration spec.
  • The New AI Coding Fabric - Patrick Debois, Tessl: Patrick Debois drew a parallel between the DevOps movement and today's autonomous coding agents. Just as DevOps introduced patterns for continuous integration and delivery, he argues we need a similar "fabric" of practices, pipelines, and feedback loops for AI-driven software development. The talk outlined how teams can apply lessons from DevOps to govern, test, and trust code produced by coding agents at scale.
  • Is AIOps the Future of Operations? Real Use Cases From the Trenches - Iveri Prangishvili & Danilo Banjac, Adobe: Adobe shared real-world examples of using agentic workflows to automate on-call tasks, diagnose incidents, and drive continuous improvement. The practical takeaway was that AIOps is not about replacing operators but about reducing manual toil: agents handle the repetitive triage and data gathering so that humans can focus on the decisions that matter.

Surprise highlight talk: Virtual Power Plant

One of the standout sessions for me was Virtual Power Plants (VPP): How They Work and What They Are - LeRenzo Malcom & Mario Flores, Enpal.

They started by explaining the power grid as a single giant machine that must remain at a stable 50 Hz frequency, with supply and demand matched every second. The European grid, the largest in the world, has essentially never been shut down. Yet events like the record-cold winter in Texas showed how close a grid can come to a total blackout; operators avoided one by only a few minutes.

From there, they showed how renewables disrupt this fragile balance. Solar power can flood the grid with energy during the day and then vanish in the evening. During the day, it pushes prices so low that providers effectively pay people to use electricity. This creates the famous duck curve: low net demand at midday and a steep ramp-up later.
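The duck curve falls out of simple arithmetic: net load is demand minus solar generation. With illustrative numbers (not from the talk):

```python
# Net load = demand minus solar, at a few points across the day.
demand = [30, 35, 40, 38, 45, 50]   # GW at 06:00, 09:00, 12:00, 15:00, 18:00, 21:00
solar  = [ 0, 10, 25, 20,  5,  0]   # GW of solar generation at the same hours
net    = [d - s for d, s in zip(demand, solar)]
print(net)  # [30, 25, 15, 18, 40, 50] — the midday belly and the evening ramp
```

The midday dip is the duck's belly; the jump from 18 GW to 50 GW as the sun sets is the steep ramp that conventional plants (or a VPP's stored energy) must cover.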

Virtual power plants are their answer to this problem. A VPP connects many small assets like rooftop solar, home batteries, and electric cars. It controls them via software, so they act as one flexible power plant. It can shift when homes and cars use or store energy, soaking up excess solar when it is cheap and feeding power back when demand is high. That makes the grid more stable, helps integrate more renewables, and gives households a way to save money or even earn by offering their flexibility.

If you want to dive deeper, their website and the talk recording are well worth a look.

SWAG

As always with events, there is SWAG (stuff we all get).

kubecon-swag.jpg

These are the highlights I found:

  • JetBrains bag featuring a map of Amsterdam, designed specifically for this conference and paired with the Dutch tulip tradition.
  • Solo.io van Gogh shirt featuring the k8s mascots: an image blending van Gogh's art, closely associated with Amsterdam, with the k8s brand.
  • ASML Wafer Waffles: ASML merged stroopwafels (thin caramel-filled waffle cookies that are a classic local treat) with their wafer lithography systems.

Besides the highlights, as always, you could get shirts, socks, bags, stickers, coffee, muffins and more at each booth. Other marketing themes this year included raffles for LEGO sets, vacations, and gaming consoles.


That's it. If you went to KubeCon, I hope you had fun.

Thank you for reading.
