Modern LLMs demonstrate strong capability in inferring meaning from column names. A tool such as Genie can typically resolve pct_cust_attrit_q to "churn" or map rev_mrr_usd to "MRR" through pattern recognition alone. On a small, well-structured table, inference produces correct results in the majority of cases.
However, "the majority of cases" does not constitute a governance standard. As schemas scale to hundreds of columns, as multiple columns become plausible matches for a single query, and as different departments adopt divergent terminology for the same metric, inference alone is no longer sufficient. Databricks Agent Metadata addresses this gap, not by enabling AI tools to interpret data for the first time, but by ensuring they do so consistently, correctly, and at scale. Agent Metadata requires Databricks Runtime 17.3 and YAML version 1.1.
Where Inference Falls Short
On a small table with six descriptive columns, an LLM can often infer the correct mapping through pattern recognition. Consider the following metrics table:
| Column | Example Value |
|---|---|
| rev_mrr_usd | 48250.00 |
| pct_cust_attrit_q | 0.034 |
| n_active_subs | 1205 |
| dt_cohort_start | 2025-01-15 |
Genie can reasonably deduce that rev_mrr_usd relates to MRR and that pct_cust_attrit_q involves customer attrition. With a limited number of columns and recognizable abbreviations, inference produces adequate results.
Production schemas, however, rarely present this level of clarity. When a table contains rev_mrr_usd, rev_nrr_usd, rev_grr_usd, rev_arr_usd, and rev_exp_usd, a query referencing "revenue" could plausibly match any of them. When Finance refers to a metric as "net retention" while Product uses "expansion revenue", both expecting resolution to distinct columns, the model has no basis for disambiguation. When a schema spans hundreds of columns across multiple business domains, the probability of incorrect resolution increases proportionally.
In these scenarios, inference becomes unreliable, not due to a limitation in model capability, but because ambiguity, scale, and terminological inconsistency exceed what pattern recognition alone can resolve.
Agent Metadata: Embedding Business Context into Data Definitions
Agent Metadata in Unity Catalog enables organizations to attach business context directly to their data definitions through three mechanisms: display names, synonyms, and format specifications. This metadata is governed within Unity Catalog and automatically consumed by downstream tools including dashboards and AI assistants.
The following example demonstrates a metric view definition for a SaaS metrics model:
```yaml
version: 1.1

source: analytics.saas.subscription_metrics

dimensions:
  - name: dt_cohort_start
    expr: dt_cohort_start
    display_name: 'Cohort Start Date'
    synonyms:
      - signup date
      - cohort date
      - when they signed up

  - name: plan_tier
    expr: plan_tier
    display_name: 'Plan Tier'
    synonyms:
      - pricing plan
      - subscription level
      - plan type

  - name: region_code
    expr: region_code
    display_name: 'Region'
    synonyms:
      - geography
      - market
      - territory

measures:
  - name: rev_mrr_usd
    expr: SUM(rev_mrr_usd)
    display_name: 'Monthly Recurring Revenue'
    synonyms:
      - MRR
      - monthly revenue
      - recurring revenue
    format:
      type: currency
      currency_code: USD
      decimal_places:
        type: exact
        places: 2

  - name: pct_cust_attrit_q
    expr: AVG(pct_cust_attrit_q)
    display_name: 'Quarterly Churn Rate'
    synonyms:
      - churn
      - attrition rate
      - customer churn
      - churn rate
    format:
      type: percentage
      decimal_places:
        type: exact
        places: 1

  - name: n_active_subs
    expr: SUM(n_active_subs)
    display_name: 'Active Subscriptions'
    synonyms:
      - active customers
      - subscriber count
      - active users
    format:
      type: number
      decimal_places:
        type: exact
        places: 0
```
This definition leverages three complementary metadata capabilities:
Display names replace technical column names with human-readable labels across dashboards and reports. For example, rev_mrr_usd is presented as "Monthly Recurring Revenue" in any downstream visualization.
Synonyms enable AI discoverability. When a user asks Genie "What's our churn?", the synonym mapping resolves that term to the correct measure. Each dimension or measure supports up to 10 synonyms of up to 255 characters, providing sufficient coverage for the terminology variations that exist across teams and departments.
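These limits are easy to enforce before a definition is deployed. The following sketch validates a parsed metric-view definition against the constraints stated above (up to 10 synonyms, each up to 255 characters); the dict literals mirror the YAML example, while the validation logic itself is illustrative, not part of any Databricks API.

```python
# Sketch: check dimension/measure entries against the documented synonym
# limits (at most 10 synonyms, each at most 255 characters).
MAX_SYNONYMS = 10
MAX_SYNONYM_LENGTH = 255

def validate_synonyms(fields):
    """Return a list of violation messages for dimension/measure entries."""
    problems = []
    for field in fields:
        synonyms = field.get("synonyms", [])
        if len(synonyms) > MAX_SYNONYMS:
            problems.append(f"{field['name']}: too many synonyms ({len(synonyms)})")
        for s in synonyms:
            if len(s) > MAX_SYNONYM_LENGTH:
                problems.append(f"{field['name']}: synonym too long: {s[:30]}...")
    return problems

# Entries taken from the metric view example above.
measures = [
    {"name": "rev_mrr_usd", "synonyms": ["MRR", "monthly revenue", "recurring revenue"]},
    {"name": "pct_cust_attrit_q", "synonyms": ["churn", "attrition rate", "customer churn", "churn rate"]},
]

print(validate_synonyms(measures))  # → []
```

A check like this fits naturally into a CI step that lints metric-view YAML before it reaches Unity Catalog.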
Format specifications define how values are rendered in visualization tools. Churn is displayed as 3.4% rather than 0.034, and MRR as $48,250.00 rather than a raw numeric value. These formatting rules propagate automatically to all dashboards built on the metric view.
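To make the effect of the format block concrete, here is a minimal sketch of how a downstream tool might apply those specifications. The spec dicts mirror the YAML above; the rendering logic is an illustration of the behavior, not Databricks' actual implementation.

```python
# Sketch: apply a metric-view format spec to a raw value.
def render(value, spec):
    places = spec["decimal_places"]["places"]
    if spec["type"] == "percentage":
        return f"{value * 100:.{places}f}%"
    if spec["type"] == "currency":
        return f"${value:,.{places}f}"  # simplified: assumes USD symbol
    return f"{value:,.{places}f}"

# Specs copied from the metric view example above.
churn_spec = {"type": "percentage", "decimal_places": {"type": "exact", "places": 1}}
mrr_spec = {"type": "currency", "currency_code": "USD",
            "decimal_places": {"type": "exact", "places": 2}}

print(render(0.034, churn_spec))   # → 3.4%
print(render(48250.00, mrr_spec))  # → $48,250.00
```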
The Difference in Practice: Where Metadata Changes the Outcome
On a simple schema, Genie's inference is often sufficient to resolve common abbreviations and return correct results without any metadata. The value of Agent Metadata becomes apparent in three areas: ambiguity resolution, presentation quality, and consistency across teams.
Ambiguity Resolution
Consider the query: "Show me MRR by pricing plan"
On the six-column table above, Genie resolves this correctly without metadata. The abbreviation mrr is present in the column name, and plan_tier is the only plausible grouping column. There is no ambiguity to resolve.
Now consider a production schema containing the following revenue columns:
| Column | Description |
|---|---|
| rev_mrr_usd | Monthly Recurring Revenue |
| rev_nrr_usd | Net Revenue Retention |
| rev_grr_usd | Gross Revenue Retention |
| rev_arr_usd | Annual Recurring Revenue |
| rev_exp_usd | Expansion Revenue |
The same query, "Show me revenue by plan", now presents five plausible matches. Without metadata, the model must infer which revenue column the user intends, with no mechanism to guarantee a correct selection. With synonyms explicitly mapping "MRR" to rev_mrr_usd and "expansion revenue" to rev_exp_usd, the resolution becomes deterministic.
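The difference between the two resolution modes can be sketched in a few lines. An explicit synonym map returns exactly one governed answer, while a crude pattern match over the same columns returns five candidates. Column names come from the table above; the resolver itself is illustrative, not how Genie is implemented.

```python
# Sketch: deterministic synonym lookup vs. ambiguous pattern matching.
COLUMNS = ["rev_mrr_usd", "rev_nrr_usd", "rev_grr_usd", "rev_arr_usd", "rev_exp_usd"]

# Explicit, governed mappings (a subset, mirroring the metric view synonyms).
SYNONYMS = {
    "mrr": "rev_mrr_usd",
    "monthly recurring revenue": "rev_mrr_usd",
    "net retention": "rev_nrr_usd",
    "expansion revenue": "rev_exp_usd",
}

def resolve(term):
    """Try the deterministic synonym map first; otherwise fall back to inference."""
    key = term.lower()
    if key in SYNONYMS:
        return [SYNONYMS[key]]  # exactly one governed answer
    # Crude stand-in for pattern-based inference: substring match on the first word.
    return [c for c in COLUMNS if key.split()[0][:3] in c]

print(resolve("expansion revenue"))  # → ['rev_exp_usd']
print(resolve("revenue"))            # → all five columns: ambiguous
```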
Presentation Quality
Even when Genie resolves the correct column without metadata, the output quality differs significantly.
Without Agent Metadata:
| pct_cust_attrit_q | region_code |
|---|---|
| 0.034 | NA |
| 0.051 | EMEA |
| 0.028 | APAC |
With Agent Metadata:
| Region | Quarterly Churn Rate |
|---|---|
| NA | 3.4% |
| EMEA | 5.1% |
| APAC | 2.8% |
The underlying data is identical. However, display names replace technical column headers, and format specifications render 0.034 as 3.4%. This represents the minimum improvement that Agent Metadata provides, independent of whether inference would have succeeded. For stakeholders consuming results in dashboards or Genie responses, this distinction is not cosmetic: a table of raw decimals requires the reader to infer the unit and context, whereas a properly formatted table communicates both immediately.
Consistency Across Teams
The most significant benefit of Agent Metadata is one that does not surface in a single-user test. When Finance queries "net retention" and Product queries "expansion revenue", synonyms ensure both terms resolve to their respective correct columns. Without metadata, both queries rely on the LLM's interpretation, which may vary depending on phrasing, context window content, or model version.
Agent Metadata eliminates this variability. The mapping is explicit, governed, and version-controlled within Unity Catalog. Every user, across every tool, resolves the same term to the same column, not because the model inferred correctly, but because the definition is authoritative.
The Bigger Picture
Agent Metadata does not address a problem that is immediately visible on a small, well-structured dataset. An LLM will often produce correct results without it. This can lead organizations to underestimate its value, until schema complexity increases, teams scale, or a quarterly report surfaces a discrepancy traceable to an ambiguous column resolution.
The value of Agent Metadata is structural. It elevates the semantic layer from a presentation concern, historically managed at the BI tool level, to a governed component of the data catalog. Business meaning is defined once within Unity Catalog, version-controlled, subject to access policies, and consumed automatically by every downstream tool: dashboards, Genie, notebooks, and any future integration.
For organizations operating at scale, this represents the difference between an AI tool that produces correct results in most cases and one that does so reliably. Synonyms eliminate ambiguity. Display names ensure readability. Format specifications enforce consistent presentation. None of these mechanisms depend on model inference, and none degrade as schemas grow in complexity.
The question organizations should consider is not whether an LLM can interpret their data without metadata; in many cases, it can. The question is whether inference alone provides a sufficient foundation for organizational reporting and decision-making.
Blog author
Niklas Niggemann
Working Student Data & AI
Do you still have questions? Just send me a message.