After the publication of our article about Ibis, Dr André Schemaitat pointed us to a similar tool with growing popularity – Narwhals. Narwhals describes itself as an "extremely lightweight and extensible compatibility layer between dataframe libraries". At its core, it uses a subset of the Polars API and has zero dependencies; it only uses what the user passes in, so the library can stay as lightweight as possible.
While both Narwhals and Ibis enable dataframe-agnostic code, they serve fundamentally different audiences. Ibis is designed as a complete analytical framework for end users, data scientists and analysts performing their daily work. Narwhals, by contrast, is built for library maintainers and application developers who want to accept multiple dataframe types as inputs without requiring all of them as dependencies. This distinction shapes every aspect of Narwhals' design, from its minimal API surface to its strict backwards compatibility guarantees.
What Does Narwhals Do?
Writing dataframe-agnostic code is hard because the same expression can produce different results depending on the library. A unified, simple and predictable API helps developers focus on behavior rather than subtle implementation differences. Installing the library itself is as easy as expected, and no dependencies are needed apart from the dataframe library or libraries we actually want to use. Another interesting aspect is how Narwhals handles the frequent deprecations in both pandas and Polars: it tests against nightly builds of both libraries and handles backwards compatibility internally, so the user does not have to worry about it.
Narwhals is primarily aimed at library maintainers rather than end users. As such, it takes stability and backward compatibility very seriously. Public functions in the stable releases v1 and v2 will never be removed or changed. If backwards-incompatible changes have to be made, they will only be pushed into the main narwhals namespace and eventually into narwhals.stable.v3, but v1 and v2 will always stay unaffected and will be maintained indefinitely. This means different packages can depend on different Narwhals stable APIs, and end users can use all of them in the same project without conflict.
Because Narwhals implements a subset of the Polars API, and Polars' syntax is subject to change, this stability guarantee is particularly valuable. Users may encounter deprecated functions (like we did in our first benchmark) or even breaking changes in upstream libraries. Narwhals shields users from this: code written with the stable namespace will keep working, even in newer versions of Polars where functions have been rewritten. This ambitious promise has its limits, which Narwhals acknowledges. Unambiguous bugs will be fixed without that counting as a breaking change, type hints may be refined, and anything labeled "unstable" can change. The developers also state that if Polars were to remove expressions entirely or pandas dropped support for categorical data, Narwhals itself would have to be reconsidered, but they consider such radical changes unlikely. Old Python versions will be dropped around the time of their end of life.
With all these namespaces, versioning might seem a bit overwhelming at first; the documentation acknowledges this and provides guidance. In general, the main narwhals namespace should be used for prototyping, so users can iterate quickly. Once the result becomes something users wish to release as production-ready and stable, they should switch to narwhals.stable.v2. For an entirely new project, users should use either narwhals or narwhals.stable.v2. A project already using narwhals.stable.v1 that does not need newer features has no reason to switch to v2. Users who want v2 should require at least narwhals>=2.0. The v1 namespace is older and differs noticeably from v2, while v2 is not missing any features: everything that is stable in the main namespace is also available there.
Narwhals can generate SQL via the narwhals.sql module. Currently, this module requires DuckDB to be installed. By setting the pretty parameter of the to_sql function to True, the SQL gets formatted to a more readable format, but this in turn requires sqlparse to be installed. As the SQL module relies on DuckDB, the generated SQL code follows DuckDB's dialect. To translate it to other dialects, users can use SQLGlot directly, or alternatively use Ibis or SQLFrame, which both leverage SQLGlot internally, as we discussed in our previous article about Ibis. This dependency requirement is likely to change to align with Narwhals' zero-dependency approach, but it was not as high a priority as for Ibis given their different target audiences.
Narwhals vs Ibis
Our initial instinct was to compare Ibis to Narwhals, as they both appear to do the same thing. The Narwhals documentation acknowledges this and, interestingly enough, states that the two projects consider themselves very different and not in competition. Narwhals even supports Ibis tables, meaning that dataframe-agnostic code written using Narwhals' lazy API also supports Ibis.
The documentation outlines several key differences between the two tools. Most fundamentally, Narwhals is designed for library maintainers building tools that need to accept multiple dataframe types, while Ibis targets end users, data scientists and analysts performing their analytical work. This difference in audience carries through to every design choice.
Narwhals allows users to write functions that take a DataFrame and return one in the exact same format, preserving the input type. Ibis can materialize to pandas, Polars, and PyArrow, but has no built-in way to return the exact input type. On the data type side, Narwhals supports Categorical and Enum types across its backends, while Ibis does not. Their execution models also differ: Ibis focuses on lazy execution with SQL generation, whereas Narwhals distinguishes between lazy and eager APIs, with the eager API providing very fine control over dataframe operations.
From a dependency perspective, Ibis requires pandas and PyArrow for all backends by default, while Narwhals has zero required dependencies: it only uses what the user passes in (unless converting to SQL, which requires DuckDB). Ibis currently supports more backends (20+ execution engines), but Narwhals still supports pandas and Dask, for which Ibis has deprecated support. Perhaps the most relevant difference for daily usage is the API itself: Narwhals uses a subset of the Polars API, while Ibis uses its own pandas/dplyr-inspired API.
In practice, these tools can be complementary rather than competing. A library built with Narwhals can accept Ibis tables as inputs, and users working with Ibis can leverage Narwhals-powered libraries seamlessly.
There are already some libraries and tools that use Narwhals for their dataframe interoperability needs, like Bokeh for interactive data visualization in the browser, Marimo, a reactive notebook environment, or the interactive graphing library Plotly, each with around 20,000 stars on GitHub.
Practice
To write a dataframe-agnostic function, we first need to initialize a Narwhals DataFrame or LazyFrame by passing our dataframe to nw.from_native (step 1). All calculations stay lazy if we start with a LazyFrame, and Narwhals never automatically triggers computation without being asked. After that, we express our logic using the subset of the Polars API supported by Narwhals (step 2). Finally, we return a dataframe in its original library using nw.to_native (step 3). Since steps 1 and 3 are so common, Narwhals provides the @nw.narwhalify utility decorator, so we only have to explicitly write step 2. Let's consider the following example with a simple group-by and mean operation.
```python
import narwhals as nw
from narwhals.typing import FrameT


@nw.narwhalify
def func(df: FrameT) -> FrameT:
    return df.group_by("a").agg(nw.col("b").mean()).sort("a")
```
We can then simply use it with whatever engine we like:
```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [4, 5, 6]})
print(func(df))
```
If we want to change from pandas to Polars, we do not have to rewrite the logic itself, only the import (and reference).
```python
import polars as pl

df = pl.LazyFrame({"a": [1, 1, 2], "b": [4, 5, 6]})
print(func(df).collect())
```
When dealing with different libraries, especially their more complex operations, it is possible to encounter functions that may not yet be implemented in Narwhals. In such cases, Narwhals can still be useful as a thin DataFrame ingestion layer. If a library developed with Narwhals wants to accept dataframes in any format but operates on pandas internally, Narwhals can handle the conversion. This is significantly more lightweight than including all dataframe libraries as dependencies, and the implementation is straightforward:
```python
import narwhals as nw
import pandas as pd
from narwhals.typing import IntoDataFrame


def df_to_pandas(df: IntoDataFrame) -> pd.DataFrame:
    # Wrap the input, whatever its source library, and convert to pandas.
    return nw.from_native(df).to_pandas()
```
This overview does not nearly cover all capabilities of Narwhals, and the documentation offers great resources to get started.
Performance Overhead
The documentation claims that the overhead of running pandas via Narwhals compared to running pure pandas is negligible and sometimes "even negative". The developers were careful to avoid unnecessary copies and index resets. For lazy backends, Narwhals respects the backends' laziness and never evaluates a full query unless explicitly asked via .collect(). In some places, such as joins and selects, Narwhals does need to inspect a dataframe's schema to mimic Polars' behavior, but this is typically cheap since it can be done from metadata alone without reading the full dataset into memory. To minimize this overhead, Narwhals caches schema and column name evaluations.
How Narwhals Works Under the Hood
At the core of Narwhals is one rule: "An expression is a function from a DataFrame to a sequence of Series." A Series is a one-dimensional labeled array capable of holding any data type. In its simplest form, nw.col('a') returns the Series a from the DataFrame. Expressions can also return multiple Series (e.g., nw.col('a', 'b')), but all columns must have been derived from the same dataframe. By itself, an expression doesn't produce a value; it only produces one once given to a DataFrame context. What happens depends on which context is used: .select creates a DataFrame with only the result of the given expression, .with_columns produces a DataFrame like the current one plus the expression's result, and .filter keeps only rows where the expression evaluates to True.
Each implementation in Narwhals defines its own Narwhals-compliant objects in subfolders such as narwhals._pandas_like, narwhals._arrow, or narwhals._polars. Meanwhile, the top-level modules such as narwhals.dataframe and narwhals.series coordinate how the Narwhals API is dispatched to each backend. So in the end, there are a couple of layers: the nw.DataFrame is backed by a Narwhals-compliant DataFrame, such as narwhals._pandas_like.dataframe.PandasLikeDataFrame or narwhals._arrow.dataframe.ArrowDataFrame. These Narwhals-compliant DataFrames are in turn backed by a native DataFrame, in our case a pandas DataFrame or a PyArrow Table.
When a user executes code, some top-level Narwhals API is being called. The API then forwards the call to a Narwhals-compliant dataframe wrapper, like PandasLikeDataFrame or PolarsDataFrame. The DataFrame wrapper then forwards the call to the underlying library, like the pandas or Polars DataFrame.
Each operation in a Narwhals expression is a node; the nodes can be accessed via ._nodes. Additionally, Narwhals provides metadata about its expressions. There we can see whether and how the expression expands to multiple outputs, how many order-dependent operations it contains, whether the output of the expression is always length-1, and more.
One noteworthy Narwhals trick (of many) is the elementwise push-down. SQL is picky about over operations, whereas Polars isn't. In SQL, abs(sum(a)) over (partition by b) is not valid; in Polars, pl.col('a').sum().abs().over('b') is. Narwhals rewrites expressions to keep Polars' level of flexibility when translating to SQL engines. Specifically, it pushes over nodes down past elementwise ones: in our Polars example, Narwhals automatically inserts the over operation before the abs one. The idea is that elementwise operations work row by row and don't depend on the rows around them, while an over node partitions or orders a computation. Therefore, an elementwise operation followed by an over operation is equivalent to the over operation followed by the same elementwise operation. It is important to keep in mind that query optimization is otherwise out of scope for Narwhals; the developers consider this particular rewrite acceptable because it is simple and allows users to evaluate operations that certain backends would otherwise reject.
Conclusion
Narwhals and Ibis both address dataframe portability, but for different audiences. Ibis is a complete analytical framework for end users who want to write logic once and run it across 20+ backends. Narwhals is built for library maintainers who want to accept multiple dataframe types without requiring them all as dependencies.
The key insight: these tools are complementary, not competing. Narwhals-powered libraries like Plotly, Bokeh, and Marimo can accept Ibis tables as inputs, so end users can work with their preferred framework while library maintainers avoid the complexity of supporting every dataframe format directly.
Narwhals is the right choice when building a library or tool that needs to accept dataframes as inputs, when supporting multiple dataframe libraries without maintaining separate codebases, or when backwards compatibility guarantees matter for production deployments. Ibis is the better fit for end users performing analytical work across multiple database backends, for workflows that need comprehensive SQL backend support across 20+ engines, or for teams that want to develop locally on DuckDB and deploy to BigQuery or Snowflake without code changes.
Together, these tools reduce friction and eliminate lock-in. Ibis enables portability of analytical intent across execution engines; Narwhals enables portability of dataframe inputs across tools and libraries. As DuckDB, Polars, and other high-performance engines gain adoption, tools like Narwhals ensure that competition and innovation in dataframe libraries don't fragment the broader ecosystem.
If you are interested in exploring how these tools fit into modern data analytics workflows, check out our related articles on DuckDB vs Polars vs Pandas benchmarking, Ibis for backend-agnostic analytics, and our DuckDB and Polars stress test exploring extreme-scale workloads.
Blog author
Niklas Niggemann
Working Student Data & AI
Do you still have questions? Just send me a message.