I don’t think anyone will be surprised when I say that SQL is not the nicest language to work with. Some might even say that it has terrible ergonomics, especially for larger and more complex queries. Still, there are very good reasons why SQL is the choice for accessing and processing data. Fortunately, DuckDB’s friendly SQL dialect tackles a lot of those issues while retaining the familiar style of classic SQL.
The main issue with maintaining large and complex SQL queries boils down to the syntactical order being very different from the semantic order. This has to be kept in mind when writing queries, which makes unfamiliar queries unintuitive and harder to understand: for example, the logical starting point of a query is usually buried somewhere in the middle, and additional subqueries are necessary to use previous results. It can also impair tool support, for example autocompletion of column names before the `FROM` clause has been written.
To show that this can be improved significantly, I’ll start with a somewhat realistic example in PostgreSQL dialect as a stand-in for classic SQL. Then, I’ll go over some of DuckDB’s friendly SQL features and show how they make the same example easier to read and maintain. But before I go into the details, some notes for additional context:
- Do you prefer a hands-on approach?
  - Check out my GitHub repository to run the examples locally.
  - Or use the web IDE from MotherDuck (free account required) and access my example data with this share:

    ```sql
    ATTACH 'md:_share/duckdb-friendly-sql-example-data/0c374195-c236-4b6f-a8ce-02595d90cba6'
    ```
- DuckDB is not the only dialect that improves on classic SQL; it just happens to be one I like a lot. So the shown features might exist elsewhere, too.
- The shown features are just a personal selection, ranging from subtle to fancy, and are by no means an exhaustive list. You might not like all of them (or my formatting), but since DuckDB is largely compatible with PostgreSQL, you can cherry-pick to your liking.
A somewhat realistic example: Rubber Duck KPIs
Minimal examples are great for showing how a feature works, but aren’t enough to showcase their impact on things like developer experience. So, I tried to create a somewhat realistic example: big enough to be meaningful, but not so large that it becomes unwieldy. This way, I can show how each improvement makes a difference.
My query of choice selects a specific product category, adds some extra properties and aggregates a few KPIs grouped by product and customer country.
```sql
WITH
  enriched AS (
    SELECT
      p.product_id AS duck_id,
      p.product_name AS duck_name,
      COALESCE(p.color, 'Unknown') AS duck_color,
      COALESCE(p.material, 'Rubber') AS duck_material,
      c.country AS customer_country,

      oi.quantity,
      oi.unit_price,
      COALESCE(oi.discount_pct, 0) AS discount_pct,
      p.base_cost,

      (
        COALESCE(oi.discount_pct, 0) >= 0.20
        OR COALESCE(p.color, '') IN ('Gold', 'Platinum')
        OR COALESCE(p.material, '') ILIKE '%silicone%'
      ) AS is_premium_duck
    FROM orders AS o
    JOIN order_items AS oi USING (order_id)
    JOIN products AS p USING (product_id)
    JOIN customers AS c USING (customer_id)
    WHERE p.category = 'Rubber Duck'
  )

SELECT
  duck_id,
  format('%s (%s, %s)', duck_name, duck_color, duck_material) AS duck_description,
  customer_country,
  SUM(quantity * unit_price * (1 - discount_pct)) AS net_revenue,
  SUM(quantity * base_cost) AS cogs,
  SUM((quantity * unit_price * (1 - discount_pct)) - (quantity * base_cost)) AS gross_margin
FROM enriched
WHERE is_premium_duck
GROUP BY duck_id, duck_description, customer_country
ORDER BY duck_id, customer_country;
```
Some points that make this kind of query harder to maintain and reduce developer experience:
- Lots of repeated, similar expressions. This makes it cumbersome to change and easy to introduce inconsistencies.
- Not easy to read from top to bottom. For me, the logical starting point would be the `FROM` clause of the `enriched` CTE.
- I would prefer if the calculation of the individual KPIs were not part of the aggregation in the final projection. This can easily be done by introducing another CTE or some subqueries, but that would add even more repetition (and a more verbose example).
- The last one is a personal preference, but I do not like the default alias syntax. The column name can provide helpful context for understanding an expression, yet `expression AS alias` changes the text directionality to a half-baked right-to-left.
In the following sections, I’ll introduce the friendly SQL features I chose with minimal examples and what the somewhat realistic example looks like with them applied.
Friendlier Syntax
The first features for a friendlier SQL are purely syntactical. Some are subtle, but others can take some getting used to. Especially if you have been working with SQL for a long time, the reading patterns you’ve formed might collide with the changed structure.
FROM-first syntax
This one simply lets you put the `FROM` and `JOIN` clauses before the `SELECT` clause.
```sql
FROM orders o
JOIN customers c USING (customer_id)
SELECT o.order_id, o.order_date, c.country;
```
This makes it a lot easier to read queries from top to bottom. For more details, check out the official documentation.
Trailing commas
Many modern programming languages allow trailing commas in different places, and for good reason. Being able to reorder projections or comment out lines without thinking about it makes working on queries a lot smoother.
```sql
FROM products
SELECT
  product_id,
  product_name,
  base_cost,
;
```
It's something you don't really notice at first, but it is sorely missed once it's gone. Unfortunately, trailing commas aren't supported everywhere: most notably in CTEs, but `ORDER BY` doesn't allow them either.
Prefix aliases
Putting the column alias before its expression is not a subtle change and is absolutely a matter of personal preference.
```sql
FROM products
SELECT
  duck_id: product_id,
  duck_name: product_name,
  duck_color: color,
WHERE
  category = 'Rubber Duck';
```
I find it makes queries easier to read, but I can see how it can complicate formatting and might feel unnatural when you're used to `expression AS alias`. Ultimately, it's more important to pick a style and stick with it than which one is chosen. Like trailing commas, prefix aliases aren't supported everywhere. For example, you can't use them in the star and `COLUMNS` expressions (which I'll cover later).
A friendlier somewhat realistic example
Applying all three of these features to the example query is a solid first step toward a better developer experience.
```sql
WITH
  enriched AS (
    FROM o: orders
    JOIN oi: order_items USING (order_id)
    JOIN p: products USING (product_id)
    JOIN c: customers USING (customer_id)
    SELECT
      duck_id: p.product_id,
      duck_name: p.product_name,
      duck_color: COALESCE(p.color, 'Unknown'),
      duck_material: COALESCE(p.material, 'Rubber'),
      customer_country: c.country,

      oi.quantity,
      oi.unit_price,
      discount_pct: COALESCE(oi.discount_pct, 0),
      p.base_cost,

      is_premium_duck: (
        COALESCE(oi.discount_pct, 0) >= 0.20
        OR COALESCE(p.color, '') IN ('Gold', 'Platinum')
        OR COALESCE(p.material, '') ILIKE '%silicone%'
      ),
    WHERE p.category = 'Rubber Duck'
  )

FROM enriched
SELECT
  duck_id,
  duck_description: format('{0} ({1}, {2})', duck_name, duck_color, duck_material),
  customer_country,
  net_revenue: SUM(quantity * unit_price * (1 - discount_pct)),
  cogs: SUM(quantity * base_cost),
  gross_margin: SUM((quantity * unit_price * (1 - discount_pct)) - (quantity * base_cost)),
WHERE is_premium_duck
GROUP BY duck_id, duck_description, customer_country
ORDER BY duck_id, customer_country;
```
Reusable column aliases
In classic SQL, you can't reuse aliased expressions within the same scope. This leads to lots of repeated code or extra subqueries and CTEs. Complex queries can quickly become hard to maintain, and balancing their length and legibility is not an easy feat.
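For illustration, here is a sketch of what that repetition looks like in classic SQL, using the same hypothetical `order_items` columns as the examples below. Since standard SQL doesn't let the `SELECT` list or the `WHERE` clause see aliases defined in the same scope, every derived value has to be spelled out again wherever it is needed:

```sql
-- Classic SQL: each derived value is repeated in full,
-- because aliases aren't visible within the same scope.
SELECT
  unit_price * (1 - discount_pct) AS discount_price,
  unit_price * (1 - discount_pct) * quantity AS total_price,
  unit_price * (1 - discount_pct) * quantity * (1 + tax_rate) AS taxed_price
FROM order_items
WHERE unit_price * (1 - discount_pct) * quantity * (1 + tax_rate)
  - unit_price * (1 - discount_pct) * quantity > 5;
```

Changing the discount formula now means touching five places instead of one.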
The way DuckDB allows using aliases directly is a major reason why I consider it a game changer. It enables us to think in a more direct fashion when constructing queries. Chaining calculations together in small increments feels very smooth and is easier to follow.
```sql
FROM order_items
SELECT
  discount_price: unit_price * (1 - discount_pct),
  total_price: discount_price * quantity,
  taxed_price: total_price * (1 + tax_rate),
WHERE
  taxed_price - total_price > 5;
```
But minimal examples like this undersell the difference this makes, and even the somewhat realistic example is, in my opinion, too small to really show it. So I suggest everyone take their SQL monster of the week and see what changes when aliases can be reused.
```sql
WITH
  enriched AS (
    FROM o: orders
    JOIN oi: order_items USING (order_id)
    JOIN p: products USING (product_id)
    JOIN c: customers USING (customer_id)
    SELECT
      duck_id: p.product_id,
      duck_name: p.product_name,
      duck_color: COALESCE(p.color, 'Unknown'),
      duck_material: COALESCE(p.material, 'Rubber'),
      customer_country: c.country,

      duck_discount: COALESCE(oi.discount_pct, 0),
      net_revenue_line: oi.quantity * oi.unit_price * (1 - duck_discount),
      cogs_line: oi.quantity * p.base_cost,
      gross_margin_line: net_revenue_line - cogs_line,

      is_premium_duck: duck_discount >= 0.20 OR duck_color IN ('Gold', 'Platinum') OR duck_material ILIKE '%silicone%',
    WHERE p.category = 'Rubber Duck'
  )

FROM enriched
SELECT
  duck_id,
  duck_description: format('{0} ({1}, {2})', duck_name, duck_color, duck_material),
  customer_country,
  net_revenue: SUM(net_revenue_line),
  cogs: SUM(cogs_line),
  gross_margin: SUM(gross_margin_line),
WHERE is_premium_duck
GROUP BY duck_id, duck_description, customer_country
ORDER BY duck_id, customer_country;
```
A few differences I want to highlight:
- Reusing aliases in the same `SELECT` statement makes the `enriched` CTE much more concise.
  - Columns like `duck_color` and `duck_material` can be reused, making it feasible to put the `is_premium_duck` expression on a single line.
  - The KPIs can be calculated step by step, without needing an intermediate subquery.
- Moving the KPIs to the `enriched` CTE improves separation of concerns.
  - `enriched` becomes a collection of self-contained, pre-processed data points.
  - The final projection builds the target structure and calculates the desired aggregations.
Star & COLUMNS expression
The previous two categories were more on the low-key end of things, so I wanted my last pick to be a bit fancy. DuckDB's star and `COLUMNS` expressions are powerful additions. But, as with any powerful tool, overusing them can make things worse rather than better. And, as mentioned earlier, prefix aliases do not work here, so the classic alias syntax has to be used.
Extended star expression capabilities
A shorthand to select every column of a table is useful, but what if you need every column except one or two? In classic SQL, you would have to list all the columns you want. DuckDB makes this easier by letting you use `EXCLUDE`, `REPLACE`, and `RENAME` clauses with the star expression.
```sql
-- EXCLUDE
FROM products
SELECT * EXCLUDE(category)
WHERE category = 'Rubber Duck';

-- REPLACE
FROM order_items
SELECT * REPLACE(
  lpad(order_item_id::VARCHAR, 4, '0') AS order_item_id,
  lpad(product_id::VARCHAR, 4, '0') AS product_id,
);

-- RENAME
FROM customers
SELECT * RENAME(customer_name AS full_name);

-- Combination of all
FROM customers
SELECT
  * EXCLUDE(country)
    REPLACE('000' || customer_id AS customer_id)
    RENAME(customer_name AS full_name),
;
```
This isn’t everything that has been added to the star expression. You can find all features in the documentation.
COLUMNS expression
The extended star expression is already powerful, but the `COLUMNS` expression takes things even further. For example, you can apply the same expression to all matching columns, or even rename columns via regular expressions. The latter is something I would absolutely mark as "danger territory".
```sql
FROM products
SELECT lower(
  COLUMNS(* EXCLUDE (product_id, base_cost))
);

FROM order_items
SELECT COLUMNS('(.+)_id') AS '\1';
```
There’s a lot more you can do with this. Check out the documentation for all the details.
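One more capability worth a quick sketch before moving on: if I recall the documentation correctly, `COLUMNS` also accepts a lambda that filters columns by name, which can read friendlier than a regular expression. A sketch against the example's `order_items` table:

```sql
-- Select every column whose name ends in '_id',
-- using a lambda on the column name instead of a regex.
FROM order_items
SELECT COLUMNS(c -> c LIKE '%_id');
```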
A fancy somewhat realistic example
Applying those features to the somewhat realistic example isn’t straightforward, so I had to introduce an additional CTE. This makes it somewhat more artificial, but it should be easy enough to imagine reasons why a query evolves this way over time.
```sql
WITH
  enriched AS (
    FROM o: orders
    JOIN oi: order_items USING (order_id)
    JOIN p: products USING (product_id)
    JOIN c: customers USING (customer_id)
    SELECT
      duck_id: p.product_id,
      duck_name: p.product_name,
      duck_color: COALESCE(p.color, 'Unknown'),
      duck_material: COALESCE(p.material, 'Rubber'),
      customer_country: c.country,

      duck_discount: COALESCE(oi.discount_pct, 0),
      net_revenue_line: oi.quantity * oi.unit_price * (1 - duck_discount),
      -- tax_collected_line: net_revenue_line * COALESCE(oi.tax_rate, 0),
      cogs_line: oi.quantity * p.base_cost,
      gross_margin_line: net_revenue_line - cogs_line,

      is_premium_duck: duck_discount >= 0.20 OR duck_color IN ('Gold', 'Platinum') OR duck_material ILIKE '%silicone%',
    WHERE p.category = 'Rubber Duck'
  ),
  premium_with_description AS (
    FROM enriched
    SELECT
      * EXCLUDE (duck_name, duck_color, duck_material),
      duck_description: format('{0} ({1}, {2})', duck_name, duck_color, duck_material),
    WHERE is_premium_duck
  )

FROM premium_with_description
SELECT
  duck_id,
  duck_description,
  customer_country,
  SUM(COLUMNS('(.*)_line')) AS '\1',
GROUP BY duck_id, duck_description, customer_country
ORDER BY duck_id, customer_country;
```
Being able to pick'n'mix columns with less verbosity is nice to have, but adding a new KPI by simply adding a column named `<KPI name>_line` to `enriched` is very fancy. It's also a bit scary, because it significantly increases the potential for side effects. In comparison, a similar query in classic SQL is about 15 lines (35%) longer and requires touching every part of the query to add a new KPI.
Wrapping up
I hope this has given you a sense of why I think DuckDB’s friendly SQL is a game changer for developers and data engineers once the code grows beyond a few lines. If you’re considering DuckDB for your next project, keep in mind that it is explicitly scoped as a simple and fast DBMS. Something else is needed for orchestration, distributed workloads, scalability and other big-project needs. One option is MotherDuck, which is the only tool I know of that has DuckDB at its core. So it might be the right one for you.
Also, this has been just a glimpse of the features DuckDB offers, ranging from nice things making SQL friendlier to having completely new possibilities. Here is an arbitrary list of things, with links to their documentation, that didn’t fit in the scope:
- `QUALIFY` clause for filtering immediately on window function results.
- `GROUP BY ALL` and `ORDER BY ALL` so you can omit the columns entirely.
- First-class support for complex data structures like lists, structs & maps, including list comprehension and slicing.
- `CREATE OR REPLACE` as a shorthand for overwriting objects.
- Lambda functions to turn SQL on its head.
- Additional join types like `ASOF` and `POSITIONAL`.
- Recursive CTEs with `USING KEY`, allowing access to results beyond the previous iteration.
- Dot operator for function chaining so statements don't have to end in five closing brackets.
- And everything else in the friendly SQL docs.
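To give a small taste of the first two items, here is a sketch against the example's tables; it assumes `orders` has `customer_id` and `order_date` columns, and `order_items` has `product_id` and `quantity`:

```sql
-- QUALIFY: keep only each customer's most recent order,
-- filtering directly on the window function result.
FROM orders
SELECT
  customer_id,
  order_date,
  rn: row_number() OVER (PARTITION BY customer_id ORDER BY order_date DESC),
QUALIFY rn = 1;

-- GROUP BY ALL: group by every non-aggregated column
-- without listing them again.
FROM order_items
SELECT
  product_id,
  total_quantity: SUM(quantity),
GROUP BY ALL;
```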
Is there anything on the list above you’d like to hear more about? Did I miss an obvious DuckDB feature that would improve the somewhat realistic example? Do you think my SQL formatting is awful and prefix aliases shouldn’t exist? Let me know and I’ll see what I can do.
Blog author
Lennart Hensler
Do you still have questions? Just send me a message.