Database design using Anchor Modeling

27.7.2017 | 11 minutes reading time

Anchor modeling offers agile database design, immutable data storage, and temporal queries using a regular relational database. This catchy excerpt certainly piqued my interest two years ago at the Data Modeling Zone conference in Hamburg.

I enjoy making discipline crossovers, and in this article I would like to discuss the concept of Anchor modeling in relation to software development. Anchor modeling has become a trendy subject in the field of business intelligence (BI). However, not everybody will be familiar with what’s buzzing in BI land, so I will provide some background on Anchor modeling and then discuss its potential merits for developers.

A lot of companies now favour an iterative approach over a big up-front design approach, although it might not feel like that every day. In general I would say we are certainly improving our capability to change the things we built in the past and to adapt our plan for what we will build in the future. However, the big improvements have generally been at the source code level of our software and, in recent years, in its deployment. While the refactorability of our code might have improved a great deal, I feel industry practices regarding incremental database design changes are not up to par in most development teams. This is exactly what Anchor Modeling promises to address.

A word of warning. To keep this article readable, I might have stereotyped software development in this article.

Where did Anchor Modeling originate?

Anchor modeling originated in the field of data warehouses ten years ago and was formally presented in academia in 2009. The basic premise of Anchor Modeling is that it offers non-destructive model iterations while avoiding null values and eliminating redundancies. Anchor modeling is an extension of normalisation practices where the result is mostly in the 6th normal form. But don’t be alarmed or put off. A major part of this design approach is that Anchor modeling gives you the means to make this design manageable.

In recent years, these practices have found their way from academia to the business intelligence discipline in particular. I am particularly interested in how I, as a developer, can learn from these advances and where I can apply them in practice.

Anchor modeling basics

Anchor modeling has four basic building blocks: anchors, attributes, knots and ties. Each of these building blocks is implemented as its own table. The central concept of any anchor model is the anchor. The four concepts are explained below.

An anchor holds the identity of an entity using a generated surrogate key rather than its natural key. Examples could be “Tenant” and “House”.
A tie defines the relation between usually two – but potentially more – entities (and thus anchors). An example would be a relation from Tenant to House that defines the ownership.
An attribute represents a piece of information: it holds the attribute value and is related to an anchor. Attributes are strongly typed. An example could be “Rent”.
A knot defines the limited set of states that a tie or attribute can take. It effectively provides context, for example a fixed set of genders for an attribute, or an active flag (yes/no) for a tie.
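To make the four building blocks a bit more tangible, here is a minimal sketch of what such tables could look like. It uses SQLite through Python's sqlite3 module purely for convenience. All table and column names are my own assumptions; the SQL generated by the online tool follows its own naming scheme and is not reproduced here.

```python
import sqlite3

# Hypothetical names for illustration only; the Anchor Modeling tool
# generates its own (different) naming scheme.
SCHEMA = """
CREATE TABLE tenant (tenant_id INTEGER PRIMARY KEY);   -- anchor
CREATE TABLE house  (house_id  INTEGER PRIMARY KEY);   -- anchor

CREATE TABLE house_rent (                              -- attribute of House
    house_id INTEGER PRIMARY KEY REFERENCES house(house_id),
    rent     INTEGER NOT NULL
);

CREATE TABLE active_knot (                             -- knot: fixed set of states
    active_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL UNIQUE
);

CREATE TABLE tenant_house (                            -- tie between two anchors
    tenant_id INTEGER NOT NULL REFERENCES tenant(tenant_id),
    house_id  INTEGER NOT NULL REFERENCES house(house_id),
    active_id INTEGER NOT NULL REFERENCES active_knot(active_id),
    PRIMARY KEY (tenant_id, house_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# The knot holds the allowed states; the tie only references them.
conn.executemany("INSERT INTO active_knot VALUES (?, ?)", [(1, "yes"), (2, "no")])
conn.execute("INSERT INTO tenant VALUES (1)")
conn.execute("INSERT INTO house VALUES (10)")
conn.execute("INSERT INTO house_rent VALUES (10, 950)")
conn.execute("INSERT INTO tenant_house VALUES (1, 10, 1)")

# Reassemble the familiar "wide" picture by joining the small tables.
rows = conn.execute("""
    SELECT th.tenant_id, th.house_id, r.rent, k.label
    FROM tenant_house th
    JOIN house_rent  r ON r.house_id  = th.house_id
    JOIN active_knot k ON k.active_id = th.active_id
""").fetchall()
print(rows)
```

Note how the anchors carry nothing but a surrogate key: every piece of information about a house or tenant lives in its own narrow table.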

Let’s visualise this using the free online modeling tool on www.anchormodeling.com.

Anchor model

House and Tenant are the anchors here, connected using a tie. AmountOfRooms, Rent and PhoneNumber are attributes attached to these anchors. Exported to PostgreSQL, this simple model will give us six tables, twelve views to process this information, and 21 stored procedures. This might look complex, but the tables are our main focus, and you could easily follow Anchor Modeling principles without the online tool and its SQL generation. When you are starting out, the tool is simply an easy way to get going, and you get highly optimised views to query information out of the box.

Why Anchor modeling in online transaction processing (OLTP) software?

As a developer your first question at this point could be: why blog about a relational database technique in 2017? Well, it may not be the most popular topic to write about in the development scene, but it is still the most popular storage technology. On the topic of agility, many NoSQL solutions offer relaxed schemas as a way of dealing with changes in requirements, which is nice of course. But NoSQL technology has its own downsides, which are really product specific (and beyond the scope of this article). I also personally think that the richness of an SQL interface to your data storage is as yet unmatched in NoSQL technology and is still highly appreciated. Moreover, initiatives like the development of the highly potent CockroachDB seem to support that idea and could potentially unleash a new wave of RDBMS adoption. So it seems that keeping some of your eggs in the relational basket and some of them in the latest-practices basket will go a long way.

As developers we often only deal with databases as ‘the system of record’ in the systems that support a business function. The storage behind our applications that we store into and retrieve from is often aimed at producing information about business operations. Information is mostly stored in a state and structure that is very closely related to the software that uses it. So having a highly normalised database might be a scary thought for a developer. At university you were probably taught the basic principles of database design and implementation. If you are like me, you probably haven’t encountered the same amount of academic purity in the databases you use in your working life. A common – potentially outdated – understanding is that highly normalised databases have many practical, but also performance-related, downsides on the operational front. Another problem is that, as programmers, we have to deal with the structure of information in our storage, and this tends to be reflected in the representation of the models in our code. One could say that in many cases the design of the database leaks through. Of course this also happens the other way around: a customer object that consists of ten fields could end up in a database in a single table with ten columns. We are used to creating an abstraction layer in our software to protect the business side of our application from changes in the database and vice versa. But I would argue protection is not good enough.

This is the case I am specifically interested in. I would like to discuss how Anchor modeling can aid an agile team in properly designing, building, and maintaining a relational storage layer. Secondly, my goal is to show that we can do proper design iterations in an agile team. If storage is easier to modify, people are less inclined to try to get the design right in one go.

Agility and temporality

One of the defining features of anchor modeling is the capacity for non-destructive schema evolution. In other words, the characteristics of the storage can differ over time without invasive redesigns and large migrations. This is certainly desirable if you are designing and maintaining a huge data lake and have to satisfy business wishes on a weekly basis, as a business intelligence analyst often does. So let’s see how these rules apply to business software development.

Anchor modeling has been designed with temporality in mind. Attributes and ties can both be of a temporal nature. This is optional. Ties and attributes can be meta-dated with timestamps to signal the lifespan of relations. I will show an example later on.
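As a rough illustration of what a temporal attribute could look like, the sketch below stores every rent value as its own row with a timestamp and exposes the current value through a view. The names `house_rent`, `changed_at` and `house_rent_latest` are assumptions of mine, and SQLite is used only to keep the example self-contained; the views generated by the Anchor Modeling tool are more elaborate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE house (house_id INTEGER PRIMARY KEY);

-- Historized attribute: one row per value, never updated in place.
CREATE TABLE house_rent (
    house_id   INTEGER NOT NULL REFERENCES house(house_id),
    rent       INTEGER NOT NULL,
    changed_at TEXT    NOT NULL,   -- ISO 8601 date, so text order = time order
    PRIMARY KEY (house_id, changed_at)
);

-- "Latest" view: the most recent rent per house.
CREATE VIEW house_rent_latest AS
SELECT r.house_id, r.rent, r.changed_at
FROM house_rent r
WHERE r.changed_at = (
    SELECT MAX(x.changed_at) FROM house_rent x WHERE x.house_id = r.house_id
);
""")

conn.execute("INSERT INTO house VALUES (10)")
# A rent change is recorded as a new row, not as an UPDATE.
conn.executemany(
    "INSERT INTO house_rent VALUES (?, ?, ?)",
    [(10, 900, "2015-01-01"), (10, 950, "2016-07-01")],
)

latest = conn.execute("SELECT house_id, rent FROM house_rent_latest").fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM house_rent").fetchone()[0]
print(latest, history_count)
```

The application reads the current rent from the view, while the full history stays available in the underlying table.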

Let’s enumerate the more invasive database changes that we think about on a daily basis, given that we are used to dealing with databases that are designed foremost to remove redundancy and are often in 3rd normal form.

	Conventional database	Anchor modeling
A new column	This changes the structure of a table. Adding a column is still moderately easy: we alter the existing table. Sometimes you can keep your application alive, but it is most likely unable to query the table for some time. Rollbacks are possible but require some effort.	Using anchor modeling you add an attribute table and, if required, a knot table to your database. Existing tables are left untouched, so most RDBMSs will stay up and running.
Removal of a column	This changes the structure of a table. Removing a column is difficult. The table is locked during the transaction. The application needs migration to be able to handle the new table design. Rollbacks are not possible or require a lot of preparation.	Using anchor modeling, removing an attribute of an entity is substantially easier. By default everything is designed to be immutable, so removing the relation between an entity and an attribute will not actually cause an update to the database schema. Inserts into the knot table are enough to couple or decouple a relation. There is also the option to use the temporal aspect of an anchor model design.
Removal of a table	This changes the structure of a schema. Removing a table is difficult. The table is locked during the transaction, and related tables often are as well. It is destructive in nature. The application needs migration. Rollbacks are not possible or require a lot of preparation.	Using anchor modeling, removing an entity is substantially easier. By default everything is designed to be immutable, so removing an entity will not actually cause an update to the database schema. The relations can simply be invalidated.
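The “new column” row of this comparison can be sketched in a few lines. Assuming a hypothetical new requirement (an e-mail address per tenant) and hypothetical table names, the example adds a new attribute table and then checks that the definitions of the existing tables are byte-for-byte the same as before. SQLite is used only for convenience; it says nothing about locking behaviour in other database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tenant (tenant_id INTEGER PRIMARY KEY);
CREATE TABLE tenant_phone_number (
    tenant_id    INTEGER PRIMARY KEY REFERENCES tenant(tenant_id),
    phone_number TEXT NOT NULL
);
""")
conn.execute("INSERT INTO tenant VALUES (1)")
conn.execute("INSERT INTO tenant_phone_number VALUES (1, '555-0100')")


def table_definitions(connection):
    """Return {table name: CREATE statement} as stored by SQLite."""
    return dict(connection.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ))


before = table_definitions(conn)

# Hypothetical new requirement: store an e-mail address per tenant.
# Instead of ALTER TABLE on an existing table, a new attribute table is added.
conn.execute("""
CREATE TABLE tenant_email (
    tenant_id INTEGER PRIMARY KEY REFERENCES tenant(tenant_id),
    email     TEXT NOT NULL
)
""")
conn.execute("INSERT INTO tenant_email VALUES (1, 'tenant@example.com')")

after = table_definitions(conn)
# Every table that existed before still has exactly the same definition.
unchanged = all(after[name] == sql for name, sql in before.items())
added = set(after) - set(before)
print(unchanged, added)
```

Code that only knows about `tenant` and `tenant_phone_number` keeps working, because nothing it depends on has been altered.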

In the example above the rent and telephone number are marked as temporal (indicated by a dot in the design), as is the relation between a tenant and a house. This means that when things change, the current reality can be replaced by a new one. By leveraging the power of SQL select statements combined with parameterized views, one can even travel in time to see what the state of our database looked like on Christmas 2015! And thanks to the dispersed character of the data, this is agnostic of both the structure and the content of your tables.
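A point-in-time query of this kind could look roughly as follows. This is a sketch under my own assumptions: the table `house_rent`, its `changed_at` column and the helper `rent_as_of` are hypothetical names, and a plain parameterized query in SQLite stands in for the parameterized views that the tool generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE house_rent (
    house_id   INTEGER NOT NULL,
    rent       INTEGER NOT NULL,
    changed_at TEXT    NOT NULL,   -- ISO 8601 date, so text order = time order
    PRIMARY KEY (house_id, changed_at)
);
""")
conn.executemany("INSERT INTO house_rent VALUES (?, ?, ?)", [
    (10, 900, "2015-01-01"),
    (10, 950, "2016-07-01"),
    (11, 700, "2016-02-01"),
])


def rent_as_of(connection, as_of):
    """Return {house_id: rent} as it was valid on the given ISO date."""
    rows = connection.execute("""
        SELECT r.house_id, r.rent
        FROM house_rent r
        WHERE r.changed_at = (
            SELECT MAX(x.changed_at)
            FROM house_rent x
            WHERE x.house_id = r.house_id
              AND x.changed_at <= :as_of
        )
    """, {"as_of": as_of})
    return dict(rows)


christmas_2015 = rent_as_of(conn, "2015-12-25")
mid_2017 = rent_as_of(conn, "2017-07-27")
print(christmas_2015, mid_2017)
```

House 11 has no rent recorded before February 2016, so it simply does not appear in the Christmas 2015 result; no null value is needed to express that.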

Drawbacks

So, enough with all those nice features. Let’s get some drawbacks on the table. Well, for starters there’s of course the mental effort it takes to fill your head with the number of tables used by a 6th normal form database. I think one of the biggest contributions to the enormous success of RDBMS is the degree to which the solution fits in with our mental (3rd normal form) model of the world.

It is perfectly possible (I have seen this first-hand) for people to adopt these design principles and write down properly structured tables. However, most people will need a tool for this. Luckily most RDB design tools will work, and the more advanced ones like Vertica offer features to support this work for use cases where the landscape would otherwise be too complex to grasp.

Your performance behaviour will vary. I am not a professional DBA, nor do I profile databases on a daily basis, so I will refrain from making bold performance claims. One thing to note is that Anchor modeling is built with SQL execution optimization in mind. It will use many advanced features that modern systems offer, one of the most important being table elimination. Not every product supports this, which limits the number of products you can use. The following overview is taken from www.anchormodeling.com.

Database Engine	Support
Microsoft SQL Server 2005	full
Microsoft SQL Server 2008	full
Oracle 10gR2 Express Edition*	partial
Oracle 11gR1 Enterprise/Express Edition	full
IBM DB2 v9.5	full
PostgreSQL v8.4 beta	full
Teradata v12**	partial
MySQL v5.0.70	none
MySQL v6.0.10 alpha	none
MariaDB v5.1	full
Sybase	not tested

Conclusions

So far we have gone over the concepts of Anchor modeling and seen some nice features that Anchor modeled databases can offer. By leveraging the temporal nature of a database design a development team is potentially better able to adapt to change. The temporal nature of a relation in this design makes schema evolution non-disruptive. You will no longer be restricted by the decisions of your past self (or a colleague). Furthermore the concept of Anchor modeling delivers a set of rules that implicitly work towards a proper design.

A second important lesson I took from looking into this is that I, as a 30-year-old developer, have become biased by the maturity of RDBMSs. I normally work with technology that abstracts away the knowledge I need when working with databases – like Hibernate – and I had in general lost interest in what problems databases can solve for me. Added to this is of course a movement towards the hip and buzzing NoSQL solutions over the past years. In the development scene I just don’t see many developers blogging about, for example, what’s new and hip in the latest PostgreSQL release, or how a new release opens up nice technological opportunities. So this new kind of modeling definitely renewed my interest.

The third big takeaway is that the world beyond 3rd normal form has come within reach and should be part of our toolset when we face problems in practice. Anchor modeling is not something that you have to adopt wholesale. It is a design approach that helps you to deal with change. You could for example only apply it to the parts of the system that have a high rate of change, like your product entity.

In the end it’s a matter of making a proper design decision, and Anchor modeling is not a silver bullet either 😉

A lot of credit is due to the nice reads and introduction at:

http://www.anchormodeling.com
Regardt, O., Rönnbäck, L., Bergholtz, M., Johannesson, P., & Wohed, P. (2009, November). Anchor Modeling. In ER (pp. 234-250).

Blog author

Kevin van