
ODPS: The Standard for Data Products

7.11.2025 | 4 minutes reading time

The data landscape in an organization often looks like this: teams gather and produce data every day. Each team develops its own metadata models and documentation, if there is any at all. Governance policies are scattered across spreadsheets, wikis, and Confluence pages, and are outdated most of the time. Data quality is enforced manually. Data usage is monetized mostly manually, if it is monetized at all; more often, one-time CSV exports are sent around. Service Level Agreements are often agreed on in undocumented meetings, so there are no machine-readable commitments that can be monitored.

You end up with datasets that cannot scale beyond their original team, cannot be discovered or evaluated by AI agents, and cannot cross organizational boundaries without extensive manual coordination.


Finding the Right Data Without Standards [Symbolic Image]

Start Thinking in Data Products

You should start to think in data products. The data landscape described above is built mostly on datasets rather than data products. Datasets are generated without a consumer in mind. They are the outcome of a team’s daily work and lack company-wide intention. Data products, on the other hand, serve a purpose. They are business-oriented and solve specific problems. As a result, data products are worth much more to an organization than datasets.

Why Standards Could Help

Data products are most valuable if your whole organization shares the same understanding of them. If there is a universal language, each team knows how to assess a data product. This is the perfect situation to introduce a standard. Without one, every team invents its own format for data product metadata and documents it in a different place. AI agents cannot discover data products automatically, and manual discovery is only possible with tribal knowledge.


Organizing Data Products With ODPS [Symbolic Image]

Open Data Product Specification (ODPS)

The Open Data Product Specification (ODPS) is the standard to address this problem. It is a machine-readable YAML standard designed to describe data products in a consistent and interoperable way. Developed as a vendor-neutral, open-source initiative, it provides a unified language that can be interpreted automatically.

ODPS (v4.1) has been designed to provide information about technical, business, legal, and ethical aspects in ten core elements:

ODPS Design [Image from opendataproducts.org]

Technical Aspects (Infrastructure & Data Access)

Data Access: How to technically access the data (e.g. APIs, files, SQL, AI/MCP)

Data SLA: Technical service commitments (e.g. uptime, latency, response time)

Data Quality: Technical quality dimensions and validation rules

Data Contract: Technical expectations and schema definitions

Business Aspects (Product Strategy, Pricing, and Plans)

Data Product Details: Business metadata (e.g. name, description, value proposition, use cases)

Data Product Strategy: Business objectives and KPI alignment

Data Pricing Plans: Pricing models and monetization strategy

Payment Gateways: Transaction processing for monetization

Legal Aspects (Licensing & Intellectual Property Rights)

Data Licensing: Legal terms, usage rights, geographical restrictions

Data Holder: Legal entity responsible for the data product

Ethical Aspects (Data Privacy)

Data Licensing: Privacy terms, data subject rights, confidentiality

Data Access: Access control and authentication (who can access what)

Data Holder: Accountability for ethical data handling

Everything-as-Code

The key feature of ODPS is that everything is machine-readable code. Traditionally, governance documentation exists, if at all, in Word documents, PDF files, or the internal wiki. To enforce a policy, you manually translate that documentation into code that monitors your SLAs or your data quality. You have to repeat this for every product, and the process is error-prone.

ODPS eliminates the downsides of manual translation because the specification itself is the source for automation.

SLA:
  declarative:
    default:
      [...]
      dimensions:
        - dimension: uptime
          displaytitle:
            en: Uptime
          objective: 90
          unit: percent
  executable:
    - dimension: uptime
      type: prometheus
      reference: 'https://prometheus.io/docs/prometheus/latest/querying/basics/'
      spec: |-
        avg_over_time(
          (
            sum without() (up{job="prometheus"})
              or
            (0 * sum_over_time(up{job="prometheus"}[7d]))
          )[7d:5m]
        )

With the query in the executable part, you can automatically generate Prometheus monitoring, such as rules and dashboard queries, in your CI/CD pipeline. The metadata becomes executable infrastructure. Because ODPS supports reusable components, you only have to change one YAML file to update production across 100 data products. This is infrastructure-as-code applied to data products.
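As a minimal sketch of how a pipeline step could consume an SLA block like the one above, the following Python snippet pairs each declarative objective with its executable query and emits Prometheus alerting rules. The function name `build_alert_rules`, the alert naming scheme, and the assumption that the query returns a ratio between 0 and 1 (hence the multiplication by 100 for a percent objective) are illustrative choices, not part of ODPS. The input is a plain dict that mirrors the YAML structure, with a shortened query.

```python
def build_alert_rules(sla: dict) -> dict:
    """Turn an ODPS-style SLA dict into a Prometheus alerting rule group."""
    # Index declarative objectives by dimension name, e.g. {"uptime": {...}}.
    objectives = {
        d["dimension"]: d
        for d in sla["declarative"]["default"]["dimensions"]
    }
    rules = []
    for check in sla.get("executable", []):
        # Only Prometheus checks with a declared objective are handled.
        target = objectives.get(check["dimension"])
        if check.get("type") != "prometheus" or target is None:
            continue
        query = check["spec"].strip()
        # Assumption: the query yields a 0..1 ratio, the objective is in percent.
        scale = " * 100" if target.get("unit") == "percent" else ""
        rules.append({
            "alert": f"{check['dimension'].capitalize()}BelowObjective",
            "expr": f"({query}){scale} < {target['objective']}",
        })
    return {"groups": [{"name": "odps-sla", "rules": rules}]}


# Hypothetical input mirroring the YAML example (query shortened).
sla = {
    "declarative": {"default": {"dimensions": [
        {"dimension": "uptime", "objective": 90, "unit": "percent"},
    ]}},
    "executable": [{
        "dimension": "uptime",
        "type": "prometheus",
        "spec": 'avg_over_time(up{job="prometheus"}[7d])',
    }],
}

rule_group = build_alert_rules(sla)
print(rule_group["groups"][0]["rules"][0]["expr"])
```

Serialized to YAML, the returned structure follows the layout of a Prometheus rule file (`groups` containing `rules` with `alert` and `expr`), so a CI job could write it next to the data product definition.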

Where You Could Start

ODPS standardizes data product metadata for automation. Developers get reusable components and tooling integration. Decision makers get faster scaling, reduced governance overhead, and vendor independence. The standard is production-ready, actively maintained, and backed by the Linux Foundation.

ODPS makes sense for you if you have multiple data products that require consistent governance, want to monetize your data (internally or externally), want to scale governance without adding headcount, have a data mesh architecture, or want to provide structured metadata to your AI agents so they can gain valuable insights from your data products.

To see if the Open Data Product Specification provides value to your organization, I recommend the following: Pick one of your existing data products and map its metadata to the ODPS format to see what gaps the specification reveals in your current documentation.
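A first gap check can be as small as the following sketch. It compares the metadata you already have for one data product against the ten core elements listed earlier. The element labels are taken from this article rather than from the exact ODPS field names, and `find_gaps` as well as the sample `existing_metadata` dict are hypothetical.

```python
# The ten core elements as named in this article (not ODPS field names).
CORE_ELEMENTS = [
    "Data Access", "Data SLA", "Data Quality", "Data Contract",
    "Data Product Details", "Data Product Strategy",
    "Data Pricing Plans", "Payment Gateways",
    "Data Licensing", "Data Holder",
]


def find_gaps(metadata: dict) -> list[str]:
    """Return the core elements that are missing or empty in `metadata`."""
    return [name for name in CORE_ELEMENTS if not metadata.get(name)]


# Hypothetical documentation state of one existing data product.
existing_metadata = {
    "Data Product Details": {"name": "Customer Orders"},
    "Data Access": {"type": "SQL"},
    "Data SLA": {},  # discussed in a meeting, never written down
    "Data Holder": {"legal_name": "Example GmbH"},
}

for gap in find_gaps(existing_metadata):
    print(f"missing: {gap}")
```

An empty entry counts as a gap on purpose: an SLA that was only agreed on verbally is as invisible to automation as one that does not exist.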

You can start simply by managing ODPS files in Git to streamline your documentation and automate your governance. In the future, you could introduce an ODPS-supporting data marketplace like Entropy Data or Alation to gain even more value from your data products.

Sources:
https://unsplash.com/photos/office-table-with-pile-of-papers-6jA6eVsRJ6Q
https://unsplash.com/photos/multi-story-white-painted-library-interior-RtQ1CTQMySk
https://opendataproducts.org/v4.1/images/ODPS-design-971d4b0b.png
