Data APIs for the Modern Data Stack
OpenDAPI is an open specification for describing data models and their governance policies as Data APIs (DAPIs) — declarative, machine-readable contracts that make data products versioned, reviewable, and automatable.
Why OpenDAPI?
Modern data teams still rely on tribal knowledge and scattered documentation. Schemas evolve without clear ownership, downstream models break silently, and compliance checks happen far too late.
OpenDAPI changes that by treating data like code.
- Data producers take ownership — each dataset defines its own contract (a DAPI file).
- Pre-merge controls enforce metadata, privacy, and purpose tagging before code is merged.
- Post-merge automations keep dbt, replication, and catalogs in sync.
- Self-service data becomes possible — consumers can discover governed datasets with confidence.
In short: DAPIs make data observable, predictable, and governed by design.
What is a DAPI?
A DAPI file is a YAML or JSON document that defines:
- The dataset (name, owner, and schema)
- Each field’s metadata (type, purpose, category, sensitivity)
- References to reusable governance policies (teams, datastores, purposes, subjects, categories)
Example:
# Post.dapi.yaml
schema: https://opendapi.org/spec/0-0-1/dapi.json
urn: woven.prisma.blog_prisma.Post
owner_team_urn: woven.teams.marketing_ops
description: This data model represents a blog post entity within the Prisma framework
  for Woven's data management platform. The data is pulled from Substack and Medium
type: entity
primary_key:
- id
fields:
- name: authorId
  data_type: int
  is_nullable: true
  description: A reference identifier linking the blog post to the author's unique
    ID, establishing authorship and allowing for relational queries.
  data_subjects_and_categories:
  - subject_urn: employee
    category_urn: id.unique_identifier
  sensitivity_level: confidential
  is_personal_data: true
  is_direct_identifier: true
datastores:
  sources:
  - urn: woven.datastores.mysql
    data:
      identifier: production_Post
    business_purposes:
    - marketing_content
    retention_days: 30
  sinks:
  - urn: woven.datastores.snowflake
    data:
      identifier: productionpost
      namespace: production.hourly
    replication_mechanism: fivetran
    origin_urn: woven.datastores.mysql
    freshness: one_hour
    replication_config_urn: fivetran_connection_one_hour
    business_purposes:
    - analytics
    retention_days: 30
privacy_requirements:
  dsr_access_endpoint: dsr.woven.api
  dsr_deletion_endpoint: dsr.woven.api
context:
  service: blog_prisma
  integration: prisma
  rel_model_path: ../prisma/schema.prisma
retention_reference: created_at
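The pre-merge controls mentioned earlier can be pictured as a small check that runs over a parsed DAPI before a change is merged. The following is a minimal Python sketch, not OpenDAPI's actual tooling: the function name `check_dapi`, the list of required keys, and the rule that personal-data fields must carry subject and category tags are all assumptions made for illustration. The DAPI is represented as a plain dict, roughly as it might look after loading the YAML above.

```python
# Hypothetical pre-merge check. The rules and names here are illustrative
# assumptions, not part of the OpenDAPI specification or its tooling.
REQUIRED_TOP_LEVEL = ("urn", "owner_team_urn", "description", "fields")


def check_dapi(dapi: dict) -> list:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if not dapi.get(key):
            problems.append(f"missing top-level key: {key}")

    for field in dapi.get("fields", []):
        name = field.get("name", "<unnamed>")
        if not field.get("description"):
            problems.append(f"field {name}: missing description")
        if "sensitivity_level" not in field:
            problems.append(f"field {name}: missing sensitivity_level")
        # Assumed rule: personal data must be tagged with a subject and a category.
        if field.get("is_personal_data") and not field.get("data_subjects_and_categories"):
            problems.append(f"field {name}: personal data without subject/category tags")
    return problems


# A trimmed-down version of the Post example, written as a dict.
post_dapi = {
    "urn": "woven.prisma.blog_prisma.Post",
    "owner_team_urn": "woven.teams.marketing_ops",
    "description": "A blog post entity.",
    "fields": [
        {
            "name": "authorId",
            "data_type": "int",
            "description": "Reference to the author's unique ID.",
            "data_subjects_and_categories": [
                {"subject_urn": "employee", "category_urn": "id.unique_identifier"}
            ],
            "sensitivity_level": "confidential",
            "is_personal_data": True,
        }
    ],
}

# A hypothetical change that adds a personal-data field with no governance metadata.
untagged = {**post_dapi, "fields": [{"name": "email", "is_personal_data": True}]}

print(check_dapi(post_dapi))
for problem in check_dapi(untagged):
    print(problem)
```

In a CI job, a non-empty result would fail the build, so the gap is caught in review and not after the data has shipped.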
The OpenDAPI Specification
OpenDAPI is defined by a set of JSON Schemas — one for each file type — versioned under /spec.
- dapi.json: Defines the dataset contract (schema, fields, ownership, purposes, and retention).
- datastores.json: Lists physical storage systems or logical warehouses (Snowflake, BigQuery, etc.).
- purposes.json: Defines valid business purposes for data use (e.g., marketing_analytics, risk_reporting).
- teams.json: Maps team identifiers to owners or responsible groups.
- subjects.json: Describes data subjects (individuals, organizations, systems) for privacy tagging.
- categories.json: Defines classification categories (e.g., PII, Financial, Operational).
- opendapi.config.json: Configures OpenDAPI integrations and settings.
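Because a DAPI refers to teams, datastores, and purposes by identifier, one natural check is that every reference resolves to an entry in the corresponding file. The sketch below assumes, for illustration only, that each registry has already been loaded into a set of known identifiers; the real layout of teams.json, datastores.json, and purposes.json is defined by the spec schemas and is not reproduced here. The function name `unresolved_references` is hypothetical, and the sketch assumes business_purposes sits on each datastore entry.

```python
def unresolved_references(dapi: dict, teams: set, datastores: set, purposes: set) -> list:
    """Return references in the DAPI that do not appear in the given registries."""
    missing = []
    owner = dapi.get("owner_team_urn")
    if owner not in teams:
        missing.append(f"team: {owner}")

    stores = dapi.get("datastores", {})
    # Sources and sinks are treated alike: each names a datastore and its purposes.
    for entry in stores.get("sources", []) + stores.get("sinks", []):
        if entry.get("urn") not in datastores:
            missing.append(f"datastore: {entry.get('urn')}")
        for purpose in entry.get("business_purposes", []):
            if purpose not in purposes:
                missing.append(f"purpose: {purpose}")
    return missing


dapi = {
    "owner_team_urn": "woven.teams.marketing_ops",
    "datastores": {
        "sources": [
            {"urn": "woven.datastores.mysql", "business_purposes": ["marketing_content"]}
        ],
        "sinks": [
            {"urn": "woven.datastores.snowflake", "business_purposes": ["analytics"]}
        ],
    },
}

# Hypothetical registries; "analytics" is deliberately left out of the purposes.
teams = {"woven.teams.marketing_ops"}
datastores = {"woven.datastores.mysql", "woven.datastores.snowflake"}
purposes = {"marketing_content"}

print(unresolved_references(dapi, teams, datastores, purposes))
```

Here the sink's purpose is reported as unresolved because it was never registered, which is the kind of drift a shared purposes file is meant to surface.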
Learn More
OpenDAPI is maintained by Woven, which helps data teams bring governance, observability, and automation directly into code.