
Common Metadata Framework (CMF)

CMF is a framework for tracking the full provenance of data, code, and metadata across complex scientific workflows. It brings version control and reproducibility to data science, much as Git does for source code.

What It Does

  • Treats code, data, and metadata as a single atomic unit
  • Tracks the full graph of inputs, transformations, and outputs
  • Supports both local and distributed repositories
  • Enables reproducible, auditable, and shareable workflows
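To make the ideas in the list above concrete, here is a minimal, self-contained sketch of a provenance graph. It is not CMF's API: `content_id`, `Step`, and `LineageGraph` are hypothetical names invented for illustration, and the sketch assumes artifacts are small enough to hash in memory. Each recorded step bundles the code, the input and output artifacts, and the metadata into one record, and the graph can then be walked backwards from any output.

```python
import hashlib
from dataclasses import dataclass, field


def content_id(payload: bytes) -> str:
    """Content-addressed id: identical bytes always map to the same id."""
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Step:
    """One transformation: code, inputs, outputs, and metadata kept together."""
    name: str
    code_id: str
    inputs: list
    outputs: list
    metadata: dict = field(default_factory=dict)


class LineageGraph:
    """Hypothetical in-memory provenance graph (illustration only)."""

    def __init__(self):
        self.steps = []
        self.producer = {}  # artifact id -> Step that produced it

    def record(self, name, code, inputs, outputs, metadata=None):
        step = Step(
            name=name,
            code_id=content_id(code),
            inputs=[content_id(b) for b in inputs],
            outputs=[content_id(b) for b in outputs],
            metadata=dict(metadata or {}),
        )
        self.steps.append(step)
        for out in step.outputs:
            self.producer[out] = step
        return step

    def lineage(self, artifact_id):
        """Names of the steps that led to an artifact, upstream first."""
        ordered, visited = [], set()

        def visit(aid):
            if aid in visited:
                return
            visited.add(aid)
            step = self.producer.get(aid)
            if step is None:
                return  # a raw input: nothing recorded produced it
            for parent in step.inputs:
                visit(parent)
            if step.name not in ordered:
                ordered.append(step.name)

        visit(artifact_id)
        return ordered


# Example: a two-stage workflow with made-up data.
graph = LineageGraph()
raw = b"shot,signal\n1,0.5\n2,0.7\n"
clean = raw.upper()
model = b"weights:" + clean[:8]

graph.record("clean", b"def clean(x): ...", [raw], [clean], {"rows": 2})
graph.record("train", b"def train(x): ...", [clean], [model], {"epochs": 3})

trace = graph.lineage(content_id(model))
print(trace)
```

Walking back from the model artifact yields the steps that produced it, in upstream-first order. A real deployment would persist these records and store the artifacts themselves in versioned repositories rather than in memory.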

CMF integrates with tools like Git, DVC (Data Version Control), and MLMD (ML Metadata), and provides APIs for capturing metadata and linking it directly to workflow artifacts. It plays a central role in the Fusion Data Platform, where it powers provenance tracking and dataset sharing across teams and facilities.

Documentation

📚 Full CMF documentation is available at:
👉 https://hewlettpackard.github.io/cmf/


CMF makes it easy to understand where your results came from, how they were produced, and how to recreate them. Whether you're building workflows locally or sharing results across a federation, CMF provides the backbone for trust and reproducibility.