
Research Data Management (RDM) tool for computer simulations: tracking and version control of input data, code and software versions, and output data.


CADET-RDM


CADET-RDM is a Research Data Management toolbox developed at Forschungszentrum Jülich. It supports computational research projects by tracking code, data, environments, and generated results in a reproducible and shareable way.

The toolbox is domain-agnostic and can be applied to any computational project with a structured workflow.

Scope and purpose

CADET-RDM helps manage and version

  • input data
  • source code
  • configurations and metadata
  • software and environment versions
  • generated output data

The primary goal is to ensure reproducibility, traceability, and reuse of computational results by explicitly linking them to the project state that produced them.

Repository structure

A CADET-RDM project consists of two independent but coupled Git repositories:

  1. Project repository: contains source code, configuration files, documentation, and metadata required to execute the computations.

  2. Output repository: contains the results generated by running the project code, including data products, models, figures, and run-specific metadata.

Both repositories have separate Git histories and remotes. CADET-RDM provides workflows that operate on both repositories to maintain a consistent link between code and results.

Using CADET-RDM

Result tracking and reproducibility

Each execution of project code creates a new output branch that contains only the files generated by that run.

In addition, a central run history records

  • the project repository commit used for the run
  • software and environment information
  • metadata required to reproduce the result

This commit structure allows results to be reproduced and inspected without manual bookkeeping.
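
As a rough illustration of what one entry in such a run history might hold, the following Python sketch models a record with a dataclass. The class name `RunRecord`, its field names, and the JSON layout are assumptions made for illustration; they are not CADET-RDM's actual log format.

```python
import json
import platform
import sys
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical run-history entry; not CADET-RDM's real schema."""

    output_branch: str   # branch in the output repository holding the results
    project_commit: str  # commit of the project repository used for the run
    # Environment information captured at the time the record is created.
    python_version: str = field(default_factory=lambda: sys.version.split()[0])
    operating_system: str = field(default_factory=platform.system)
    # Free-form metadata needed to reproduce the run (assumed structure).
    metadata: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the record so it can be stored alongside the results."""
        return json.dumps(asdict(self), sort_keys=True)


record = RunRecord(
    output_branch="example-output-branch",
    project_commit="0123abc",
    metadata={"command": "python main.py"},
)
print(record.to_json())
```

Storing the project commit next to the output branch is what lets a result be traced back to the code that produced it.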

Interfaces

CADET-RDM can be used through

  • a command line interface (CLI), e.g. for scripted or automated bash workflows
  • a Python interface, e.g. for tracking results directly within existing Python workflows

Additionally, CADET-RDM can be used within JupyterLab with some limitations.

Detailed descriptions of commands and APIs are provided in the dedicated interface documentation.

Typical workflow

  1. Initialize or clone a CADET-RDM project
  2. Develop and commit project code
  3. Execute computations with CADET-RDM result tracking
  4. Generate versioned output branches automatically
  5. Push project and output repositories to their remotes
  6. Reuse or reference results via their output branches

Results are referenced by unique output branch names that encode the timestamp, active project branch, and project commit hash. CADET-RDM provides a local cache mechanism that allows results from previous runs or from other CADET-RDM projects to be reused as input data while preserving provenance information.
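
To make the naming idea concrete, here is a minimal Python sketch that builds a branch name from the three components mentioned above. The function name `build_output_branch_name`, the underscore separator, and the timestamp layout are assumptions for illustration; the exact format CADET-RDM uses may differ.

```python
from datetime import datetime


def build_output_branch_name(
    timestamp: datetime, project_branch: str, commit_hash: str
) -> str:
    """Combine timestamp, project branch, and short commit hash.

    The format is hypothetical and only illustrates the idea.
    """
    stamp = timestamp.strftime("%Y-%m-%d_%H-%M-%S")
    # Shorten the hash to seven characters, as Git commonly does for display.
    return f"{stamp}_{project_branch}_{commit_hash[:7]}"


name = build_output_branch_name(
    datetime(2024, 1, 31, 12, 0, 5), "main", "0123abc4567def"
)
print(name)  # 2024-01-31_12-00-05_main_0123abc
```

Because all three components are part of the name, a result can be matched to its project state from the branch name alone.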

Getting started

The full documentation is available at https://cadet-rdm.readthedocs.io

It includes installation instructions, usage guides for the different interfaces, and detailed descriptions of repository and result management workflows.
