Version Control

The practice of managing and tracking changes to AI code, models, and datasets over time to ensure reproducibility and auditability.

Definition

Involves use of systems like Git for code, DVC or LakeFS for data, and model‐registry tools for artifact versions. Every change—feature-engineering scripts, hyperparameter settings, dataset snapshots, trained model binaries—is tagged and documented. Version control enables rollback, branch management for experimentation, and full traceability of how any production model was derived.

Real-World Example

A financial‐services team stores its preprocessing code in Git, tracks raw and cleaned datasets via DVC, and registers each trained model in MLflow with its parameters and input data version. When anomalies appear, they can reproduce any previous model version exactly, aiding both debugging and audit compliance.