Model Validation

The evaluation activities (e.g., testing against hold-out data and stress scenarios) that confirm an AI model meets its intended purpose and performance criteria.

Definition

A set of pre-deployment checks, including back-testing on unseen data, stress-testing under extreme or adversarial conditions, fairness and calibration assessments, and sensitivity analyses. Validation reports document the methodologies, results, and any limitations. Governance requires independent validators, clear validation criteria, and formal sign-off before production release.
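
A minimal sketch of such a validation harness, assuming a scikit-learn-style classifier and synthetic stand-in data; every threshold value below is a hypothetical placeholder for an organization's own validation criteria:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)

# Synthetic stand-in data: features, binary labels, and a group attribute
X = rng.normal(size=(5000, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=5000) > 0).astype(int)
group = (X[:, 1] > 0).astype(int)  # crude proxy for a protected attribute

# Hold-out split: validation metrics are computed on unseen data only
split = 4000
model = LogisticRegression().fit(X[:split], y[:split])
p = model.predict_proba(X[split:])[:, 1]
y_hold, g_hold = y[split:], group[split:]

# Back-test: discrimination (AUC) and calibration (Brier score)
auc = roc_auc_score(y_hold, p)
brier = brier_score_loss(y_hold, p)

# Fairness assessment: gap in mean predicted score between groups
gap = abs(p[g_hold == 1].mean() - p[g_hold == 0].mean())

# Sensitivity analysis: perturb one input and measure prediction drift
X_shift = X[split:].copy()
X_shift[:, 0] += 0.1
drift = np.abs(model.predict_proba(X_shift)[:, 1] - p).mean()

# Clear validation criteria (hypothetical thresholds) and formal verdict
checks = {
    "AUC >= 0.75": auc >= 0.75,
    "Brier <= 0.20": brier <= 0.20,
    "fairness gap <= 0.10": gap <= 0.10,
    "sensitivity drift <= 0.05": drift <= 0.05,
}
for name, passed in checks.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
print("Sign-off:", "APPROVED" if all(checks.values()) else "REJECTED")
```

In practice these checks would be run by an independent validation team, and the printed verdict would be replaced by a documented validation report recording methodology, results, and limitations.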

Real-World Example

A credit-scoring model is validated by an independent team: they test it on a two-month hold-out set, simulate economic-downturn scenarios, evaluate fairness across income brackets, and certify that performance and fairness metrics meet the bank’s policy thresholds before approving it for live use.
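
A toy illustration of the bracket-level fairness check and the downturn stress test from this example; the data, approval cutoff, stress multiplier, and policy bands are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical hold-out data: predicted default probabilities, realized
# outcomes, and an income bracket for each applicant
p_default = rng.beta(2, 8, size=2000)
defaulted = rng.random(2000) < p_default
bracket = rng.choice(["low", "mid", "high"], size=2000)

CUTOFF = 0.15  # hypothetical approval cutoff on predicted default risk

# Fairness across income brackets: approval rates should stay within a band
rates = {b: (p_default[bracket == b] < CUTOFF).mean()
         for b in ("low", "mid", "high")}
fair_ok = max(rates.values()) - min(rates.values()) <= 0.20  # hypothetical band

# Downturn scenario: scale default risk up and recheck that realized losses
# among applicants the stressed model would still approve stay within appetite
p_stressed = np.clip(p_default * 1.5, 0.0, 1.0)
approved = p_stressed < CUTOFF
loss_rate = defaulted[approved].mean()
stress_ok = loss_rate <= 0.10  # hypothetical risk-appetite threshold

print("Approval rates by bracket:", {k: round(v, 3) for k, v in rates.items()})
print("Fairness check:", "PASS" if fair_ok else "FAIL")
print(f"Stressed loss rate {loss_rate:.3f}:", "PASS" if stress_ok else "FAIL")
```

A real validation would use an actual economic scenario model rather than a flat multiplier, but the structure is the same: each check compares a measured quantity against a pre-agreed policy threshold before sign-off.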