Data Quality
The condition of data, judged by factors such as accuracy, completeness, reliability, and relevance. High data quality is crucial for effective AI model performance.
Definition
A multidimensional measure of data covering correctness (error-free values), completeness (no missing values), consistency (uniform formats), timeliness (up-to-date records), and relevance (fit for purpose). Data-quality programs deploy automated validation rules, cleansing pipelines, and quality dashboards, and define escalation procedures for when metrics fall below agreed thresholds.
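As a rough illustration of automated validation rules, the sketch below scores a batch of records on four of the dimensions above and lists the metrics that fall below a threshold. The field names (`income`, `application_date`), the 30-day freshness window, and all function names are assumptions made for this example and do not come from any particular tool. Relevance is left out because it usually needs human judgment rather than a rule.

```python
from datetime import date
from typing import Optional


def parse_iso_date(value) -> Optional[date]:
    """Return a date for an ISO 8601 date string such as 2024-05-20, else None."""
    if not isinstance(value, str):
        return None
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


def is_complete(record: dict) -> bool:
    # Completeness: the (hypothetical) income field is present and non-empty.
    return record.get("income") not in (None, "")


def is_correct(record: dict) -> bool:
    # Correctness: income is a non-negative number.
    income = record.get("income")
    return (
        isinstance(income, (int, float))
        and not isinstance(income, bool)
        and income >= 0
    )


def is_consistent(record: dict) -> bool:
    # Consistency: the date is stored in one uniform format.
    return parse_iso_date(record.get("application_date")) is not None


def is_timely(record: dict, today: date, max_age_days: int = 30) -> bool:
    # Timeliness: the record is at most max_age_days old (assumed window).
    parsed = parse_iso_date(record.get("application_date"))
    return parsed is not None and 0 <= (today - parsed).days <= max_age_days


def quality_metrics(records: list, today: date) -> dict:
    """Return the share of records passing each rule, keyed by dimension."""
    total = len(records)
    if total == 0:
        return {}
    checks = {
        "completeness": is_complete,
        "correctness": is_correct,
        "consistency": is_consistent,
        "timeliness": lambda r: is_timely(r, today),
    }
    return {
        name: sum(1 for r in records if check(r)) / total
        for name, check in checks.items()
    }


def metrics_below_threshold(metrics: dict, thresholds: dict) -> list:
    """Return the names of metrics that should be escalated, sorted by name."""
    return sorted(
        name for name, value in metrics.items()
        if value < thresholds.get(name, 0.0)
    )
```

In practice the resulting metrics would feed a dashboard, and the list returned by `metrics_below_threshold` would drive whatever escalation procedure the team has defined.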
Real-World Example
A credit-risk team tracks data-quality metrics for income and employment fields in loan applications. When missing-value rates exceed 2%, an automated alert triggers a review: data engineers correct ETL scripts and notify frontline staff to enforce mandatory fields, restoring data completeness before model retraining.
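The missing-value check in this example could be sketched as follows. This is a minimal illustration, not the team's actual pipeline: the field names `income` and `employment_status`, the function names, and the choice to treat `None` and empty strings as missing are all assumptions. Only the 2% threshold comes from the example.

```python
from typing import Dict, List

MISSING_RATE_THRESHOLD = 0.02  # the 2% limit from the example above


def missing_rate(records: List[dict], field: str) -> float:
    """Return the fraction of records where field is absent, None, or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def fields_needing_review(
    records: List[dict],
    fields: List[str],
    threshold: float = MISSING_RATE_THRESHOLD,
) -> Dict[str, float]:
    """Return {field: missing rate} for every field whose rate exceeds threshold."""
    rates = {field: missing_rate(records, field) for field in fields}
    return {field: rate for field, rate in rates.items() if rate > threshold}
```

A scheduled job might call `fields_needing_review(applications, ["income", "employment_status"])` and raise an alert whenever the result is non-empty, which would hold model retraining until the review is complete.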