Hashing

The process of converting data into a fixed-size string of characters, used for data integrity checks and privacy-preserving record linkage.

Definition

A one-way cryptographic transformation that maps inputs of any length to fixed-length digests. In AI governance, hashing ensures data has not been tampered with (integrity checks), enables secure linkage across datasets without revealing raw values (privacy- preserving joins), and supports audit trails. Governance must select strong hash algorithms (SHA-2/3), manage salt values, and rotate algorithms when vulnerabilities emerge.

Real-World Example

A healthcare analytics platform hashes patient IDs with a per-project salt before joining data from clinics and labs. Analysts link records using the hash but never see actual IDs. Integrity checks compare file hashes to expected values before any data pipeline run, ensuring no unauthorized changes to sensitive datasets.