Feature Engineering

Creating, selecting or transforming raw dataset attributes into features that improve the performance of machine learning models.

Definition

The art and science of converting domain data (timestamps, text, sensor readings) into meaningful inputs—creating polynomial features, encoding categorical variables, constructing interaction terms, or normalizing distributions. Good feature engineering reduces model complexity, enhances interpretability, and can embed domain knowledge. Governance needs to track feature lineage, validate transformations, and assess feature-drift impacts on model fairness.

Real-World Example

A retail analytics team engineers “days_since_last_purchase” from transaction dates and “average_spend_per_visit” from sales logs. These features significantly boost the customer-churn model’s recall from 70% to 85%, and clear documentation ensures data-governance audits can trace each engineered field back to raw sources.