Chapter 29: Data & Model Drift Detection
Drift detection concepts and tools are covered in Chapter 28: Model Monitoring.
Additional tools for statistical drift detection:
Evidently AI (Detailed)
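```python
# These import paths follow Evidently's Report/metric_preset API;
# newer releases may organize the modules differently.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, TargetDriftPreset

# train_df and production_df are pandas DataFrames with matching columns.
report = Report(metrics=[DataDriftPreset(), TargetDriftPreset()])
report.run(reference_data=train_df, current_data=production_df)
report.save_html("drift_report.html")

# Programmatic access: locate the metric that carries per-column results
# instead of relying on its position in the metrics list.
results = report.as_dict()
drift_by_columns = next(
    m["result"]["drift_by_columns"]
    for m in results["metrics"]
    if "drift_by_columns" in m["result"]
)
drifted_features = [
    name for name, col in drift_by_columns.items() if col["drift_detected"]
]
print(f"Drifted features: {drifted_features}")
```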
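When the report dictionary feeds an automated check, it can help to tolerate missing keys rather than index into the structure directly. The sketch below is one way to do that. The helper name `drifted_columns` is hypothetical, and `sample_results` is a hand-built stand-in for `report.as_dict()`: the `drift_score` key and the metric names are assumptions about the output shape, so verify them against the Evidently version you run.

```python
def drifted_columns(results):
    """Map each drifted column to its drift score, skipping metrics
    that carry no per-column results."""
    flagged = {}
    for metric in results.get("metrics", []):
        by_column = metric.get("result", {}).get("drift_by_columns")
        if not by_column:
            continue
        for name, info in by_column.items():
            if info.get("drift_detected"):
                flagged[name] = info.get("drift_score")
    return flagged


# Hand-built illustration only; real report output has more fields.
sample_results = {
    "metrics": [
        {"metric": "DatasetDriftMetric", "result": {"dataset_drift": True}},
        {
            "metric": "DataDriftTable",
            "result": {
                "drift_by_columns": {
                    "age": {"drift_detected": True, "drift_score": 0.01},
                    "income": {"drift_detected": False, "drift_score": 0.40},
                }
            },
        },
    ]
}

flagged = drifted_columns(sample_results)
if flagged:
    print(f"Drift detected in: {sorted(flagged)}")
```

A pipeline step could raise an alert or fail the run when the returned dictionary is non-empty.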
Statistical Tests Used
| Test | When to Use | Drift Signal |
|---|---|---|
| KS Test | Continuous features | p-value < 0.05 = drift |
| Chi-squared | Categorical features | p-value < 0.05 = drift |
| PSI (Population Stability Index) | Binned continuous or categorical features | PSI > 0.2 = significant drift |
| KL Divergence | Asymmetric divergence between two distributions | 0 = identical; higher = more drift (unbounded) |
| Jensen-Shannon | Symmetric, bounded version of KL | 0 = identical, 1 = maximally different (log base 2) |
| Wasserstein | Distance between distributions of continuous features | 0 = identical; higher = more drift (in the feature's units) |