29 Drift Detection

Chapter 29: Data & Model Drift Detection

Drift detection concepts and tools are covered in Chapter 28: Model Monitoring.

Additional tools for statistical drift detection:

Evidently AI (Detailed)

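# Written against Evidently's legacy Report API (the evidently.report module);
# newer releases may expose these classes under different import paths.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, TargetDriftPreset

# train_df and production_df are pandas DataFrames with the same columns.
# TargetDriftPreset needs a target or prediction column in both.
report = Report(metrics=[DataDriftPreset(), TargetDriftPreset()])
report.run(reference_data=train_df, current_data=production_df)
report.save_html("drift_report.html")

# Programmatic access: locate the metric whose result holds per-column drift,
# rather than assuming it sits at a fixed index in the metrics list.
results = report.as_dict()
column_drift = next(
    m["result"]["drift_by_columns"]
    for m in results["metrics"]
    if "drift_by_columns" in m.get("result", {})
)
drifted_features = [
    name for name, stats in column_drift.items()
    if stats["drift_detected"]
]
print(f"Drifted features: {drifted_features}")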

Statistical Tests Used

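| Test | When to use | Drift signal |
|---|---|---|
| Kolmogorov-Smirnov (KS) test | Continuous features | p-value < 0.05 = drift |
| Chi-squared test | Categorical features | p-value < 0.05 = drift |
| Population Stability Index (PSI) | Binned or categorical features | PSI > 0.2 = significant drift |
| KL divergence | Asymmetric divergence between distributions | Higher = more drift (0 = identical) |
| Jensen-Shannon divergence | Symmetric version of KL | 0 = identical, 1 = maximally different (base-2 logarithm) |
| Wasserstein distance | Distance between distributions | Higher = more drift (in the feature's units) |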
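
The binned measures in the table (PSI, KL divergence, Jensen-Shannon divergence) can be computed by hand. The following is a minimal sketch using only the Python standard library. It is not how Evidently computes them. The helper names (`binned_shares`, `psi`, `kl_divergence`, `js_divergence`), the bin edges, the synthetic data, and the `EPS` floor for empty bins are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

EPS = 1e-6  # assumed floor so that empty bins never produce log(0)


def binned_shares(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Share of values per bin; out-of-range values fall into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        # Advance while v lies at or beyond the right edge of the current bin
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    floored = [max(c / len(values), EPS) for c in counts]
    total = sum(floored)
    return [s / total for s in floored]  # renormalize so shares sum to 1


def psi(ref: Sequence[float], cur: Sequence[float]) -> float:
    """Population Stability Index over matching bins."""
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in bits; note that KL(p || q) != KL(q || p) in general."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits; symmetric and bounded by [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Synthetic data: reference spread over [0, 1), current shifted into [0.5, 1)
reference = [i / 100 for i in range(100)]
current = [0.5 + i / 200 for i in range(100)]
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

ref_shares = binned_shares(reference, edges)
cur_shares = binned_shares(current, edges)

print(f"PSI: {psi(ref_shares, cur_shares):.3f}")
print(f"KL(ref || cur): {kl_divergence(ref_shares, cur_shares):.3f}")
print(f"KL(cur || ref): {kl_divergence(cur_shares, ref_shares):.3f}")
print(f"JS: {js_divergence(ref_shares, cur_shares):.3f}")
```

The two KL values differ because KL divergence is asymmetric, which is why the Jensen-Shannon form is often preferred for thresholds. For the remaining rows of the table, SciPy provides `scipy.stats.ks_2samp`, `scipy.stats.chi2_contingency`, and `scipy.stats.wasserstein_distance`.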
