Chapter 09: ClearML & Neptune.ai🔗
"More experiment tracking choices — ClearML for self-hosted control, Neptune for enterprise collaboration."
9.1 ClearML🔗
ClearML is a full open-source MLOps platform — experiment tracking, data versioning, pipeline orchestration, and model serving in one tool.
Key Features🔗
ClearML SUITE:
├── ClearML Experiment → Experiment tracking (like MLflow)
├── ClearML Data → Dataset versioning (like DVC)
├── ClearML Pipelines → Workflow orchestration
├── ClearML Serving → Model deployment
└── ClearML Agent → Remote execution on any machine
Quick Start🔗
# pip install clearml
from clearml import Task
# Initialize task (auto-logs everything)
task = Task.init(
    project_name="Churn Prediction",
    task_name="GBM Experiment v3",
    task_type=Task.TaskTypes.training,
)
# Auto-logging captures: hyperparams, metrics, stdout, git diff, requirements.txt
# Manual logging
logger = task.get_logger()
logger.report_scalar("accuracy", "train", value=0.91, iteration=1)
logger.report_scalar("accuracy", "test", value=0.88, iteration=1)
logger.report_text("Model trained successfully")
# Connect config dict
config = {"n_estimators": 200, "lr": 0.05}
task.connect(config)
# Log model
task.upload_artifact("model", artifact_object="models/model.pkl")
# Close
task.close()
ClearML Agent (Remote Execution)🔗
# Install agent on GPU machine
pip install clearml-agent
# Start agent (picks up queued experiments)
clearml-agent daemon --queue default --detached
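The agent model is pull-based: you enqueue an experiment, and whichever machine runs an agent on that queue pulls it, recreates the environment, and executes it. A minimal pure-Python sketch of the pull loop (illustrative only — not ClearML internals, and the names here are hypothetical):

```python
from collections import deque

# Toy stand-ins for a ClearML queue and agent (illustration only).
queue = deque()

def enqueue(task_name, fn):
    """Client side: push a task onto the queue."""
    queue.append((task_name, fn))

def agent_step():
    """Agent side: pull the oldest task and execute it."""
    if not queue:
        return None
    name, fn = queue.popleft()
    return (name, fn())

# The client enqueues work; an agent on another machine would pull it.
enqueue("gbm-v3", lambda: "trained")
print(agent_step())  # ('gbm-v3', 'trained')
```

The key property is that the client never needs access to the GPU machine — it only needs to reach the queue, and idle agents drain it in FIFO order.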
9.2 Neptune.ai🔗
Neptune.ai is a metadata store for ML experiments — it logs runs, compares them side by side, and integrates with most popular ML frameworks and libraries.
# pip install neptune
import neptune
run = neptune.init_run(
    project="my-org/churn-prediction",
    api_token="NEPTUNE_API_TOKEN",  # or omit and set the NEPTUNE_API_TOKEN env var
)
# Log parameters
run["parameters"] = {
    "n_estimators": 200,
    "learning_rate": 0.05,
}
# Log metrics
run["train/accuracy"] = 0.91
run["test/accuracy"] = 0.88
run["test/f1"] = 0.85
# Log files
run["model/pickled"].upload("models/model.pkl")
run["reports/confusion_matrix"].upload("reports/confusion.png")
# Stop
run.stop()
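Neptune treats slash-separated keys as a hierarchical namespace, so `train/accuracy` and `test/accuracy` group under `train` and `test` folders in the UI. A pure-Python sketch of that path-to-tree mapping (illustrative only, not Neptune's implementation):

```python
def build_namespace(flat):
    """Expand slash-separated keys into a nested dict, the way
    Neptune groups logged fields into folders in its UI."""
    tree = {}
    for path, value in flat.items():
        node = tree
        *parents, leaf = path.split("/")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree

metrics = {"train/accuracy": 0.91, "test/accuracy": 0.88, "test/f1": 0.85}
print(build_namespace(metrics))
# {'train': {'accuracy': 0.91}, 'test': {'accuracy': 0.88, 'f1': 0.85}}
```

This is why consistent key prefixes matter: every run that logs under `test/` gets the same folder structure, which makes cross-run comparison straightforward.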
9.3 Tool Selection Guide🔗
| Situation | Recommended Tool |
|---|---|
| Open source, self-hosted needed | ClearML or MLflow |
| Best UI and collaboration | W&B |
| Enterprise compliance | Neptune.ai |
| GCP-native | Vertex AI Experiments |
| Starting out, free | MLflow (self-hosted) or W&B free tier |
Next → Chapter 10: AutoML & HPO