Chapter 05: Jenkins for MLOps
"Jenkins is the automation engine that glues your ML pipeline together."
5.1 What is Jenkins?
Jenkins is an open-source automation server written in Java. It enables developers and ML engineers to build, test, and deploy software and ML models automatically.
Key Features
- 1800+ plugins: integrates with almost anything
- Distributed: run jobs on multiple agents
- Pipeline-as-Code: Jenkinsfile in your repo
- Notifications: Slack, Email, PagerDuty
- Triggers: Git hooks, cron, webhooks
- Web UI: visual pipeline monitoring
5.2 Jenkins Architecture
JENKINS ARCHITECTURE

      +---------------------+
      |   JENKINS MASTER    |  <- Web UI, job scheduling, config
      |    (Controller)     |
      +----------+----------+
                 |
        +--------+--------+
        |        |        |
        v        v        v
    +------+ +------+ +------+
    |Agent | |Agent | |Agent |  <- Workers that run jobs
    |  1   | |  2   | |  3   |
    |Linux | |Docker| |Cloud |
    +------+ +------+ +------+

- Master/Controller: Manages the web UI, schedules builds, coordinates agents
- Agent: Machine that executes the actual pipeline steps
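The split between controller and agents can be pictured with a toy model. The sketch below is not how Jenkins is implemented; it only illustrates the idea that the controller holds a queue and hands each job to an idle agent carrying the requested label. The `Agent` and `Controller` classes and the label names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A worker machine; labels describe what it can run (hypothetical names)."""
    name: str
    labels: set
    busy: bool = False


@dataclass
class Controller:
    """Keeps the build queue and assigns jobs; it does not run steps itself."""
    agents: list = field(default_factory=list)
    queue: list = field(default_factory=list)

    def submit(self, job, label):
        self.queue.append((job, label))

    def dispatch(self):
        """Assign each queued job to the first idle agent with a matching label."""
        assigned, waiting = [], []
        for job, label in self.queue:
            agent = next(
                (a for a in self.agents if not a.busy and label in a.labels), None
            )
            if agent is None:
                waiting.append((job, label))  # stays queued until an agent frees up
            else:
                agent.busy = True
                assigned.append((job, agent.name))
        self.queue = waiting
        return assigned
```

With one `linux` agent and one `docker` agent, submitting two `docker` jobs leaves the second one in the queue until the first finishes, which is the behaviour you see in the Jenkins build queue when all matching executors are busy.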
5.3 Jenkins Setup (Docker)
Easiest way to run Jenkins locally:
# Pull and run Jenkins
docker run -d \
--name jenkins \
-p 8080:8080 \
-p 50000:50000 \
-v jenkins_home:/var/jenkins_home \
jenkins/jenkins:lts
# Get initial admin password
docker exec jenkins cat /var/jenkins_home/secrets/initialAdminPassword
# Access Jenkins at: http://localhost:8080
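Jenkins takes a little while to start inside the container, so scripts that talk to it right after `docker run` may need to wait. The following is a minimal sketch of a readiness check, assuming the login page at `http://localhost:8080/login` answers with HTTP 200 once startup has finished; the function names, attempt count, and delay are illustrative choices, not part of Jenkins.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. an error status while Jenkins is still starting
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def wait_for_jenkins(url="http://localhost:8080/login", attempts=30, delay=2.0,
                     probe=http_status, sleep=time.sleep):
    """Poll url until it answers 200; return True if ready, False otherwise."""
    for _ in range(attempts):
        if probe(url) == 200:
            return True
        sleep(delay)
    return False
```

Calling `wait_for_jenkins()` before any automated setup step blocks until the UI responds or the attempts run out. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.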
5.4 Jenkinsfile: Pipeline as Code
A Jenkinsfile defines your CI/CD pipeline in code, stored in your Git repo.
Two Syntaxes
Declarative (recommended, simpler):
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'make build'
            }
        }
    }
}
Scripted (flexible, plain Groovy):
node {
    stage('Build') {
        sh 'make build'
    }
}
5.5 Complete MLOps Jenkinsfile
// Jenkinsfile: MLOps CI/CD Pipeline
pipeline {
    agent {
        docker {
            // Note: python:3.10-slim does not ship the docker, gcloud, or kubectl CLIs
            // used in later stages; in practice use an image that includes them.
            image 'python:3.10-slim'
            args '-v /var/run/docker.sock:/var/run/docker.sock'
        }
    }
    environment {
        GCR_REGISTRY = 'gcr.io/my-gcp-project'
        MODEL_NAME = 'ml-model'
        ACCURACY_THRESHOLD = '0.85'
    }
    stages {
        stage('Setup') {
            steps {
                sh 'pip install -r requirements.txt'
            }
        }
        stage('Lint & Format') {
            steps {
                sh 'flake8 src/ --max-line-length=100'
                sh 'black --check src/'
            }
        }
        stage('Unit Tests') {
            steps {
                sh 'pytest tests/unit/ -v --junitxml=reports/unit_tests.xml'
            }
            post {
                always {
                    // Publish test results even when tests fail
                    junit 'reports/unit_tests.xml'
                }
            }
        }
        stage('Data Validation') {
            steps {
                sh 'python src/validate_data.py'
            }
        }
        stage('Train Model') {
            steps {
                sh 'python src/train.py --config config/train_config.yaml'
                archiveArtifacts artifacts: 'models/*.pkl', fingerprint: true
            }
        }
        stage('Evaluate Model') {
            steps {
                sh 'python src/evaluate.py'
                // Quality gate: a non-zero exit from python fails the stage.
                // Single-quoted Groovy string, so the shell expands the variables.
                sh '''
                    ACCURACY=$(python -c "import json; print(json.load(open('metrics/results.json'))['accuracy'])")
                    echo "Accuracy: $ACCURACY"
                    python -c "import sys; sys.exit(0 if float('$ACCURACY') >= float('$ACCURACY_THRESHOLD') else 'Accuracy below threshold')"
                    echo "Model quality check PASSED"
                '''
            }
        }
        stage('Build Docker Image') {
            steps {
                sh "docker build -t ${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} ."
                sh "docker tag ${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} ${GCR_REGISTRY}/${MODEL_NAME}:latest"
            }
        }
        stage('Push to Registry') {
            steps {
                withCredentials([file(credentialsId: 'gcp-sa-key', variable: 'GCP_KEY')]) {
                    // Single quotes: let the shell read the secret instead of Groovy interpolating it
                    sh 'gcloud auth activate-service-account --key-file="$GCP_KEY"'
                    sh 'gcloud auth configure-docker --quiet'
                    sh "docker push ${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER}"
                    sh "docker push ${GCR_REGISTRY}/${MODEL_NAME}:latest"
                }
            }
        }
        stage('Deploy to Staging') {
            steps {
                sh "kubectl set image deployment/ml-model ml-model=${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} -n staging"
                sh "kubectl rollout status deployment/ml-model -n staging"
            }
        }
        stage('Integration Tests') {
            steps {
                sh 'pytest tests/integration/ -v'
            }
        }
        stage('Deploy to Production') {
            when {
                // Evaluate the branch condition before prompting for approval
                beforeInput true
                // The branch condition applies to multibranch pipelines
                branch 'main'
            }
            input {
                message "Deploy to production?"
                ok "Yes, deploy!"
            }
            steps {
                sh "kubectl set image deployment/ml-model ml-model=${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} -n production"
                sh "kubectl rollout status deployment/ml-model -n production"
            }
        }
    }
    post {
        success {
            slackSend(color: 'good', message: "ML Pipeline SUCCESS: Build #${BUILD_NUMBER}")
        }
        failure {
            slackSend(color: 'danger', message: "ML Pipeline FAILED: Build #${BUILD_NUMBER}")
            emailext(
                subject: "FAILED: ML Pipeline Build ${BUILD_NUMBER}",
                body: "Check Jenkins: ${BUILD_URL}",
                to: "mlteam@company.com"
            )
        }
        always {
            cleanWs()
        }
    }
}
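The accuracy check in the Evaluate Model stage is easier to maintain and test as a small script than as inline `python -c` code. Below is a minimal sketch of such a gate, assuming `metrics/results.json` contains an `accuracy` key as in the pipeline above; the script name `check_accuracy.py` and the function names are hypothetical.

```python
import json
import sys


def passes_gate(metrics, threshold, key="accuracy"):
    """Return True when metrics[key] meets or exceeds threshold."""
    return float(metrics[key]) >= float(threshold)


def main(argv):
    """Usage: check_accuracy.py METRICS_JSON THRESHOLD. Returns a process exit code."""
    if len(argv) != 3:
        print("usage: check_accuracy.py METRICS_JSON THRESHOLD", file=sys.stderr)
        return 2
    with open(argv[1]) as fh:
        metrics = json.load(fh)
    ok = passes_gate(metrics, argv[2])
    verdict = "PASSED" if ok else "FAILED"
    print(f"accuracy={metrics['accuracy']} threshold={argv[2]} -> {verdict}")
    return 0 if ok else 1  # a non-zero exit code fails the Jenkins stage
```

To call it from a stage, end the file with the usual `if __name__ == "__main__": sys.exit(main(sys.argv))` and run something like `python check_accuracy.py metrics/results.json "$ACCURACY_THRESHOLD"`. Because `sh` steps fail on a non-zero exit code, the gate needs no extra Groovy logic.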
5.6 Jenkins Pipeline Visualization
Stage View (Jenkins Blue Ocean):
| Stage | Status | Duration |
|---|---|---|
| Setup | Passed | 2s |
| Lint | Passed | 5s |
| Tests | Passed | 30s |
| Data Validation | Passed | 10s |
| Train | Passed | 5min |
| Eval | Passed | 15s |
| Docker | Passed | 45s |
| Push | Passed | 20s |
| Deploy | Passed | 30s |
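The stage view is mostly useful for spotting where pipeline time goes. As a small worked example using the durations shown above (illustrative figures from this chapter, not measurements), the snippet adds them up and finds the slowest stage:

```python
# Stage durations in seconds, taken from the stage view above.
stage_seconds = {
    "Setup": 2, "Lint": 5, "Tests": 30, "Data Validation": 10, "Train": 5 * 60,
    "Eval": 15, "Docker": 45, "Push": 20, "Deploy": 30,
}

total = sum(stage_seconds.values())
slowest = max(stage_seconds, key=stage_seconds.get)
share = stage_seconds[slowest] / total

print(f"total: {total // 60}m {total % 60}s")            # total: 7m 37s
print(f"slowest: {slowest} ({share:.0%} of total)")      # slowest: Train (66% of total)
```

Training accounts for roughly two thirds of the run, so it is the first place to look when the pipeline feels slow, for example by caching data or training on a smaller sample for pull-request builds.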
5.7 Jenkins Triggers
// Trigger types in a Jenkinsfile.
// A declarative pipeline takes a single triggers { } block; the variants below are
// shown separately for clarity and can be combined inside one block.
// 1. Poll SCM: check the repository for changes about every 5 minutes
triggers {
    pollSCM('H/5 * * * *')
}
// 2. Webhook: GitHub pushes trigger a build immediately (preferred)
// Configure in GitHub: Settings > Webhooks > Add webhook, payload URL http://jenkins:8080/github-webhook/
// 3. Cron: run nightly retraining at 2 am
triggers {
    cron('0 2 * * *')
}
// 4. Upstream: trigger when another job finishes successfully
triggers {
    upstream(upstreamProjects: 'data-pipeline-job', threshold: hudson.model.Result.SUCCESS)
}
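The `H` in `H/5 * * * *` stands for a hash of the job name: instead of every job polling at minutes 0, 5, 10, and so on, each job gets its own stable offset so the load is spread out. The sketch below illustrates that idea only. It uses CRC32 as a stand-in hash, which is an assumption for demonstration and not the function Jenkins uses internally, so the actual minutes Jenkins picks will differ.

```python
import zlib


def hashed_minutes(job_name, step=5):
    """Minutes of the hour at which a job with an 'H/step' minute field might run.

    Illustration only: the offset is derived from CRC32 of the job name,
    which is NOT the hash Jenkins uses internally.
    """
    offset = zlib.crc32(job_name.encode("utf-8")) % step
    return list(range(offset, 60, step))


def plain_minutes(step=5):
    """Minutes matched by a plain '*/step' field: always starts at minute 0."""
    return list(range(0, 60, step))
```

For any job name, `hashed_minutes` returns twelve minutes spaced five apart, starting somewhere between 0 and 4, and always the same ones for the same name. `plain_minutes()` shows the contrast: with `*/5` every job would fire at exactly the same minutes.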
5.8 Jenkins vs GitHub Actions
| Feature | Jenkins | GitHub Actions |
|---|---|---|
| Hosting | Self-hosted | GitHub-managed |
| Setup | Complex | Simple (YAML) |
| Plugins | 1800+ | Marketplace Actions |
| Cost | Free (infra cost) | Free tier available |
| Customization | Very high | High |
| Best for | Enterprise, complex pipelines | GitHub-native projects |