
Chapter 05: Jenkins for MLOps

"Jenkins is the automation engine that glues your ML pipeline together."


5.1 What is Jenkins?

Jenkins is an open-source automation server written in Java. It enables developers and ML engineers to build, test, and deploy software and ML models automatically.

Key Features

┌──────────────────────────────────────────────────────┐
│                 JENKINS FEATURES                     │
│                                                      │
│  🔌 1800+ Plugins    → integrates with anything      │
│  🌍 Distributed      → run jobs on multiple agents   │
│  📋 Pipeline-as-Code → Jenkinsfile in your repo      │
│  🔔 Notifications    → Slack, Email, PagerDuty       │
│  🔄 Triggers         → Git hooks, cron, webhooks     │
│  🖥️  Web UI          → visual pipeline monitoring    │
└──────────────────────────────────────────────────────┘

5.2 Jenkins Architecture

┌──────────────────────────────────────────────────────────────────┐
│                    JENKINS ARCHITECTURE                          │
│                                                                  │
│   ┌─────────────────────┐                                        │
│   │ JENKINS CONTROLLER  │ ← Web UI, Job Scheduling, Config       │
│   │ (formerly "master") │                                        │
│   └──────────┬──────────┘                                        │
│              │                                                   │
│     ┌────────┼────────┐                                          │
│     ▼        ▼        ▼                                          │
│  ┌──────┐ ┌──────┐ ┌──────┐                                      │
│  │Agent │ │Agent │ │Agent │  ← Workers that run jobs             │
│  │  1   │ │  2   │ │  3   │                                      │
│  │Linux │ │Docker│ │Cloud │                                      │
│  └──────┘ └──────┘ └──────┘                                      │
└──────────────────────────────────────────────────────────────────┘
  • Controller (formerly called master): serves the web UI, schedules builds, and coordinates agents
  • Agent: a machine or container that executes the actual pipeline steps

5.3 Jenkins Setup (Docker)

The easiest way to run Jenkins locally is the official Docker image:

# Pull and run Jenkins
docker run -d \
  --name jenkins \
  -p 8080:8080 \
  -p 50000:50000 \
  -v jenkins_home:/var/jenkins_home \
  jenkins/jenkins:lts

# Get initial admin password
docker exec jenkins cat /var/jenkins_home/secrets/initialAdminPassword

# Access Jenkins at: http://localhost:8080
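
The container takes a little while to start, so the initial password and the web UI are not available right away. The following is a minimal Python sketch of a readiness check, assuming that `http://localhost:8080/login` answers with HTTP 200 once Jenkins is up and with an error status (typically 503) while it is still starting. The helper names `probe_status` and `wait_until_ready` are hypothetical, not part of Jenkins.

```python
import time
import urllib.error
import urllib.request


def probe_status(url, timeout=5.0):
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (assumed: 503 during startup).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar network errors.
        return 0


def wait_until_ready(url, attempts=30, delay=2.0, probe=probe_status, sleep=time.sleep):
    """Poll url until it answers 200. Return True on success, False after giving up."""
    for attempt in range(1, attempts + 1):
        if probe(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)  # wait before the next probe
    return False
```

Calling `wait_until_ready("http://localhost:8080/login")` right after `docker run` would block for up to about a minute with these defaults. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.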

5.4 Jenkinsfile: Pipeline as Code

A Jenkinsfile defines your CI/CD pipeline in code, stored in your Git repo.

Two Syntaxes

Declarative (recommended, simpler)     Scripted (flexible, Groovy)
─────────────────────────────────────  ─────────────────────────────
pipeline {                             node {
  agent any                              stage('Build') {
  stages {                                 sh 'make build'
    stage('Build') {                     }
      steps {                          }
        sh 'make build'
      }
    }
  }
}

5.5 Complete MLOps Jenkinsfile

// Jenkinsfile: MLOps CI/CD pipeline
pipeline {
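    agent {
        docker {
            // NOTE: python:3.10-slim does not include the docker, gcloud, or kubectl
            // CLIs that later stages call; in practice, use an image that adds them.
            image 'python:3.10-slim'
            // Mount the host Docker socket so `docker build` talks to the host daemon.
            args '-v /var/run/docker.sock:/var/run/docker.sock'
        }
    }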

    environment {
        GCR_REGISTRY = 'gcr.io/my-gcp-project'
        MODEL_NAME   = 'ml-model'
        ACCURACY_THRESHOLD = '0.85'
    }

    stages {

        stage('🔧 Setup') {
            steps {
                sh 'pip install -r requirements.txt'
            }
        }

        stage('🧹 Lint & Format') {
            steps {
                sh 'flake8 src/ --max-line-length=100'
                sh 'black --check src/'
            }
        }

        stage('🧪 Unit Tests') {
            steps {
                sh 'pytest tests/unit/ -v --junitxml=reports/unit_tests.xml'
            }
            post {
                always {
                    junit 'reports/unit_tests.xml'
                }
            }
        }

        stage('📊 Data Validation') {
            steps {
                sh 'python src/validate_data.py'
            }
        }

        stage('🏋️ Train Model') {
            steps {
                sh 'python src/train.py --config config/train_config.yaml'
                archiveArtifacts artifacts: 'models/*.pkl', fingerprint: true
            }
        }

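        stage('📈 Evaluate Model') {
            steps {
                sh 'python src/evaluate.py'
                // Quality gate: fail the stage if accuracy is below ACCURACY_THRESHOLD.
                // The shell (not Groovy) expands $ACCURACY_THRESHOLD from the environment.
                // Only single quotes are used inside the Python code, so the shell's
                // double-quoted string around it is not terminated early.
                sh '''
                    python -c "
import json, sys
accuracy = json.load(open('metrics/results.json'))['accuracy']
threshold = float('$ACCURACY_THRESHOLD')
print('Accuracy:', accuracy, 'Threshold:', threshold)
if accuracy < threshold:
    sys.exit('Model accuracy is below the threshold')
print('✅ Model quality check PASSED')
"
                '''
            }
        }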

        stage('🐳 Build Docker Image') {
            steps {
                sh "docker build -t ${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} ."
                sh "docker tag ${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} ${GCR_REGISTRY}/${MODEL_NAME}:latest"
            }
        }

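        stage('📤 Push to Registry') {
            steps {
                withCredentials([file(credentialsId: 'gcp-sa-key', variable: 'GCP_KEY')]) {
                    // Single quotes: the shell expands $GCP_KEY, so the secret path
                    // is never interpolated into the Groovy string.
                    sh 'gcloud auth activate-service-account --key-file="$GCP_KEY"'
                    // --quiet skips the interactive confirmation prompt.
                    sh 'gcloud auth configure-docker --quiet'
                    sh "docker push ${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER}"
                    sh "docker push ${GCR_REGISTRY}/${MODEL_NAME}:latest"
                }
            }
        }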

        stage('🚀 Deploy to Staging') {
            steps {
                sh "kubectl set image deployment/ml-model ml-model=${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} -n staging"
                sh "kubectl rollout status deployment/ml-model -n staging"
            }
        }

        stage('✅ Integration Tests') {
            steps {
                sh 'pytest tests/integration/ -v'
            }
        }

        stage('🌐 Deploy to Production') {
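            when {
                // `branch` conditions apply to multibranch Pipeline jobs.
                branch 'main'
                // Evaluate the condition before prompting, so other branches skip the approval.
                beforeInput true
            }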
            input {
                message "Deploy to production?"
                ok "Yes, deploy!"
            }
            steps {
                sh "kubectl set image deployment/ml-model ml-model=${GCR_REGISTRY}/${MODEL_NAME}:${BUILD_NUMBER} -n production"
                sh "kubectl rollout status deployment/ml-model -n production"
            }
        }
    }

    post {
        success {
            slackSend(color: 'good', message: "✅ ML Pipeline SUCCESS: Build #${BUILD_NUMBER}")
        }
        failure {
            slackSend(color: 'danger', message: "❌ ML Pipeline FAILED: Build #${BUILD_NUMBER}")
            emailext(
                subject: "FAILED: ML Pipeline Build ${BUILD_NUMBER}",
                body: "Check Jenkins: ${BUILD_URL}",
                to: "mlteam@company.com"
            )
        }
        always {
            cleanWs()
        }
    }
}
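
The quality gate in the Evaluate stage is easier to test and maintain as a small script than as inline shell. The following is a minimal Python sketch of such a script, assuming `metrics/results.json` contains a numeric `accuracy` key as in the pipeline above. The file name `check_metrics.py` and the functions `check_accuracy` and `main` are hypothetical, not part of the original project.

```python
import json
import sys
from pathlib import Path


def check_accuracy(metrics_path, threshold):
    """Return (passed, accuracy) for the metrics JSON file at metrics_path."""
    metrics = json.loads(Path(metrics_path).read_text())
    accuracy = float(metrics["accuracy"])
    return accuracy >= threshold, accuracy


def main(args):
    """Take [metrics_path, threshold] and return a process exit code.

    0 = gate passed, 1 = accuracy below threshold, 2 = wrong arguments.
    """
    if len(args) != 2:
        print("usage: check_metrics.py <metrics.json> <threshold>", file=sys.stderr)
        return 2
    threshold = float(args[1])
    passed, accuracy = check_accuracy(args[0], threshold)
    print(f"accuracy={accuracy:.4f} threshold={threshold:.4f}")
    return 0 if passed else 1
```

To use it from a stage, the script would end with `sys.exit(main(sys.argv[1:]))` under a main guard, and the stage could then run something like `sh 'python check_metrics.py metrics/results.json "$ACCURACY_THRESHOLD"'`. A non-zero exit code fails the `sh` step and therefore the stage.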

5.6 Jenkins Pipeline Visualization

Stage View (Jenkins Blue Ocean):

 Setup │ Lint  │ Tests │ Data  │ Train │ Eval  │ Docker│ Push  │ Deploy│
       │       │       │ Valid │       │       │       │       │       │
  ✅   │  ✅   │  ✅   │  ✅   │  ✅   │  ✅   │  ✅   │  ✅   │  ✅   │
  2s   │  5s   │  30s  │  10s  │  5min │  15s  │  45s  │  20s  │  30s  │

5.7 Jenkins Triggers

// Trigger types in a Jenkinsfile. A declarative pipeline takes a single
// triggers { } block, so combine the triggers you need inside one block.

// 1. Poll SCM: check the repository for changes every 5 minutes
//    (H picks a per-job minute offset so jobs do not all poll at once)
triggers {
    pollSCM('H/5 * * * *')
}

// 2. Webhook: GitHub notifies Jenkins on every push, so builds start almost immediately (preferred)
// Configure in GitHub → Settings → Webhooks → Add webhook → http://jenkins:8080/github-webhook/

// 3. Cron: run nightly retraining at 2am
triggers {
    cron('0 2 * * *')
}

// 4. Upstream: trigger when another job finishes successfully
triggers {
    upstream(upstreamProjects: 'data-pipeline-job', threshold: hudson.model.Result.SUCCESS)
}
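
The `H` in `H/5 * * * *` stands for "hash": Jenkins derives a stable per-job offset so that many jobs with the same schedule do not all fire in the same minute. The Python sketch below illustrates the idea only. It is not Jenkins' actual hashing code, and the function name `hashed_minutes` and the use of SHA-256 are assumptions made for the illustration.

```python
import hashlib


def hashed_minutes(job_name, step=5):
    """Minutes of the hour at which a job with an 'H/step' minute field might run.

    Illustration only: a stable hash of the job name picks an offset in
    [0, step), and the job then fires every `step` minutes from that offset.
    """
    digest = hashlib.sha256(job_name.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:4], "big") % step
    return list(range(offset, 60, step))


# Different job names usually land on different offsets, spreading the load.
for name in ("train-model", "data-pipeline-job", "nightly-eval"):
    print(f"{name:20s} {hashed_minutes(name)}")
```

The same job name always yields the same minutes, so each job keeps a predictable schedule while the set of jobs as a whole is staggered.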

5.8 Jenkins vs GitHub Actions

Feature         Jenkins                          GitHub Actions
──────────────  ───────────────────────────────  ──────────────────────
Hosting         Self-hosted                      GitHub-managed
Setup           Complex                          Simple (YAML)
Plugins         1800+                            Marketplace Actions
Cost            Free (infra cost)                Free tier available
Customization   Very high                        High
Best for        Enterprise, complex pipelines    GitHub-native projects

Next Chapter → 06: Docker