Chapter 03: MLOps Roles & Team Structure

"MLOps is as much about people and process as it is about technology."


3.1 The MLOps Team Landscape

Successful MLOps requires collaboration across multiple roles. Here's how they fit together:

┌────────────────────────────────────────────────────────────────────────┐
│                          MLOPS TEAM STRUCTURE                          │
│                                                                        │
│  ┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐  │
│  │  DATA SCIENTIST  │    │  ML ENGINEER     │    │  MLOPS ENGINEER  │  │
│  │                  │    │                  │    │                  │  │
│  │  Problem framing │◀──▶│  Model training  │◀──▶│  CI/CD pipelines │  │
│  │  Exploration     │    │  Optimization    │    │  Infrastructure  │  │
│  │  Prototyping     │    │ Productionization│    │  Monitoring      │  │
│  └──────────────────┘    └──────────────────┘    └──────────────────┘  │
│            │                       │                       │           │
│            ▼                       ▼                       ▼           │
│  ┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐  │
│  │  DATA ENGINEER   │    │ PLATFORM ENGINEER│    │  QA ENGINEER     │  │
│  │                  │    │                  │    │                  │  │
│  │  Data pipelines  │    │  K8s clusters    │    │  Data tests      │  │
│  │  Feature eng.    │    │  GPU infra       │    │  Model tests     │  │
│  │  Data quality    │    │  Developer tools │    │  Integration     │  │
│  └──────────────────┘    └──────────────────┘    └──────────────────┘  │
└────────────────────────────────────────────────────────────────────────┘

3.2 Role Descriptions

Data Scientist (DS)

Focus: Building accurate models
- Explores data, hypothesizes features, trains models
- Uses: Jupyter, Python, sklearn, PyTorch, Pandas
- Produces: Model prototypes, experiment results, feature ideas
- MLOps responsibility: Write clean, modular code; use experiment tracking; document experiments

Machine Learning Engineer (MLE)

Focus: Productionizing models
- Takes DS prototypes and makes them production-ready
- Uses: Python, Docker, FastAPI, MLflow, CI/CD tools
- Produces: Packaged models, serving code, tests
- MLOps responsibility: Refactor notebook code into modules; write unit tests; optimize inference

MLOps Engineer

Focus: Automating and operating the ML platform
- Builds CI/CD pipelines, manages infrastructure, sets up monitoring
- Uses: Jenkins, Kubernetes, Terraform, Prometheus, Airflow
- Produces: Pipelines, dashboards, alerting systems
- MLOps responsibility: Keep everything running reliably at scale

Data Engineer (DE)

Focus: Reliable data pipelines
- Builds ETL/ELT pipelines that feed ML training and serving
- Uses: Airflow, Spark, dbt, BigQuery, Kafka
- Produces: Clean, versioned datasets; feature pipelines
- MLOps responsibility: Ensure data quality and freshness

Platform / Infrastructure Engineer

Focus: Developer platforms and compute
- Manages Kubernetes clusters, GPU nodes, storage
- Uses: Terraform, GKE, Helm, Ansible
- Produces: Scalable, reliable compute infrastructure
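The role descriptions above can also be kept as plain data, which makes overlaps between roles easy to spot. The following is a minimal sketch, not a prescribed tool: the `Role` class and the `roles_using` helper are hypothetical names introduced here, and the tool lists are copied from the "Uses" lines above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Role:
    """One role from section 3.2 (hypothetical structure for illustration)."""
    name: str
    focus: str
    tools: Tuple[str, ...]


# Names, focus areas, and tools are taken from the role descriptions above.
ROLES = (
    Role("Data Scientist", "Building accurate models",
         ("Jupyter", "Python", "sklearn", "PyTorch", "Pandas")),
    Role("Machine Learning Engineer", "Productionizing models",
         ("Python", "Docker", "FastAPI", "MLflow", "CI/CD tools")),
    Role("MLOps Engineer", "Automating and operating the ML platform",
         ("Jenkins", "Kubernetes", "Terraform", "Prometheus", "Airflow")),
    Role("Data Engineer", "Reliable data pipelines",
         ("Airflow", "Spark", "dbt", "BigQuery", "Kafka")),
    Role("Platform / Infrastructure Engineer", "Developer platforms and compute",
         ("Terraform", "GKE", "Helm", "Ansible")),
)


def roles_using(tool: str) -> List[str]:
    """Return the names of all roles whose tool list contains `tool`.

    The comparison is case-insensitive; the result keeps the order of ROLES.
    """
    wanted = tool.lower()
    return [
        role.name
        for role in ROLES
        if any(t.lower() == wanted for t in role.tools)
    ]


if __name__ == "__main__":
    # Shared tools hint at where two roles need to coordinate.
    print(roles_using("Airflow"))    # ['MLOps Engineer', 'Data Engineer']
    print(roles_using("Terraform"))  # ['MLOps Engineer', 'Platform / Infrastructure Engineer']
```

A tool that appears under two roles (Airflow, Terraform, Python) is a likely handoff point, so it is worth agreeing on who owns its configuration.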


3.3 Team Topologies

Model 1: Embedded ML Teams

Product Team A:         Product Team B:
  DS + MLE + DE            DS + MLE + DE
  (own full stack)          (own full stack)

Shared Platform:
  MLOps Eng + Platform Eng (serve both teams)

Best for: Large orgs with many independent products

Model 2: Centralized ML Platform

                    Central ML Platform Team
                   (MLOps + Platform + DataEng)
                          │         │
            ┌─────────────┘         └─────────────┐
     DS Team A                              DS Team B
  (uses shared platform)               (uses shared platform)

Best for: Orgs just starting with MLOps, or those that need standardization

Model 3: Full-Stack ML Teams

Team:  DS + MLE + DataEng + MLOps (all roles in one team)

Best for: Startups, early-stage ML


3.4 Responsibilities Matrix (RACI)

Activity                 Data Scientist  ML Engineer  MLOps Engineer  Data Engineer
-----------------------  --------------  -----------  --------------  -------------
Define business problem  R/A             C            I               I
Data exploration         R/A             C            I               C
Feature engineering      R               A            I               C
Model training code      R               A            I               I
Unit tests for model     C               R/A          I               I
CI/CD pipeline           I               C            R/A             I
Docker packaging         C               R            A               I
K8s deployment           I               C            R/A             I
Monitoring setup         I               C            R/A             I
Data pipeline            I               I            C               R/A
Feature store mgmt       C               C            I               R/A

R=Responsible, A=Accountable, C=Consulted, I=Informed
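A RACI matrix is only useful if every activity has exactly one Accountable role. As a rough sketch, the matrix above can be encoded as a dictionary and checked for that property. The names `RACI_ROLES`, `RACI`, `accountable`, and `responsible_for` are hypothetical helpers introduced for illustration; the cell values are copied from the table.

```python
from typing import Dict, List, Tuple

# Column order of the matrix above.
RACI_ROLES = ("Data Scientist", "ML Engineer", "MLOps Engineer", "Data Engineer")

# One row per activity; each tuple follows the column order in RACI_ROLES.
RACI: Dict[str, Tuple[str, str, str, str]] = {
    "Define business problem": ("R/A", "C", "I", "I"),
    "Data exploration":        ("R/A", "C", "I", "C"),
    "Feature engineering":     ("R", "A", "I", "C"),
    "Model training code":     ("R", "A", "I", "I"),
    "Unit tests for model":    ("C", "R/A", "I", "I"),
    "CI/CD pipeline":          ("I", "C", "R/A", "I"),
    "Docker packaging":        ("C", "R", "A", "I"),
    "K8s deployment":          ("I", "C", "R/A", "I"),
    "Monitoring setup":        ("I", "C", "R/A", "I"),
    "Data pipeline":           ("I", "I", "C", "R/A"),
    "Feature store mgmt":      ("C", "C", "I", "R/A"),
}


def accountable(activity: str) -> str:
    """Return the single Accountable role for an activity.

    Raises ValueError if the row has zero or several 'A' entries.
    """
    owners = [
        role
        for role, code in zip(RACI_ROLES, RACI[activity])
        if "A" in code.split("/")
    ]
    if len(owners) != 1:
        raise ValueError(f"{activity!r} needs exactly one Accountable role, got {owners}")
    return owners[0]


def responsible_for(role: str) -> List[str]:
    """Return the activities for which `role` is marked Responsible."""
    column = RACI_ROLES.index(role)
    return [
        activity
        for activity, codes in RACI.items()
        if "R" in codes[column].split("/")
    ]


if __name__ == "__main__":
    # Validate every row, then look up two examples.
    for name in RACI:
        accountable(name)
    print(accountable("Docker packaging"))  # MLOps Engineer
    print(responsible_for("Data Engineer"))  # ['Data pipeline', 'Feature store mgmt']
```

Running the validation loop whenever the matrix changes catches rows where ownership has become ambiguous.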


3.5 Communication & Collaboration Tools

┌────────────────────────────────────────────────────┐
│             MLOPS COLLABORATION STACK              │
│                                                    │
│  💬 Communication: Slack, Teams                    │
│  📋 Project Mgmt:  Jira, Linear, GitHub Issues     │
│  📚 Documentation: Confluence, Notion, GitHub Wiki │
│  🔬 Experiments:   MLflow, W&B (shared access)     │
│  📊 Dashboards:    Grafana, Looker (shared)        │
│  💻 Code Review:   GitHub PRs with required        │
│                    reviews from MLE + MLOps        │
└────────────────────────────────────────────────────┘
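The code review rule in the stack above (required reviews from both MLE and MLOps) amounts to a simple set check. The sketch below illustrates only that logic. In practice the rule would normally be enforced through the code host's own review settings rather than custom code, and the names `REQUIRED_REVIEWER_ROLES`, `missing_review_roles`, and `can_merge` are hypothetical.

```python
from typing import Iterable, List

# Roles whose approval is required before merging, per the stack above.
REQUIRED_REVIEWER_ROLES = frozenset({"MLE", "MLOps"})


def missing_review_roles(approver_roles: Iterable[str]) -> List[str]:
    """Return the required roles that have not yet approved, sorted by name."""
    return sorted(REQUIRED_REVIEWER_ROLES - set(approver_roles))


def can_merge(approver_roles: Iterable[str]) -> bool:
    """A PR is mergeable once every required role has approved."""
    return not missing_review_roles(approver_roles)


if __name__ == "__main__":
    # A DS and an MLE approved; the MLOps review is still outstanding.
    print(missing_review_roles(["DS", "MLE"]))  # ['MLOps']
    print(can_merge(["MLE", "MLOps", "DS"]))    # True
```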

Next → Chapter 04: Git & GitHub