Chapter 03: MLOps Roles & Team Structure
"MLOps is as much about people and process as it is about technology."
3.1 The MLOps Team Landscape
Successful MLOps requires collaboration across multiple roles. Here's how they fit together:
┌─────────────────────────────────────────────────────────────────────────┐
│                          MLOPS TEAM STRUCTURE                           │
│                                                                         │
│ ┌───────────────────┐    ┌───────────────────┐    ┌───────────────────┐ │
│ │ DATA SCIENTIST    │    │ ML ENGINEER       │    │ MLOPS ENGINEER    │ │
│ │                   │    │                   │    │                   │ │
│ │ Problem framing   │───▶│ Model training    │───▶│ CI/CD pipelines   │ │
│ │ Exploration       │    │ Optimization      │    │ Infrastructure    │ │
│ │ Prototyping       │    │ Productionization │    │ Monitoring        │ │
│ └───────────────────┘    └───────────────────┘    └───────────────────┘ │
│           │                        │                        │           │
│           ▼                        ▼                        ▼           │
│ ┌───────────────────┐    ┌───────────────────┐    ┌───────────────────┐ │
│ │ DATA ENGINEER     │    │ PLATFORM ENGINEER │    │ QA ENGINEER       │ │
│ │                   │    │                   │    │                   │ │
│ │ Data pipelines    │    │ K8s clusters      │    │ Data tests        │ │
│ │ Feature eng.      │    │ GPU infra         │    │ Model tests       │ │
│ │ Data quality      │    │ Developer tools   │    │ Integration       │ │
│ └───────────────────┘    └───────────────────┘    └───────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
3.2 Role Descriptions
Data Scientist (DS)
Focus: Building accurate models
- Explores data, hypothesizes features, trains models
- Uses: Jupyter, Python, scikit-learn, PyTorch, pandas
- Produces: Model prototypes, experiment results, feature ideas
- MLOps responsibility: Write clean, modular code; use experiment tracking; document experiments
Machine Learning Engineer (MLE)
Focus: Productionizing models
- Takes DS prototypes and makes them production-ready
- Uses: Python, Docker, FastAPI, MLflow, CI/CD tools
- Produces: Packaged models, serving code, tests
- MLOps responsibility: Refactor notebook code into modules; write unit tests; optimize inference
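As a rough illustration of "refactor notebook code into modules; write unit tests", the sketch below shows a small preprocessing step pulled out of a notebook cell into a plain function, with two pytest-style tests beside it. The function name `min_max_scale` and the tests are hypothetical examples, not code from any particular project.

```python
"""Sketch: a notebook cell refactored into a testable function (hypothetical names)."""
from typing import Sequence


def min_max_scale(values: Sequence[float]) -> list[float]:
    """Scale values to the [0, 1] range; constant input maps to all zeros."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid division by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# pytest-style unit tests: plain functions whose names start with "test_".
def test_min_max_scale_bounds() -> None:
    assert min_max_scale([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_min_max_scale_constant_input() -> None:
    assert min_max_scale([3.0, 3.0]) == [0.0, 0.0]
```

Keeping the logic in a plain function means the same code can be imported by the notebook, the training pipeline, and the test suite.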
MLOps Engineer
Focus: Automating and operating the ML platform
- Builds CI/CD pipelines, manages infrastructure, sets up monitoring
- Uses: Jenkins, Kubernetes, Terraform, Prometheus, Airflow
- Produces: Pipelines, dashboards, alerting systems
- MLOps responsibility: Keep pipelines, infrastructure, and monitoring running reliably at scale
Data Engineer (DE)
Focus: Reliable data pipelines
- Builds ETL/ELT pipelines that feed ML training and serving
- Uses: Airflow, Spark, dbt, BigQuery, Kafka
- Produces: Clean, versioned datasets; feature pipelines
- MLOps responsibility: Ensure data quality and freshness
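One way to make "data quality and freshness" concrete is a small gate that runs before training or serving consumes a dataset. The sketch below is a minimal, standard-library-only illustration; `DatasetSnapshot`, `check_dataset`, and the chosen checks (maximum age, non-empty, required columns populated) are assumptions for the example, and a real pipeline would more likely rely on a dedicated data validation tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class DatasetSnapshot:
    """A loaded batch of rows plus the time it was produced (hypothetical shape)."""
    rows: list[dict]
    loaded_at: datetime  # expected to be timezone-aware


def check_dataset(
    snapshot: DatasetSnapshot,
    required_columns: list[str],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    now = now or datetime.now(timezone.utc)
    problems: list[str] = []

    # Freshness: the snapshot must not be older than the allowed age.
    age = now - snapshot.loaded_at
    if age > max_age:
        problems.append(f"stale data: age {age} exceeds {max_age}")

    # Quality: the dataset must be non-empty and required columns populated.
    if not snapshot.rows:
        problems.append("empty dataset")
    for index, row in enumerate(snapshot.rows):
        missing = [col for col in required_columns if row.get(col) is None]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
    return problems
```

A pipeline task could call `check_dataset(...)` and fail the run when the returned list is non-empty, so stale or incomplete data never reaches training.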
Platform / Infrastructure Engineer
Focus: Developer platforms and compute
- Manages Kubernetes clusters, GPU nodes, storage
- Uses: Terraform, GKE, Helm, Ansible
- Produces: Scalable, reliable compute infrastructure
3.3 Team Topologies
Model 1: Embedded ML Teams
Product Team A:        Product Team B:
DS + MLE + DE          DS + MLE + DE
(own full stack)       (own full stack)

Shared Platform:
MLOps Eng + Platform Eng (serve both teams)
Best for: Large orgs with many independent products
Model 2: Centralized ML Platform
            Central ML Platform Team
          (MLOps + Platform + DataEng)
               │                 │
        ┌──────┘                 └──────┐
    DS Team A                       DS Team B
(uses shared platform)        (uses shared platform)
Best for: Organizations that are just starting with MLOps or need standardization
Model 3: Full-Stack ML Teams
Team: DS + MLE + DataEng + MLOps (all roles in one team)
Best for: Startups, early-stage ML
3.4 Responsibilities Matrix (RACI)
| Activity | Data Scientist | ML Engineer | MLOps Engineer | Data Engineer |
|---|---|---|---|---|
| Define business problem | R/A | C | I | I |
| Data exploration | R/A | C | I | C |
| Feature engineering | R | A | I | C |
| Model training code | R | A | I | I |
| Unit tests for model | C | R/A | I | I |
| CI/CD pipeline | I | C | R/A | I |
| Docker packaging | C | R | A | I |
| K8s deployment | I | C | R/A | I |
| Monitoring setup | I | C | R/A | I |
| Data pipeline | I | I | C | R/A |
| Feature store mgmt | C | C | I | R/A |
R=Responsible, A=Accountable, C=Consulted, I=Informed
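A RACI matrix is easier to keep consistent when it is stored as data and checked automatically. The sketch below encodes the table above as a Python dictionary and applies two common RACI conventions (exactly one Accountable role and at least one Responsible role per activity). The names `ROLES`, `RACI`, `validate_raci`, and `accountable_role` are hypothetical, and the two rules are an assumption about how a team might choose to validate its matrix.

```python
ROLES = ("Data Scientist", "ML Engineer", "MLOps Engineer", "Data Engineer")

# One entry per activity; cells are in the same order as ROLES.
RACI = {
    "Define business problem": ("R/A", "C", "I", "I"),
    "Data exploration": ("R/A", "C", "I", "C"),
    "Feature engineering": ("R", "A", "I", "C"),
    "Model training code": ("R", "A", "I", "I"),
    "Unit tests for model": ("C", "R/A", "I", "I"),
    "CI/CD pipeline": ("I", "C", "R/A", "I"),
    "Docker packaging": ("C", "R", "A", "I"),
    "K8s deployment": ("I", "C", "R/A", "I"),
    "Monitoring setup": ("I", "C", "R/A", "I"),
    "Data pipeline": ("I", "I", "C", "R/A"),
    "Feature store mgmt": ("C", "C", "I", "R/A"),
}


def validate_raci(matrix: dict, roles: tuple = ROLES) -> list[str]:
    """Return rule violations; an empty list means the matrix is consistent."""
    errors: list[str] = []
    for activity, cells in matrix.items():
        if len(cells) != len(roles):
            errors.append(f"{activity}: expected {len(roles)} cells, got {len(cells)}")
            continue
        letters = [set(cell.split("/")) for cell in cells]
        accountable = sum("A" in s for s in letters)
        responsible = sum("R" in s for s in letters)
        if accountable != 1:
            errors.append(f"{activity}: {accountable} accountable roles (expected 1)")
        if responsible < 1:
            errors.append(f"{activity}: no responsible role")
    return errors


def accountable_role(activity: str, matrix: dict = RACI) -> str:
    """Look up which role is Accountable for an activity."""
    for role, cell in zip(ROLES, matrix[activity]):
        if "A" in cell.split("/"):
            return role
    raise ValueError(f"no accountable role for {activity!r}")
```

For example, `accountable_role("Docker packaging")` returns `"MLOps Engineer"`, matching the table row where the ML Engineer is Responsible and the MLOps Engineer is Accountable.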
3.5 Communication & Collaboration Tools
┌──────────────────────────────────────────────────┐
│            MLOPS COLLABORATION STACK             │
│                                                  │
│ Communication:  Slack, Teams                     │
│ Project Mgmt:   Jira, Linear, GitHub Issues      │
│ Documentation:  Confluence, Notion, GitHub Wiki  │
│ Experiments:    MLflow, W&B (shared access)      │
│ Dashboards:     Grafana, Looker (shared)         │
│ Code Review:    GitHub PRs with required         │
│                 reviews from MLE + MLOps         │
└──────────────────────────────────────────────────┘
Next → Chapter 04: Git & GitHub