Training Pipelines: Automate Everything
Stop running scripts manually. Build Azure ML pipelines that chain data prep, training, evaluation, and registration into reproducible, automated workflows.
Why pipelines?
Imagine a car assembly line vs building a car by hand.
Building by hand: one person does everything – welding, painting, engine, interior. If they're sick, nothing happens. If a step fails, you start over.
Assembly line: each station does one job. Raw metal goes in, a finished car comes out. If the painting station fails, you fix just that station. You can run the line 24/7.
ML pipelines are the assembly line for model training. Data prep → feature engineering → training → evaluation → registration. Each step is a reusable component. The whole pipeline runs automatically, logs everything, and can be triggered by GitHub Actions.
Notebooks vs scripts vs pipelines
| Feature | Reproducible | Automatable | Production-Ready | Best For |
|---|---|---|---|---|
| Notebooks (.ipynb) | Low – cell order matters | Hard – requires conversion | No | Exploration, EDA, prototyping |
| Scripts (.py) | Medium – deterministic | Yes – CLI/SDK submission | Partial | Single training jobs, simple workflows |
| Pipelines | High – defined DAG | Yes – CI/CD triggers | Yes | Production training, multi-step workflows |
Exam tip: Notebooks in production
The exam recognises notebooks for exploration and experimentation but NOT for production training. If a question asks "what should a team use for production model training," the answer is pipelines (or scripts submitted as jobs), never notebooks.
Notebooks are great for:
- Exploratory data analysis (EDA)
- Rapid prototyping
- Sharing results with stakeholders (visual outputs)
But they fail in production because:
- Cell execution order is fragile
- Hard to parameterise for different datasets
- Difficult to test and version reliably
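The parameterisation point is the key difference: a script exposes its knobs on the command line, so the same code runs unchanged against any dataset. A minimal sketch (the argument names here are illustrative, not part of any Azure ML API):

```python
# train.py – a parameterised training script (illustrative sketch)
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train the churn model")
    parser.add_argument("--training-data", required=True, help="Path to the input data")
    parser.add_argument("--target-column", default="churned", help="Label column name")
    parser.add_argument("--learning-rate", type=float, default=0.01)
    return parser.parse_args(argv)

if __name__ == "__main__":
    args = parse_args()
    # Real training code would go here; the point is every run is
    # fully specified by its arguments, not by hidden notebook state.
    print(f"Training on {args.training_data} (target={args.target_column})")
```

The same script can then be submitted as an Azure ML command job with different datasets, which is exactly what a pipeline step automates.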
Building a pipeline with Python SDK v2
```python
from azure.ai.ml import load_component, Input
from azure.ai.ml.dsl import pipeline

# Load reusable components from YAML definitions
prepare_data = load_component(source="components/prepare/component.yaml")
train_model = load_component(source="components/train/component.yaml")
evaluate_model = load_component(source="components/evaluate/component.yaml")

@pipeline(
    display_name="churn-training-pipeline",
    compute="gpu-training-cluster",
    experiment_name="churn-pipeline-runs"
)
def churn_pipeline(raw_data: Input, target_metric: float = 0.90):
    # Step 1: Data preparation
    prep_step = prepare_data(input_data=raw_data)

    # Step 2: Training (uses output from step 1)
    train_step = train_model(
        training_data=prep_step.outputs.cleaned_data,
        target_column="churned"
    )

    # Step 3: Evaluation (uses output from step 2)
    eval_step = evaluate_model(
        model=train_step.outputs.trained_model,
        test_data=prep_step.outputs.test_data,
        threshold=target_metric
    )
    return eval_step.outputs

# Create and submit the pipeline
# (ml_client is an authenticated MLClient for your workspace)
pipeline_job = churn_pipeline(
    raw_data=Input(type="uri_folder", path="azureml:churn-data:2")
)
returned_job = ml_client.jobs.create_or_update(pipeline_job)
```
What's happening:
- load_component turns each YAML definition into a reusable building block
- The @pipeline decorator defines the workflow metadata (display name, compute target, experiment)
- The pipeline function accepts inputs, so the same pipeline is parameterised for different datasets and thresholds
- Steps are chained by wiring outputs to inputs – Azure ML infers the execution order (the DAG) from these connections, with no explicit ordering needed
- A single create_or_update call submits the entire pipeline to the cloud
Pipeline YAML definition (alternative)
You can also define pipelines in YAML (often preferred for CI/CD):
```yaml
# pipelines/training-pipeline.yaml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: churn-training-pipeline
experiment_name: churn-pipeline-runs
compute: azureml:gpu-training-cluster

inputs:
  raw_data:
    type: uri_folder
    path: azureml:churn-data:2
  target_metric: 0.90

jobs:
  prepare:
    type: command
    component: file:components/prepare/component.yaml
    inputs:
      input_data: ${{parent.inputs.raw_data}}

  train:
    type: command
    component: file:components/train/component.yaml
    inputs:
      training_data: ${{parent.jobs.prepare.outputs.cleaned_data}}
      target_column: churned

  evaluate:
    type: command
    component: file:components/evaluate/component.yaml
    inputs:
      model: ${{parent.jobs.train.outputs.trained_model}}
      test_data: ${{parent.jobs.prepare.outputs.test_data}}
      threshold: ${{parent.inputs.target_metric}}
```
What's happening:
- The prepare job references the pipeline-level input via ${{parent.inputs.raw_data}}
- The train job references prepare's output – creating the dependency chain
- The evaluate job consumes outputs from both earlier jobs, plus the pipeline input target_metric as its threshold
Exam tip: Python DSL vs YAML pipelines
Both approaches create identical pipelines. The exam may test when to use each:
- YAML pipelines: better for CI/CD (GitHub Actions can submit them directly), version-controlled, easy to review in PRs
- Python DSL (@pipeline): better for complex logic, conditional steps, dynamic parameterisation

Most production MLOps teams use YAML for CI/CD pipelines and the Python DSL for experimentation.
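For illustration, a GitHub Actions workflow that submits the YAML pipeline with the Azure ML CLI might look like the sketch below – the workflow name, schedule, secret name, and resource-group/workspace values are all placeholder assumptions:

```yaml
# .github/workflows/train.yml – illustrative sketch
name: submit-training-pipeline
on:
  workflow_dispatch:        # manual trigger
  schedule:
    - cron: "0 6 1 * *"     # 06:00 UTC on the 1st of each month

jobs:
  submit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}   # service principal credentials
      - name: Submit pipeline job
        run: |
          az extension add -n ml
          az ml job create --file pipelines/training-pipeline.yaml \
            --resource-group my-rg --workspace-name my-workspace
```

Because the pipeline definition lives in the repo, the same PR review that changes a component also reviews how it will run.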
Scenario: Kai's automated retraining pipeline
NeuralSpark's churn model needs monthly retraining on fresh data. Kai builds a pipeline triggered by GitHub Actions on the 1st of each month:
- Data prep – pulls latest customer data, cleans, splits
- Training – trains on fresh data with the same hyperparameters
- Evaluation – compares new model against production baseline
- Gate – if new model beats baseline by more than 1%, proceed
- Registration – registers the new model in the registry

The pipeline runs unattended. If the new model isn't better, it stops at the gate and alerts the team.
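The gate itself is just a comparison against the production baseline before registration proceeds. A minimal sketch of that logic – the function is hypothetical (not an Azure ML API), and the 1% margin is read as one absolute percentage point of the metric:

```python
def passes_gate(new_metric: float, baseline_metric: float,
                min_improvement: float = 0.01) -> bool:
    """Return True only if the new model beats the baseline by more than the margin."""
    return new_metric > baseline_metric + min_improvement

# In the pipeline, the gate step runs this check and deliberately fails
# (alerting the team) instead of registering when it returns False.
print(passes_gate(0.87, 0.85))    # clear improvement -> True
print(passes_gate(0.855, 0.85))   # within the margin -> False
```

Keeping the gate as its own step means a failed check stops the pipeline cleanly, with the evaluation outputs already logged for the team to inspect.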
Step caching
Azure ML caches pipeline step outputs. If a stepβs inputs and code havenβt changed, Azure ML reuses the previous output instead of re-running.
This means:
- Changing only the training script re-runs training and evaluation, but skips data prep
- Changing the dataset re-runs everything from prep onwards
- Changing the evaluation threshold re-runs only evaluation
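Caching can also be controlled explicitly. A component marked non-deterministic is always re-run rather than reused; as a sketch, the relevant field in the component YAML (the name and command here are placeholders) looks like:

```yaml
# components/prepare/component.yaml (excerpt – illustrative)
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
name: prepare_data
type: command
is_deterministic: false   # opt this step out of output reuse
command: python prepare.py --input ${{inputs.input_data}}
```

A whole pipeline run can also skip the cache via the pipeline-level setting force_rerun: true under settings in the pipeline YAML.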
Knowledge check
NeuralSpark's training pipeline has 3 steps: data prep, training, and evaluation. Kai changes only the training script. What happens when the pipeline re-runs?
Dr. Fatima's compliance team requires that every production model training workflow is fully traceable and can be triggered automatically from CI/CD. What should she use?
🎬 Video coming soon
Next up: Distributed Training – scaling to datasets and models that don't fit on a single machine.