
AI-300 Study Guide

Domain 1: Design and Implement an MLOps Infrastructure

  • ML Workspace: Your AI Control Room Free
  • Data, Environments & Components
  • Compute Targets: Choosing the Right Engine
  • Infrastructure as Code: Provisioning at Scale
  • Git & CI/CD for ML Projects

Domain 2: Implement Machine Learning Model Lifecycle and Operations

  • MLflow: Track Every Experiment Free
  • AutoML & Hyperparameter Tuning
  • Training Pipelines: Automate Everything
  • Distributed Training: Scale to Big Data
  • Model Registration & Versioning
  • Model Approval & Responsible AI Gates
  • Deploying Models: Endpoints in Production
  • Drift, Monitoring & Retraining

Domain 3: Design and Implement a GenAIOps Infrastructure

  • Foundry: Hubs, Projects & Platform Setup Free
  • Network Security & IaC for Foundry
  • Deploying Foundation Models
  • Model Versioning & Production Strategies
  • PromptOps: Design, Compare, Version & Ship

Domain 4: Implement Generative AI Quality Assurance and Observability

  • Evaluation: Datasets, Metrics & Quality Gates Free
  • Safety Evaluations & Custom Metrics
  • Monitoring GenAI in Production
  • Cost Tracking, Logging & Debugging

Domain 5: Optimize Generative AI Systems and Model Performance

  • RAG Optimization: Better Retrieval, Better Answers Free
  • Embeddings & Hybrid Search
  • Fine-Tuning: Methods, Data & Production

Domain 2: Implement Machine Learning Model Lifecycle and Operations (~13 min read)

AutoML & Hyperparameter Tuning

Don't guess hyperparameters — sweep them. Learn how AutoML automates model selection and how sweep jobs tune hyperparameters to find the optimal configuration.

Finding the best model automatically

☕ Simple explanation

Imagine you’re buying a car but there are 500 models.

You could test-drive every single one — that would take years. Or you could tell a smart assistant: “I need a sedan, under $40K, good fuel economy” and let them narrow it down to 5 finalists for you to try.

AutoML does this for machine learning. Instead of manually trying Random Forest, then XGBoost, then Neural Net… AutoML tries dozens of algorithms and configurations automatically, then tells you which one performed best.

Hyperparameter tuning is the fine-tuning step: once you’ve chosen your car model, you adjust the seat, mirrors, and steering to get the perfect fit.

Azure ML offers two complementary approaches to automated model optimization:

  • AutoML — automatically tries multiple algorithms, feature engineering techniques, and hyperparameters to find the best model for your data. Best for: initial exploration, establishing a baseline, when you’re unsure which algorithm to use.
  • Sweep jobs (hyperparameter tuning) — systematically searches a defined hyperparameter space for a specific algorithm. Best for: when you’ve chosen an algorithm and need to optimise its configuration.

Both approaches use MLflow tracking to log every trial, so you can compare results and understand why one configuration beats another.

AutoML: automated model selection

AutoML in Azure ML automatically:

  1. Tries multiple algorithms (Random Forest, XGBoost, LightGBM, Neural Nets…)
  2. Applies feature engineering (encoding, scaling, imputation)
  3. Selects the best model based on your chosen metric
  4. Logs everything to MLflow

from azure.ai.ml import automl, Input

# Define an AutoML classification job
classification_job = automl.classification(
    training_data=Input(type="mltable", path="azureml:churn-data:2"),
    target_column_name="churned",
    primary_metric="AUC_weighted",
    compute="gpu-training-cluster",
    experiment_name="churn-automl-baseline",
)

# Configure limits
classification_job.set_limits(
    max_trials=50,            # Try up to 50 model configurations
    max_concurrent_trials=4,  # Run 4 trials in parallel
    timeout_minutes=120,      # Stop after 2 hours
    enable_early_termination=True  # Stop bad trials early
)

# Submit the job (ml_client is an authenticated MLClient)
returned_job = ml_client.jobs.create_or_update(classification_job)

What’s happening:

  • Lines 4-9: Defines a classification task — AutoML needs to know the data, target column, and which metric to optimise
  • Line 7: AUC_weighted is the metric AutoML maximises — it tries different algorithms to get the highest score
  • Lines 13-17: Limits prevent runaway costs — max 50 trials, 4 at a time, 2-hour cap
  • Line 17: Early termination stops trials that are clearly performing poorly
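
Those limits translate directly into a worst-case compute bill. A quick sanity check in plain Python, assuming one node per concurrent trial:

```python
# Worst-case compute consumed by the AutoML job above:
# at most 4 trials run concurrently, and the whole job stops
# after 120 minutes regardless of how many trials remain.
max_concurrent_trials = 4
timeout_minutes = 120

# Upper bound on billed node-hours
max_node_hours = max_concurrent_trials * (timeout_minutes / 60)
print(max_node_hours)  # → 8.0
```

So even if all 50 trials were slow, this job can never bill more than 8 node-hours of cluster time.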

AutoML task types

Task                       Use Case                   Example Metric
Classification             Predict a category         AUC_weighted, accuracy, F1
Regression                 Predict a number           RMSE, R2, MAE
Time-series forecasting    Predict future values      MAPE, RMSE
Image classification       Classify images            Accuracy
Object detection           Find objects in images     mAP
NLP text classification    Classify text documents    Accuracy, F1

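These metrics are standard ML measures rather than anything Azure-specific. For instance, F1 is the harmonic mean of precision and recall; a minimal sketch:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall,
    computed from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 80 true positives, 20 false positives, 40 false negatives
print(round(f1_score(80, 20, 40), 3))  # → 0.727
```
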
Scenario: Kai establishes a baseline fast

Kai has a new customer churn dataset and needs a baseline model by Friday. Instead of spending days trying different algorithms:

  1. Runs AutoML with 50 trials and a 2-hour timeout
  2. AutoML tries 12 algorithms with various feature engineering
  3. Best model: LightGBM with AUC of 0.943
  4. Kai logs the winner and uses it as the benchmark

Now the data science team knows: “Beat 0.943 AUC or we ship the AutoML model.”

Priya (CTO): “We have a production-ready baseline in 2 hours? I love this.”

Sweep jobs: hyperparameter tuning

Once you’ve chosen an algorithm, sweep jobs search for the best hyperparameters:

from azure.ai.ml.sweep import Choice, Uniform, BanditPolicy
from azure.ai.ml import command

# Define the training command
train_command = command(
    code="./src",
    command="python train.py "
            "--learning-rate ${{search_space.learning_rate}} "
            "--n-estimators ${{search_space.n_estimators}} "
            "--max-depth ${{search_space.max_depth}}",
    environment="azureml:churn-training:3",
    compute="gpu-training-cluster",
)

# Configure the sweep
sweep_job = train_command.sweep(
    sampling_algorithm="random",  # Bayesian can't be paired with early termination
    primary_metric="f1_score",
    goal="maximize",
)

sweep_job.search_space = {
    "learning_rate": Uniform(min_value=0.001, max_value=0.1),
    "n_estimators": Choice(values=[50, 100, 200, 500]),
    "max_depth": Choice(values=[5, 8, 10, 15, 20]),
}

# Early termination — stop bad runs
sweep_job.early_termination = BanditPolicy(
    slack_factor=0.1,
    evaluation_interval=2,
)

sweep_job.set_limits(max_total_trials=200, max_concurrent_trials=8)

# Submit (ml_client is an authenticated MLClient)
returned_job = ml_client.jobs.create_or_update(sweep_job)

What's happening:

  • Lines 6-12: The training script accepts hyperparameters as command-line arguments, filled in from the search space
  • Line 17: Random sampling picks configurations independently; Bayesian sampling (which learns from previous trials) can't be paired with an early termination policy
  • Lines 23-26: The search space defines ranges and choices — MLflow logs each combination tried
  • Lines 29-31: The Bandit policy cancels runs that fall behind the best run by more than 10%
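
One refinement worth knowing: learning rates are usually searched on a log scale, since the gap between 0.001 and 0.01 matters as much as the gap between 0.01 and 0.1. The azure.ai.ml.sweep module also provides a LogUniform distribution for this; here is the idea in plain Python:

```python
import math
import random

random.seed(0)

def sample_log_uniform(low: float, high: float) -> float:
    """Sample so every order of magnitude is equally likely,
    which is what a LogUniform(min, max) distribution does conceptually."""
    return math.exp(random.uniform(math.log(low), math.log(high)))

samples = [sample_log_uniform(1e-3, 1e-1) for _ in range(1000)]

# Roughly half the samples land below the geometric mean (0.01),
# whereas plain Uniform(0.001, 0.1) would put about 91% above 0.01.
below = sum(s < 0.01 for s in samples) / len(samples)
print(f"{below:.0%} of log-uniform samples below 0.01")
```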

Sampling algorithms

Hyperparameter sampling strategies:

Algorithm    Intelligence                      Speed                             Best For
Grid         None — tries every combination    Slow (exhaustive)                 Small search spaces, need all results
Random       None — picks randomly             Fast start, good coverage         Large spaces, initial exploration
Bayesian     Learns from previous trials       Slower per trial, fewer needed    When trials are expensive, want optimal result
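
To make the trade-off concrete, here is a toy sweep in plain Python (no Azure involved, with a made-up objective function): grid search must run every combination, while random sampling covers the same space in far fewer trials, at the risk of missing the exact optimum:

```python
import itertools
import random

random.seed(42)

def objective(lr: float, depth: int) -> float:
    """Fake validation score, peaking at lr=0.05, depth=10."""
    return 1.0 - abs(lr - 0.05) * 5 - abs(depth - 10) * 0.01

lrs = [0.001, 0.005, 0.01, 0.05, 0.1]
depths = [5, 8, 10, 15, 20]

# Grid: exhaustive, 5 x 5 = 25 trials, guaranteed to see the best cell
grid_trials = list(itertools.product(lrs, depths))
grid_best = max(objective(lr, d) for lr, d in grid_trials)

# Random: only 10 trials sampled from the same space
random_best = max(
    objective(random.choice(lrs), random.choice(depths)) for _ in range(10)
)

print(len(grid_trials), round(grid_best, 3), round(random_best, 3))
```

Random search can never beat the exhaustive grid on the same discrete space, but it usually gets close with a fraction of the trials, which is why it's the standard starting point for large spaces.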

Early termination policies

Policy                  How It Works                                                When to Use
Bandit                  Stops runs that lag behind the best by a slack factor       Most common — good balance of exploration and cost
Median stopping         Stops runs below the median of all runs at the same point   When you want to keep more diverse trials
Truncation selection    Cancels the bottom X% of runs at each interval              Aggressive pruning for large sweeps

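Each policy is a simple decision rule applied at every evaluation interval. A simplified plain-Python sketch of the two most common rules, for a metric being maximised (the real Azure policies also support slack_amount and delay_evaluation options):

```python
from statistics import median

def bandit_should_stop(metric: float, best: float, slack_factor: float) -> bool:
    """Bandit: stop when this run falls behind the best run
    by more than the slack factor (simplified)."""
    return metric < best / (1 + slack_factor)

def median_should_stop(metric: float, peer_metrics: list[float]) -> bool:
    """Median stopping: stop when this run is below the median of
    its peers at the same evaluation point (simplified)."""
    return metric < median(peer_metrics)

# Best run so far scores 0.90; with slack_factor=0.1 the cutoff is ~0.818
print(bandit_should_stop(0.80, 0.90, 0.1))   # stops: 0.80 < 0.818
print(median_should_stop(0.80, [0.70, 0.85, 0.90]))  # stops: 0.80 < 0.85
```
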
💡 Exam tip: Bayesian vs random sampling

The exam often tests when to use each sampling algorithm:

  • Random: best when the search space is large and you want broad coverage quickly. Also useful when you can afford many trials.
  • Bayesian: best when each trial is expensive (GPU hours) and you want to converge on the optimum with fewer trials. NOT compatible with early termination policies — use no termination policy with Bayesian sampling.
  • Grid: only practical for very small search spaces (under 20 combinations).

If the question mentions “limited compute budget” and “find the optimal configuration,” the answer is usually Bayesian.

Key terms flashcards

Q: AutoML vs sweep jobs — what's the difference?
A: AutoML tries multiple algorithms and feature engineering automatically (broad search). Sweep jobs search hyperparameters for ONE chosen algorithm (deep search). Use AutoML for a baseline, sweeps for optimization.

Q: What are the three sampling algorithms for sweep jobs?
A: Grid (exhaustive, every combination), Random (fast, broad coverage), Bayesian (learns from previous trials, fewer trials needed). Bayesian is best when trials are expensive.

Q: What does the Bandit early termination policy do?
A: Cancels runs that fall behind the best-performing run by more than a specified slack factor. Saves compute by stopping clearly underperforming trials.

Knowledge check

  1. Kai has a new dataset and needs a baseline model by Friday. He doesn't know which algorithm will work best. What should he use?
  2. Dr. Luca is running a hyperparameter sweep for a genomics model. Each trial uses an A100 GPU and takes 45 minutes. He has budget for about 30 trials. Which sampling algorithm should he choose?

🎬 Video coming soon


Next up: Training Pipelines — automating the entire training workflow end to end.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.