AZ-400 Study Guide

Domain 1: Design and Implement Processes and Communications

  • Work Item Tracking: Boards, GitHub & Flow
  • DevOps Metrics: Dashboards That Drive Decisions
  • Collaboration: Wikis, Teams & Release Notes

Domain 2: Design and Implement a Source Control Strategy

  • Branching Strategies: Trunk-Based, Feature & Release
  • Pull Requests: Policies, Protections & Merge Rules
  • Repository Management: LFS, Permissions & Recovery

Domain 3: Design and Implement Build and Release Pipelines

  • Package Management: Feeds, Versioning & Upstream
  • Testing Strategy: Quality Gates & Release Gates
  • Test Implementation: Code Coverage & Pipeline Tests
  • Azure Pipelines: YAML from Scratch
  • GitHub Actions: Workflows from Scratch
  • Pipeline Agents: Self-Hosted, Hybrid & VM Templates
  • Multi-Stage Pipelines: Templates, Variables & Approvals
  • Deployment Strategies: Blue-Green, Canary & Ring
  • Safe Rollouts: Slots, Dependencies & Hotfix Paths
  • Deployment Implementations: Containers, Scripts & Databases
  • Infrastructure as Code: ARM vs Bicep vs Terraform
  • IaC in Practice: Desired State & Deployment Environments
  • Pipeline Maintenance: Health, Migration & Retention

Domain 4: Develop a Security and Compliance Plan

  • Pipeline Identity: Service Principals, Managed IDs & OIDC
  • Authorization & Access: GitHub Roles & Azure DevOps Security
  • Secrets & Secure Pipelines: Key Vault & Workload Federation
  • Security Scanning: GHAS, Defender & Dependabot

Domain 5: Implement an Instrumentation Strategy

  • Monitoring for DevOps: Azure Monitor & App Insights
  • Metrics & KQL: Analysing Telemetry & Traces

Domain 3: Design and Implement Build and Release Pipelines

Pipeline Maintenance: Health, Migration & Retention

Monitor pipeline health with failure rates and flaky tests. Optimise performance with caching and parallel jobs. Migrate classic pipelines to YAML and design retention strategies.

Why Pipeline Maintenance Matters

☕ Simple explanation

Think of maintaining a car.

You would not drive 100,000 km without an oil change, tyre rotation, or brake check. The car still runs… until it does not. Pipelines are the same: they work fine until one day they are slow, flaky, or fail at the worst possible moment.

Pipeline maintenance means monitoring health (are builds failing too often?), optimising speed (why does this build take 45 minutes?), managing storage (do we really need build artifacts from 2 years ago?), and modernising (migrating from the old Classic editor to YAML).

Production pipelines are long-lived systems that degrade without active maintenance. The AZ-400 exam tests five maintenance areas: health monitoring (failure rate, duration, flaky tests), performance optimisation (caching, parallelism), cost optimisation (concurrency management, cancel-in-progress), retention strategies (artifact lifecycle), and classic-to-YAML migration. These are operational concerns that distinguish a senior DevOps engineer from someone who can only build new pipelines.

Pipeline Health Monitoring

Key Metrics

Metric | What It Measures | Healthy Target | Red Flag
Failure rate | Percentage of pipeline runs that fail | Below 10% | Above 25% consistently
Mean duration | Average time from trigger to completion | Stable or decreasing | Increasing week-over-week
Queue time | Time a run waits for an available agent | Under 2 minutes | Over 10 minutes regularly
Flaky test rate | Tests that pass/fail non-deterministically | Under 2% of test suite | Over 5% (undermines CI trust)
MTTR (Mean Time to Recovery) | How fast the team fixes a broken pipeline | Under 1 hour | Over 1 day

Azure Pipelines Analytics

Azure DevOps provides built-in analytics for pipeline health:

  • Pipeline pass rate: trend over time, filterable by branch and stage
  • Test analytics: identify flaky tests by tracking tests that pass and fail on the same code
  • Duration trends: spot regressions in build time
  • Pipeline runs dashboard widget: add to team dashboards for visibility

Access via: Pipelines > [select a pipeline] > Analytics tab, or add the Pipeline runs widget to team dashboards.

Flaky Tests

Flaky tests are tests that produce different results (pass/fail) for the same code without any changes. They destroy CI trust: developers start ignoring failures because "it is probably just flaky."

Common causes:

  • Timing dependencies (sleep, race conditions)
  • Shared test state (tests depend on execution order)
  • External service dependencies (network calls in unit tests)
  • Timezone and locale differences between agents

Azure DevOps flaky test detection: Azure Pipelines automatically flags tests as flaky when the same test passes and fails on the same code within a window. You can configure the system to not fail the build for known flaky tests while still tracking them for resolution.
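
System detection relies on rerun data, so the test task has to be allowed to retry failures. A minimal sketch, assuming a .NET test suite run with the VSTest@2 task on a Windows agent; the assembly pattern and the threshold values are illustrative assumptions, not recommendations from this guide:

```yaml
steps:
- task: VSTest@2
  displayName: 'Run tests with reruns'
  inputs:
    testSelector: 'testAssemblies'
    # Assumed naming convention for test assemblies; adjust to your solution
    testAssemblyVer2: |
      **\*Tests.dll
      !**\obj\**
    # Rerun failed tests so a pass-after-fail on the same code can be observed
    rerunFailedTests: true
    rerunType: 'basedOnTestFailurePercentage'
    rerunFailedThreshold: '30'   # skip reruns if more than 30% of tests fail
    rerunMaxAttempts: '3'
```

A test that fails and then passes on a rerun within the same run is a candidate for the flaky flag. Whether flaky tests count against the build is a project-level test management setting, not part of this YAML.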

Question

How does Azure DevOps detect flaky tests?

Answer

Azure Pipelines tracks test results across runs on the same branch and code version. If a test passes on one run and fails on another WITHOUT any code change, it is flagged as flaky. You can configure the pipeline to not fail the build for known flaky tests (they pass automatically) while tracking them in the Test analytics dashboard for resolution.

Question

What is pipeline queue time and why does it matter?

Answer

Queue time is the duration a pipeline run waits for an available agent before execution begins. High queue times indicate insufficient agent capacity or poor concurrency management. Solutions: add more agents (self-hosted pool), increase parallel job licenses (Microsoft-hosted), or optimise pipeline triggers to reduce concurrent demand.

Pipeline Optimisation

Caching

Pipeline caching stores dependencies between runs to avoid redundant downloads.

Azure Pipelines (Cache@2 task):

- task: Cache@2
  inputs:
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    path: '$(Pipeline.Workspace)/.npm'
    restoreKeys: |
      npm | "$(Agent.OS)"
  displayName: 'Cache npm packages'

GitHub Actions (actions/cache):

- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-

Cache key design: Use the lock file hash as part of the key. When dependencies change, the lock file changes, invalidating the cache and forcing a fresh install.
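
The Cache@2 snippet only helps if npm actually writes its cache into the folder being saved. A minimal sketch of the full pattern, assuming a Node.js project with package-lock.json at the repository root; the npm_config_cache variable redirects npm's cache into the pipeline workspace:

```yaml
variables:
  # Point npm's download cache at a folder inside the workspace so Cache@2 can save it
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  inputs:
    # Exact key: changes whenever package-lock.json changes
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    # Fallback prefix: restore the most recent cache for this OS on a key miss
    restoreKeys: |
      npm | "$(Agent.OS)"
    path: $(npm_config_cache)
  displayName: 'Cache npm packages'

# npm ci still runs on a cache hit, but resolves packages from the restored cache
- script: npm ci
  displayName: 'Install dependencies'
```

On an exact key hit the install step reads from local disk. On a partial hit through restoreKeys, npm downloads only the packages that changed.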

Parallel Jobs and Test Sharding

Parallel jobs run independent tasks simultaneously rather than sequentially:

# Azure Pipelines: matrix strategy
strategy:
  matrix:
    linux:
      vmImage: 'ubuntu-latest'
    windows:
      vmImage: 'windows-latest'
    mac:
      vmImage: 'macOS-latest'
pool:
  vmImage: $(vmImage)  # each matrix leg runs on its own image

Test sharding splits a large test suite across parallel agents:

strategy:
  parallel: 4  # Split tests across 4 agents

Each agent runs a slice of the test suite. Total test time drops from the full suite duration to roughly one-quarter (wall clock) at the cost of four parallel agent-minutes.
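
Setting parallel: 4 only creates four copies of the job; each copy still has to pick its own slice. A minimal sketch, assuming a Linux agent, test files under a tests/ folder, and a hypothetical run-tests.sh wrapper around your test runner. It uses the predefined variables System.JobPositionInPhase (1-based) and System.TotalJobsInPhase:

```yaml
jobs:
- job: Test
  pool:
    vmImage: 'ubuntu-latest'
  strategy:
    parallel: 4
  steps:
  - bash: |
      set -eo pipefail
      # Sort the list so every agent sees the same ordering
      mapfile -t all < <(find tests -name '*.test.js' | sort)
      shard=()
      for i in "${!all[@]}"; do
        # POSITION is 1-based, array indexes are 0-based, hence the -1
        if (( i % TOTAL == POSITION - 1 )); then
          shard+=("${all[$i]}")
        fi
      done
      echo "Agent $POSITION of $TOTAL runs ${#shard[@]} test files"
      # run-tests.sh is a hypothetical wrapper; replace with your runner's command
      ./run-tests.sh "${shard[@]}"
    displayName: 'Run test slice'
    env:
      POSITION: $(System.JobPositionInPhase)
      TOTAL: $(System.TotalJobsInPhase)
```

Round-robin slicing by file count is the simplest scheme. If a few files dominate the runtime, slicing by recorded test duration balances the agents better.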

Other Optimisation Techniques

  • Incremental builds: only rebuild changed modules (supported by MSBuild, Gradle, Bazel)
  • Docker layer caching: reuse unchanged layers in multi-stage builds
  • Pipeline triggers: use path filters to skip pipelines when only docs change
  • Cancel-in-progress: cancel running pipelines when a newer commit arrives on the same branch
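
The path-filter technique can be sketched as an Azure Pipelines CI trigger. The branch name and the docs folder are assumptions for illustration:

```yaml
trigger:
  batch: true        # while a run is in progress, fold new commits into one follow-up run
  branches:
    include:
    - main
  paths:
    exclude:
    - docs           # hypothetical documentation folder
    - README.md
```

A push that touches only excluded paths does not start a run. A push that touches both excluded and included paths still does.
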
Question

What is the purpose of cancel-in-progress in CI pipelines?

Answer

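Cancel-in-progress automatically cancels any running pipeline when a newer commit is pushed to the same branch. This prevents wasted compute on builds that are already outdated, especially during active development with frequent pushes. In GitHub Actions, use a concurrency group with cancel-in-progress: true. In Azure Pipelines YAML, the pr trigger (used for GitHub and Bitbucket Cloud repositories) has an autoCancel setting, on by default, that cancels in-progress runs for the same pull request when new commits arrive. For CI triggers, batch: true does not cancel the running build but holds new commits and builds them together once it finishes.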
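
A minimal GitHub Actions sketch of the concurrency setting. The workflow name, branch, and npm commands are placeholders:

```yaml
name: CI
on:
  push:
    branches: [main]
  pull_request:

# One group per workflow and ref: a new run on the same branch or PR
# cancels the run that is still in progress
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
```

Including github.ref in the group name keeps runs on different branches from cancelling each other.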

Cost and Concurrency Optimisation

Concurrency Management

Approach | How It Works | Impact
Parallel job licenses | Set the max concurrent pipelines (Microsoft-hosted) | More licenses = faster throughput but higher cost
Cancel-in-progress | Cancel older runs on same branch | Saves agent time, reduces cost
Path filters | Only trigger pipelines for relevant file changes | Prevents unnecessary runs
Scheduled pipelines | Run nightly instead of on every push for heavy tests | Reduces peak concurrency demand
Self-hosted agents | Use your own VMs or containers | Fixed cost per agent, no per-minute billing
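
The scheduled-pipelines row can be sketched as a separate nightly pipeline for the heavy suite. The cron time (evaluated in UTC), branch, and npm script name are assumptions:

```yaml
# Nightly pipeline for heavy tests: no CI or PR trigger, schedule only
trigger: none
pr: none

schedules:
- cron: '0 2 * * *'            # 02:00 UTC every day
  displayName: 'Nightly full test run'
  branches:
    include:
    - main
  always: false                # skip the run if nothing changed since the last scheduled run

pool:
  vmImage: 'ubuntu-latest'

steps:
- script: npm ci && npm run test:integration   # hypothetical script name
  displayName: 'Run heavy test suite'
```

Keeping the fast unit tests on the push-triggered pipeline and moving only the slow suite here preserves quick feedback while lowering daytime agent demand.
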
💡 Exam tip: Cost vs speed trade-offs

The exam presents scenarios asking you to optimise for cost OR time. Key trade-offs:

  • More parallel jobs = faster builds but higher licensing cost
  • Self-hosted agents = lower per-minute cost but you pay for VM infrastructure and maintenance
  • Caching = faster builds at near-zero cost (just storage); almost always worth implementing
  • Cancel-in-progress = saves money AND time; no trade-off, enable it by default
  • Test sharding = faster test runs but uses more parallel agent capacity

When the exam says "minimise cost", look for caching, cancel-in-progress, and path filters. When the exam says "minimise time", look for parallelism, sharding, and more agents.

Retention Strategies

Retention policies determine how long pipeline artifacts, test results, and run history are kept. Too short and you lose audit trails; too long and storage costs explode.

Artifact Type | Recommended Retention | Rationale
Build artifacts (binaries, packages) | 30-90 days for development branches, 1 year for release branches | Release artifacts may need redeployment; dev artifacts are disposable
Pipeline run history | 1-2 years | Audit and trend analysis
Test results | 90 days minimum | Flaky test detection needs history; compliance may require longer
Container images in ACR | Keep tagged releases indefinitely, purge untagged after 30 days | Use ACR retention policies and scheduled purge tasks
NuGet/npm packages | Keep published versions indefinitely, purge pre-release after 90 days | Downstream projects may pin specific versions

Configuring Retention in Azure Pipelines

  • Project-level settings: default retention for all pipelines (Settings > Pipelines > Retention)
  • Pipeline-level override: individual pipelines can set longer retention for critical builds
  • Retention leases: protect specific runs from automatic cleanup (e.g., a production release)
  • Azure Artifacts retention: separate from pipeline retention; configured per feed

Question

What is a retention lease in Azure Pipelines?

Answer

A retention lease protects a specific pipeline run from automatic cleanup, regardless of the retention policy. Use it for production releases: you want to keep the build artifacts, test results, and deployment logs for that specific run even after the general retention window expires. Leases can be created manually, by pipeline tasks, or by release pipelines automatically.
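
Creating a lease from a pipeline task can be sketched as a call to the retention leases REST endpoint. This is a sketch under assumptions: the release/ branch naming, the 365-day window, and the API version shown are illustrative, and the pipeline's build identity needs permission to manage retention:

```yaml
steps:
- task: PowerShell@2
  displayName: 'Retain release builds'
  # Only lease successful runs on release branches (assumed naming: release/*)
  condition: and(succeeded(), startsWith(variables['Build.SourceBranch'], 'refs/heads/release/'))
  inputs:
    targetType: 'inline'
    failOnStderr: true
    script: |
      $headers = @{ Authorization = 'Bearer $(System.AccessToken)' }
      # One lease for this run; daysValid is an assumed value, adjust to your policy
      $lease = @{
        daysValid       = 365
        definitionId    = $(System.DefinitionId)
        ownerId         = 'User:$(Build.RequestedForId)'
        protectPipeline = $false
        runId           = $(Build.BuildId)
      }
      # The endpoint expects a JSON array of leases
      $body = ConvertTo-Json @($lease)
      $uri = "$(System.CollectionUri)$(System.TeamProject)/_apis/build/retention/leases?api-version=6.0-preview.1"
      Invoke-RestMethod -Uri $uri -Method POST -Headers $headers -ContentType 'application/json' -Body $body
```

The run stays protected until the lease expires or is removed, independent of the project's default retention window.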

Migrating Classic to YAML

Classic pipelines use the visual editor (GUI) to define build and release pipelines. YAML pipelines define everything as code in an azure-pipelines.yml file checked into the repository.

Microsoft recommends YAML for all new pipelines; classic is in maintenance mode.

Capability | Classic (Visual Editor) | YAML (Pipeline as Code)
Definition | GUI-based, stored in Azure DevOps | Code in repository (azure-pipelines.yml)
Version control | Limited: no native Git versioning of pipeline definition | Full Git history: branch, diff, review, revert
Code review | No PR-based review of pipeline changes | Pipeline changes go through PR review like application code
Branching | One pipeline definition shared across branches | Pipeline definition can differ per branch
Templates | Task groups (limited reuse) | Templates with parameters (powerful composition)
Multi-stage | Separate Build and Release definitions | Unified stages in a single YAML file
Environments | Deployment groups | Environments with approval gates and checks
Future direction | No new features being added | All investment going into YAML
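
The Multi-stage and Environments rows can be sketched as a single YAML file with a build stage and a deployment stage. The environment name, artifact name, and echo steps are placeholders; approvals and checks are configured on the environment in Azure DevOps, not in the YAML itself:

```yaml
trigger:
- main

stages:
- stage: Build
  jobs:
  - job: BuildApp
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    # Placeholder build step that writes one file so there is something to publish
    - script: echo "build output" > $(Build.ArtifactStagingDirectory)/app.txt
      displayName: 'Build'
    - publish: $(Build.ArtifactStagingDirectory)
      artifact: drop

- stage: DeployProd
  dependsOn: Build
  condition: succeeded()
  jobs:
  - deployment: Deploy
    pool:
      vmImage: 'ubuntu-latest'
    # Hypothetical environment; the run pauses here until its approvals and checks pass
    environment: 'production'
    strategy:
      runOnce:
        deploy:
          steps:
          - download: current
            artifact: drop
          - script: echo "deploy here"
            displayName: 'Deploy'
```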

The 6-Step Migration Process

💡 Scenario: Nadia migrates Meridian from Classic to YAML

Nadia leads the migration of 47 classic pipelines at Meridian Insurance. Her 6-step process:

Step 1. Inventory and prioritise: Export all classic pipeline definitions. Categorise by complexity (simple CI, multi-stage, release pipelines with gates). Start with the simplest.

Step 2. Use the "View YAML" feature: Azure DevOps lets you view the YAML equivalent of a classic pipeline. This generates a starting point (though it often needs cleanup).

Step 3. Convert build pipelines first: Build pipelines are simpler than release pipelines. Convert them to YAML, test on a branch, and validate that outputs match the classic version.

Step 4. Convert release pipelines to multi-stage YAML: Classic release pipelines with stages, gates, and approvals become YAML stages with environment approvals and deployment jobs.

Step 5. Add YAML-only features: Leverage features that Classic does not support: templates for reuse across pipelines, conditional insertions, matrix strategies, and pipeline-as-code review through PRs.

Step 6. Decommission classic pipelines: Run both classic and YAML in parallel for one sprint. Once the YAML pipeline is validated, disable (do not delete) the classic pipeline for rollback safety. Delete after 30 days of successful YAML runs.
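
The templates and conditional insertion mentioned in Step 5 can be sketched as follows. The block shows two separate files, marked by comments; the template path, parameter names, and Node.js steps are hypothetical:

```yaml
# --- File 1: templates/npm-build.yml (hypothetical path) ---
parameters:
- name: nodeVersion
  type: string
  default: '20.x'
- name: runTests
  type: boolean
  default: true

steps:
- task: NodeTool@0
  inputs:
    versionSpec: ${{ parameters.nodeVersion }}
- script: npm ci
  displayName: 'Install dependencies'
# Conditional insertion: the test step exists only when runTests is true
- ${{ if eq(parameters.runTests, true) }}:
  - script: npm test
    displayName: 'Run tests'

# --- File 2: azure-pipelines.yml in a consuming repository ---
# steps:
# - template: templates/npm-build.yml
#   parameters:
#     nodeVersion: '18.x'
#     runTests: false
```

Because the condition is evaluated at template expansion time, a pipeline that passes runTests: false never contains the test step at all, unlike a runtime condition that would show it as skipped.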

Nadia estimates 3 months for the full migration. She tracks progress on a shared dashboard, converting 3-4 pipelines per sprint. Dmitri (VP Eng) approves the plan because YAML pipelines can be code-reviewed, a requirement Elena (compliance) has been requesting for audit purposes.

Question

What Azure DevOps feature helps start a Classic-to-YAML migration?

Answer

The 'View YAML' button on classic pipeline tasks generates the YAML equivalent of the current classic definition. It provides a starting point for conversion, though the output usually needs cleanup: variable groups, service connections, and complex release stages may require manual translation.

Knowledge Check

A team's CI pipeline takes 35 minutes. Most of the time is spent downloading npm packages (8 minutes) and running tests (22 minutes across 500 tests). How should they optimise?

Knowledge Check

Nadia's team wants to ensure that the build artifacts for every production release are kept indefinitely, even though the default project retention is 30 days. What should she configure?

Knowledge Check

Which capability is available in YAML pipelines but NOT in Classic pipelines?

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.