DevOps Metrics: Dashboards That Drive Decisions
Design dashboards with DORA metrics, cycle time, lead time, and deployment frequency. Learn which metrics matter for planning, development, testing, security, delivery, and operations.
Why Metrics Matter in DevOps
Think of a car dashboard.
You're driving down the motorway. The speedometer tells you how fast you're going. The fuel gauge tells you how far you can go. The engine warning light tells you something needs attention. Without a dashboard, you're driving blind: you don't know if you're going too fast, running out of fuel, or about to break down.
DevOps metrics are your engineering team's dashboard. Deployment frequency is your speedometer. Lead time is your fuel efficiency. Failure rate is your engine warning light. Without metrics, you're shipping software blind.
The Four DORA Metrics
DORA metrics are the gold standard for measuring DevOps performance. They come from DORA (DevOps Research and Assessment), the research program behind the Accelerate book and the annual State of DevOps Report. The exam expects you to know all four metrics, their definitions, and the elite performance thresholds.
| Metric | What It Measures | Elite | High | Medium | Low |
|---|---|---|---|---|---|
| Deployment Frequency | How often you deploy to production | On-demand (multiple per day) | Weekly to monthly | Monthly to every 6 months | Fewer than once per 6 months |
| Lead Time for Changes | Time from commit to production | Less than 1 hour | 1 day to 1 week | 1 week to 1 month | 1 to 6 months |
| Change Failure Rate | Percentage of deployments causing a failure | 0β15% | 16β30% | 31β45% | 46β60% |
| Time to Restore Service (MTTR) | Time to recover from a production failure | Less than 1 hour | Less than 1 day | 1 day to 1 week | More than 1 week |
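All four metrics in the table can be derived from a single deployment log. Below is a minimal sketch, not an official tool: the record shape (`commit_at`, `deployed_at`, `failed`, `restored_at`) is a hypothetical example, so adapt the field names to whatever your pipeline actually emits.

```python
from datetime import datetime, timedelta

def dora_summary(deployments, period_days):
    """Summarise the four DORA metrics from a deployment log.

    Each record is a dict with a hypothetical shape:
      {"commit_at": datetime, "deployed_at": datetime,
       "failed": bool, "restored_at": datetime or None}
    """
    n = len(deployments)
    lead_times = [d["deployed_at"] - d["commit_at"] for d in deployments]
    failures = [d for d in deployments if d["failed"]]
    restores = [d["restored_at"] - d["deployed_at"]
                for d in failures if d["restored_at"] is not None]
    mttr = sum(restores, timedelta()) / len(restores) if restores else None
    return {
        # Deployment Frequency: deploys per day over the reporting window
        "deploys_per_day": n / period_days,
        # Lead Time for Changes: average commit-to-production time
        "avg_lead_time_hours": (sum(lead_times, timedelta()) / n).total_seconds() / 3600,
        # Change Failure Rate: share of deployments that caused a failure
        "change_failure_rate": len(failures) / n,
        # Time to Restore Service (MTTR), None if nothing failed
        "mttr_hours": mttr.total_seconds() / 3600 if mttr is not None else None,
    }

# Example log: two deploys in a 10-day window, one failed and restored in 1 hour
log = [
    {"commit_at": datetime(2024, 1, 1, 0), "deployed_at": datetime(2024, 1, 1, 2),
     "failed": False, "restored_at": None},
    {"commit_at": datetime(2024, 1, 5, 0), "deployed_at": datetime(2024, 1, 5, 6),
     "failed": True, "restored_at": datetime(2024, 1, 5, 7)},
]
summary = dora_summary(log, period_days=10)
```

Comparing each number against the threshold bands in the table then gives the Elite/High/Medium/Low rating.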
Scenario: Nadia Tracks DORA at Meridian
Nadia presents DORA metrics to Dmitri (VP Engineering) monthly:
- Deployment Frequency: Jumped from monthly to weekly after migrating from classic to YAML pipelines
- Lead Time: Dropped from 3 weeks to 4 days by implementing trunk-based development
- Change Failure Rate: Reduced from 22% to 8% by adding automated integration tests in the pipeline
- MTTR: Reduced from 6 hours to 45 minutes by implementing feature flags and automated rollbacks
Dmitri doesn't need to understand pipeline YAML; he sees four numbers and a trend. That's what a good DevOps dashboard delivers.
Cycle Time vs Lead Time: The Critical Difference
This is one of the most commonly tested concepts on AZ-400. Many candidates confuse the two.
```
Customer requests feature
        |
        v
+--- Lead Time ----------------------------------------------+
|                                                            |
|  Backlog wait --->  +--- Cycle Time ----------+            |
|  (days/weeks)       |                         |            |
|                     |  Dev -> Review -> Test  | ---> Deploy|
|                     +-------------------------+            |
+------------------------------------------------------------+
```
- Lead Time = total time from request to delivery (includes backlog wait)
- Cycle Time = time from work started to work completed (active only)
- A team with a 2-day cycle time but 3-week lead time has a backlog problem, not a development speed problem
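The distinction is easy to check numerically. A minimal sketch, assuming your tracking tool can give you three timestamps per work item (created, work started, completed); the function name and shape are illustrative only:

```python
from datetime import datetime

def flow_times(created, started, completed):
    """Return (lead_time, cycle_time, backlog_wait) in days for one work item.

    created   - when the request entered the backlog
    started   - when active work began (item moved to 'Doing')
    completed - when the item was delivered
    """
    lead = (completed - created).days    # request -> done
    cycle = (completed - started).days   # started -> done
    return lead, cycle, lead - cycle     # the gap is pure backlog wait

# Example: a 2-day cycle time hiding inside an 18-day lead time
lead, cycle, wait = flow_times(
    created=datetime(2024, 3, 1),
    started=datetime(2024, 3, 17),
    completed=datetime(2024, 3, 19),
)
```

Here the item waited 16 of its 18 days in the backlog, which is a prioritisation problem, not a development speed problem.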
Exam Tip: Lead Time Questions
When the exam says "lead time for changes" in the context of DORA metrics, it specifically means the time from code commit to running in production, not from customer request. This is a narrower definition than the general Kanban "lead time". Always read the context: DORA lead time starts at commit; Kanban lead time starts at request.
Metrics by Lifecycle Stage
The exam asks you to design metrics for six lifecycle stages. Here's what to measure at each stage and which dashboard widgets to use.
Planning Metrics
| Metric | What It Shows | Dashboard Widget |
|---|---|---|
| Velocity | Story points completed per sprint | Velocity chart |
| Sprint burndown | Work remaining vs time remaining | Burndown chart |
| Cumulative flow | Items in each state over time β reveals bottlenecks | Cumulative flow diagram |
| Backlog health | Ratio of refined to unrefined items | Query tile |
| Cycle time / Lead time | How long items take from start to done | Cycle time chart (Analytics) |
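A cumulative flow diagram is just a table of per-state item counts sampled daily. If your tool exposes state-change history, the underlying data can be derived with a short sketch like this (the event shape is a hypothetical assumption, not a real API):

```python
from collections import Counter
from datetime import date, timedelta

def cumulative_flow(transitions, start, end):
    """Count items in each state per day from state-change events.

    transitions: {item_id: [(date, new_state), ...]} sorted by date.
    Returns {day: Counter({state: count})}, the rows behind a CFD.
    """
    rows = {}
    day = start
    while day <= end:
        counts = Counter()
        for events in transitions.values():
            state = None
            for when, new_state in events:
                if when <= day:
                    state = new_state  # last transition on or before this day
            if state is not None:
                counts[state] += 1
        rows[day] = counts
        day += timedelta(days=1)
    return rows

rows = cumulative_flow(
    {101: [(date(2024, 1, 1), "New"), (date(2024, 1, 2), "Active"),
           (date(2024, 1, 3), "Done")],
     102: [(date(2024, 1, 2), "New")]},
    start=date(2024, 1, 1), end=date(2024, 1, 3),
)
```

A widening band for one state in the resulting chart is the bottleneck the table row above refers to.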
Development Metrics
| Metric | What It Shows | Dashboard Widget |
|---|---|---|
| Commit frequency | How often developers push code | Code churn chart |
| PR cycle time | Time from PR created to PR merged | PR analytics (GitHub Insights) |
| PR size | Lines changed per PR (smaller is better) | Custom query |
| Build success rate | Percentage of builds that pass | Build history chart |
| Code coverage trend | Test coverage over time | Test results trend |
Testing Metrics
| Metric | What It Shows | Dashboard Widget |
|---|---|---|
| Test pass rate | Percentage of tests passing per run | Test results trend |
| Test execution time | How long the test suite takes | Pipeline analytics |
| Flaky test count | Tests that intermittently fail | Custom query on test history |
| Code coverage | Percentage of code exercised by tests | Code coverage widget |
| Bug escape rate | Bugs found in production vs in testing | Query tile |
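"Flaky test" deserves a concrete definition: a test that both passed and failed against the same commit, meaning the code did not change but the outcome did. A minimal sketch of that check, assuming run history is available as (test, commit, passed) tuples:

```python
from collections import defaultdict

def flaky_tests(runs):
    """Identify flaky tests from run history.

    runs: list of (test_name, commit_sha, passed) tuples.
    A test is flagged if it recorded both a pass and a fail
    for the same commit: same code, different outcome.
    """
    outcomes = defaultdict(set)
    for name, sha, passed in runs:
        outcomes[(name, sha)].add(passed)
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})

runs = [("test_login", "abc123", True), ("test_login", "abc123", False),
        ("test_pay", "abc123", True), ("test_pay", "def456", False)]
```

Note that `test_pay` is not flagged: it failed on a different commit, so the failure may be a genuine regression rather than flakiness.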
Security Metrics
| Metric | What It Shows | Dashboard Widget |
|---|---|---|
| Vulnerability count | Open security vulnerabilities by severity | Dependency scanning report |
| Time to remediate | Average time to fix a security finding | Custom query |
| Compliance scan pass rate | Percentage of builds passing security gates | Pipeline analytics |
| Secret exposure incidents | Leaked credentials detected | GitHub secret scanning alerts |
| OWASP top 10 coverage | Which OWASP categories are tested | Custom dashboard |
Delivery Metrics
| Metric | What It Shows | Dashboard Widget |
|---|---|---|
| Deployment frequency | How often you deploy to production | Release analytics |
| Deployment duration | How long deployments take | Release pipeline analytics |
| Rollback rate | How often you need to roll back | Custom query |
| Environment health | Status of each environment | Deployment status widget |
| Release approval time | Time spent waiting for approvals | Custom query |
Operations Metrics
| Metric | What It Shows | Dashboard Widget |
|---|---|---|
| MTTR | Mean time to restore from incidents | Incident analytics |
| MTBF | Mean time between failures | Custom calculation |
| Availability (uptime %) | Service reliability over time | Azure Monitor integration |
| Incident count and severity | How many incidents and their impact | Query tile |
| Alert noise ratio | Actionable alerts vs false positives | Monitoring analytics |
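MTTR, MTBF, and availability are all derived from the same incident log. A minimal sketch, assuming incidents are recorded as downtime intervals (in hours since the reporting window began):

```python
def reliability_summary(incidents, window_hours):
    """Compute MTTR, MTBF and availability from downtime intervals.

    incidents: list of (start_hour, end_hour) tuples within the window.
    """
    downtime = sum(end - start for start, end in incidents)
    # MTTR: average time to restore per incident
    mttr = downtime / len(incidents) if incidents else 0.0
    # MTBF: total uptime divided by the number of failures
    mtbf = (window_hours - downtime) / len(incidents) if incidents else float("inf")
    # Availability: percentage of the window the service was up
    availability = 100.0 * (window_hours - downtime) / window_hours
    return mttr, mtbf, availability

# Two incidents in a 30-day (720-hour) window: 1 hour and 3 hours of downtime
mttr, mtbf, availability = reliability_summary([(10, 11), (50, 53)], window_hours=720)
```

This is the arithmetic behind the dashboard widgets; in practice Azure Monitor or your incident tool computes it for you.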
Azure DevOps Dashboards vs GitHub Insights
| Capability | Azure DevOps Dashboards | GitHub Insights |
|---|---|---|
| Dashboard creation | Fully customisable with drag-and-drop widgets | Pre-built views per repository (Pulse, Contributors, Traffic) |
| Widget library | 30+ built-in widgets (charts, queries, build status, test results) | Limited to built-in views; Actions tab for CI/CD |
| Custom queries | Query editor with flat/tree/direct links modes powering charts | Filter-based views in Projects; no query language |
| DORA metrics | Available via DORA Metrics extension or Analytics views | Available in GitHub Insights for organisations (Enterprise) |
| Burndown/velocity | Built-in sprint burndown and velocity widgets | Not built-in; use third-party actions or Projects roadmap |
| Sharing | Dashboards shared at project or team level with permissions | Insights visible to repository contributors |
| Third-party integration | Extensible via marketplace widgets | GitHub Marketplace Actions for reporting |
| Audience | Multiple dashboards for different audiences (exec, team, ops) | Single set of insights per repository |
Building Effective Dashboards
Principle: Design dashboards for the audience, not the tools.
| Audience | What They Need | Example Widgets |
|---|---|---|
| Executives (CTO, VP Eng) | DORA trends, velocity, strategic progress | DORA metrics, deployment frequency trend, Epic burndown |
| Team leads | Sprint health, blockers, PR bottlenecks | Sprint burndown, cumulative flow, PR cycle time |
| Developers | Build status, test results, assigned items | Build history, test results trend, assigned work items |
| Operations | Service health, incident trends, MTTR | Availability, incident count, MTTR trend |
Scenario: Nadia's Three-Tier Dashboard Strategy
Nadia creates three dashboards at Meridian Insurance:
Executive Dashboard: Dmitri and the leadership team see DORA metrics, monthly deployment trends, and a high-level Epic progress board. Updated monthly.
Team Dashboard: each sprint team has their own dashboard with sprint burndown, cumulative flow, build success rate, and PR cycle time. Updated daily (auto-refresh).
Operations Dashboard: the NOC monitors Azure Monitor alerts, MTTR trends, service health, and active incident count. Real-time.
Each dashboard answers different questions for different people. The executive dashboard does not show build pass rates. The operations dashboard does not show sprint burndown. Mixing audiences creates noise.
Querying for Metrics
Azure DevOps Queries
Azure DevOps queries power most dashboard widgets. Key query patterns for the exam:
- "Show me all bugs created this sprint" → flat query, filter on Work Item Type = Bug, Iteration Path = current
- "Show me all stories and their child tasks" → tree query, parent type = User Story, child type = Task
- "Show me all stories with linked test cases" → direct links query, link type = Tested By
- "Show me items that changed state in the last 7 days" → use @Today - 7 in the Changed Date field
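Under the hood, these patterns are expressed in WIQL (Work Item Query Language), the syntax behind the Azure DevOps query editor. Two sketches corresponding to the first and last bullets above; the `System.*` field names and the `@CurrentIteration` and `@Today` macros are standard, but verify them against your process template before relying on them:

```sql
SELECT [System.Id], [System.Title]
FROM WorkItems
WHERE [System.WorkItemType] = 'Bug'
  AND [System.IterationPath] = @CurrentIteration
```

```sql
SELECT [System.Id], [System.State]
FROM WorkItems
WHERE [System.ChangedDate] >= @Today - 7
```

Saved queries like these are what back query tiles and chart widgets on a dashboard.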
GitHub: Using GitHub Actions for Custom Metrics
GitHub doesn't have a built-in query engine like Azure Boards, but you can collect metrics through:
- GitHub Actions: run scheduled workflows that query the API and post results
- GitHub CLI (`gh`): script queries for issues, PRs, and releases
- Third-party tools: LinearB, Sleuth, Swarmia integrate with GitHub for DORA metrics
- GitHub Insights (Enterprise): pre-built DORA metrics dashboard
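For example, PR cycle time can be scripted from the JSON the CLI emits. The sketch below is a pure function you could feed with the output of a command like `gh pr list --state merged --json createdAt,mergedAt` (check the flag names against your installed `gh` version; the function itself is just an illustration):

```python
import json
import statistics
from datetime import datetime

def median_pr_cycle_hours(pr_json):
    """Median hours from PR creation to merge.

    pr_json: JSON text shaped like a list of objects with
    ISO-8601 "createdAt" and "mergedAt" timestamps, e.g. the
    output of: gh pr list --state merged --json createdAt,mergedAt
    """
    def parse(ts):
        # fromisoformat in older Pythons rejects the trailing "Z"
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))
    hours = [
        (parse(pr["mergedAt"]) - parse(pr["createdAt"])).total_seconds() / 3600
        for pr in json.loads(pr_json)
    ]
    return statistics.median(hours)

sample = json.dumps([
    {"createdAt": "2024-01-01T00:00:00Z", "mergedAt": "2024-01-01T12:00:00Z"},
    {"createdAt": "2024-01-02T00:00:00Z", "mergedAt": "2024-01-03T00:00:00Z"},
    {"createdAt": "2024-01-04T00:00:00Z", "mergedAt": "2024-01-04T06:00:00Z"},
])
```

Run on a schedule from a GitHub Actions workflow, a script like this can post the trend to a dashboard or a team channel.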
- Nadia's VP of Engineering wants a single number that captures how quickly the team recovers from production incidents. Which metric should Nadia add to the executive dashboard?
- A team has a cycle time of 2 days but a lead time of 18 days. What does this indicate?
- Which Azure DevOps dashboard widget visualises the number of work items in each Kanban column over time, helping identify where work accumulates?
Video coming soon
DevOps Metrics Deep Dive
Next up: Collaboration: Wikis, Teams & Release Notes