Agent Lifecycle: From Dev to Production

Putting It All Together

Over the last four modules, you’ve learned the individual pieces: test sets, evaluation methods, solutions, environment variables, and Pipelines. This module synthesizes everything into a complete lifecycle — the end-to-end journey an agent takes from first idea to production deployment and beyond.

🏢 AgentForge scenario: Priya’s team has won a contract with Pacific Mutual Insurance. 🔧 Kai from TechBridge Consulting is the implementation partner. Together, they need to deliver a claims triage agent — from concept to production — following enterprise ALM practices.

Let’s walk through every phase.

Phase 1: Plan

Before writing a single topic or instruction, the team defines:

Business requirements — What should the claims triage agent do? Handle first notice of loss, route to correct adjuster team, and answer policyholder FAQs.
Success metrics — 85% accuracy on test sets, average response time under 3 seconds, 90% user satisfaction.
Environment strategy — Three environments: Dev (building), UAT (stakeholder testing), Production (policyholders).
Data requirements — What knowledge sources are needed? Claims procedures manual, adjuster routing rules, policy FAQ document.
Security and compliance — What data classification applies? PII handling for policyholder information.

Kai emphasizes: “The plan phase saves more time than any other. Every hour spent planning saves three hours of rework.”

Phase 2: Build

Development happens in the dev environment inside an unmanaged solution:

Create the solution with a custom publisher (publisher prefix pacmutual_)
Build the agent: system instructions, knowledge sources, topics, actions
Define environment variables for anything that will differ across environments (company name, API endpoints, support email, feature flags)
Create connection references for external integrations
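
The idea behind environment variables can be sketched in plain Python: a default value ships with the solution, and each environment can override it with its own current value. This is only an illustrative model, not the platform's API. The class, the schema name pacmutual_SupportEmail, and the email addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentVariable:
    """Illustrative model: a default plus optional per-environment overrides."""
    name: str
    default: str
    current_values: dict[str, str] = field(default_factory=dict)

    def resolve(self, environment: str) -> str:
        # A current value set in the target environment wins over the default.
        return self.current_values.get(environment, self.default)


support_email = EnvironmentVariable(
    name="pacmutual_SupportEmail",  # hypothetical schema name
    default="support@dev.example.com",
    current_values={"production": "claims-support@example.com"},
)

print(support_email.resolve("dev"))
print(support_email.resolve("production"))
```

Because the agent only ever asks for the variable by name, the same packaged solution can behave correctly in every environment without being edited.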

The key discipline during build: everything goes in the solution. If you create a cloud flow outside the solution, it won’t travel with the agent when you deploy. Kai checks the solution’s component list daily — if anything is missing, he adds it before it becomes a deployment surprise.
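
Kai's daily check amounts to a set difference: everything the agent depends on, minus everything already in the solution. A minimal sketch, assuming you keep both lists by hand or export them from somewhere (the component names below are made up for illustration):

```python
def find_missing_components(required: set[str], in_solution: set[str]) -> set[str]:
    """Return components the agent depends on that are not yet in the solution."""
    return required - in_solution


# Hypothetical component names, for illustration only.
required = {
    "agent: Claims Triage",
    "cloud flow: Notify Adjuster",
    "environment variable: pacmutual_SupportEmail",
    "connection reference: Claims API",
}
in_solution = {
    "agent: Claims Triage",
    "environment variable: pacmutual_SupportEmail",
    "connection reference: Claims API",
}

missing = find_missing_components(required, in_solution)
for name in sorted(missing):
    print(f"Not in solution: {name}")
```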

Question

Why must every component be added to the solution during the build phase?

Answer

Components outside the solution won't travel when you export and deploy. A cloud flow, environment variable, or connection reference created outside the solution will be missing in UAT and production, causing the agent to break. Adding everything during build prevents deployment surprises.

Phase 3: Test

With the agent built, Mira (QA lead) takes over:

Create test sets — baseline (auto-generated from knowledge), edge cases (manual), regression (from pilot conversations)
Run evaluations — accuracy, grounding quality, topic matching, response quality
Review results — identify failure patterns, classify root causes
Iterative improvement — fix, re-evaluate, repeat until thresholds are met
Document — record test results, known limitations, and test set composition for audit

Testing starts in the dev environment, and the agent stays there until it passes. There’s no point deploying to UAT if the agent can’t pass its own test suite where it was built.

Mira’s rule: “If it doesn’t pass in dev, it doesn’t leave dev.”
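
Mira's rule can be expressed as a simple gate over the thresholds from the plan phase. The sketch below is one way to do it, assuming evaluation results arrive as an accuracy fraction and an average response time. The EvaluationResult type and function names are hypothetical, and user satisfaction is left out because it can only be measured once real users are involved.

```python
from dataclasses import dataclass

# Thresholds taken from the Phase 1 success metrics.
MIN_ACCURACY = 0.85
MAX_AVG_RESPONSE_SECONDS = 3.0


@dataclass(frozen=True)
class EvaluationResult:
    accuracy: float              # fraction of test cases passed, 0.0 to 1.0
    avg_response_seconds: float  # mean response time across the test set


def gate_failures(result: EvaluationResult) -> list[str]:
    """List every threshold the result misses; an empty list means it passes."""
    failures = []
    if result.accuracy < MIN_ACCURACY:
        failures.append(f"accuracy {result.accuracy:.0%} is below {MIN_ACCURACY:.0%}")
    if result.avg_response_seconds >= MAX_AVG_RESPONSE_SECONDS:
        failures.append(
            f"average response {result.avg_response_seconds:.1f}s "
            f"is not under {MAX_AVG_RESPONSE_SECONDS:.1f}s"
        )
    return failures


def may_leave_dev(result: EvaluationResult) -> bool:
    return not gate_failures(result)


latest = EvaluationResult(accuracy=0.80, avg_response_seconds=2.1)
for reason in gate_failures(latest):
    print(f"Blocked: {reason}")
```

Returning the list of reasons, rather than a bare True or False, gives the team something concrete to fix before the next evaluation run.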

Phase 4: Package

Once tests pass in dev:

Increment the solution version — e.g., from 1.0.0 to 1.1.0
Export as managed solution — this creates the locked, deployable package
Document the release — what changed, which test sets passed, known limitations

Version numbering matters. Kai uses semantic versioning:

Major (2.0.0) — breaking changes, new major features
Minor (1.1.0) — new features, non-breaking
Patch (1.1.1) — bug fixes only
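
The bump rules above are mechanical enough to script. Here is a minimal sketch for the three-part scheme Kai uses. The function name bump_version is an assumption for illustration, not part of any tool.

```python
def bump_version(version: str, part: str) -> str:
    """Increment one part of a MAJOR.MINOR.PATCH version and reset the lower parts."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


print(bump_version("1.0.0", "minor"))  # 1.1.0
print(bump_version("1.1.0", "patch"))  # 1.1.1
print(bump_version("1.1.1", "major"))  # 2.0.0
```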

Question

What is the difference between exporting as managed vs. unmanaged, and when do you use each?

Answer

Managed exports create locked packages where components cannot be edited in the target — used for UAT and production deployments. Unmanaged exports create editable packages — used only when another developer needs to continue building in a different dev environment. Always export as managed for deployment stages.

Phase 5: Deploy

Jordan (Power Platform admin) runs the Pipeline:

Dev → UAT — Pipeline auto-deploys the managed solution to UAT
Post-deployment flow runs: triggers Mira’s test sets against the UAT agent, validates scores
If tests pass, Priya receives an approval request for production
UAT → Production — After Priya approves, Pipeline deploys to production
Jordan sets environment variable current values in production (Pacific Mutual’s company name, API endpoint, support email)
Post-deployment validation — quick smoke test to confirm the agent responds correctly in the production environment

The pipeline handles the heavy lifting. Jordan’s manual work is limited to setting environment variable values and verifying the smoke test.
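
The order of gates in this phase can be modeled as a short function. This is a simulation of the flow described above, not how the pipeline itself is configured. The two callbacks stand in for the post-deployment test run and Priya's approval, and all names are hypothetical.

```python
from typing import Callable


def run_promotion(
    uat_tests_pass: Callable[[], bool],
    production_approved: Callable[[], bool],
) -> list[str]:
    """Walk the dev -> UAT -> production path and return a log of each step."""
    log = ["deployed to UAT"]
    if not uat_tests_pass():
        log.append("stopped: UAT test sets failed")
        return log
    log.append("UAT test sets passed, approval requested")
    # Approval is only requested once the tests have passed.
    if not production_approved():
        log.append("stopped: production approval declined")
        return log
    log.append("deployed to production")
    log.append("manual: set environment variable values, run smoke test")
    return log


for step in run_promotion(uat_tests_pass=lambda: True, production_approved=lambda: True):
    print(step)
```

Note that a failed test run stops the flow before anyone is asked to approve, which is the behavior the phase describes.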

Phase 6: Monitor

Deployment is not the finish line. In production, the team monitors:

Conversation analytics — Are users engaging? Where do they drop off?
Application Insights — Error rates, response latency, token consumption
User feedback — Thumbs up/down, escalation rates, support tickets about the agent
Knowledge freshness — Are the claims procedures and routing rules still current?

Kai sets up monthly review meetings with Pacific Mutual’s claims team lead. They review analytics, discuss new requirements, and prioritize improvements.
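
To make the monthly review concrete, it helps to boil the raw conversations down to a few rates. The sketch below assumes each conversation has been exported as a record with three fields. The Conversation type and its field names are hypothetical, so map them to whatever your analytics export actually provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Conversation:
    completed: bool            # False means the user dropped off
    escalated: bool            # True if handed off to a human
    thumbs_up: Optional[bool]  # None when the user left no rating


def summarize(conversations: list[Conversation]) -> dict[str, Optional[float]]:
    """Compute drop-off, escalation, and satisfaction rates for a review period."""
    total = len(conversations)
    if total == 0:
        return {"drop_off_rate": None, "escalation_rate": None, "satisfaction": None}
    rated = [c for c in conversations if c.thumbs_up is not None]
    return {
        "drop_off_rate": sum(not c.completed for c in conversations) / total,
        "escalation_rate": sum(c.escalated for c in conversations) / total,
        # Satisfaction only counts conversations where the user left a rating.
        "satisfaction": sum(c.thumbs_up for c in rated) / len(rated) if rated else None,
    }


sample = [
    Conversation(completed=True, escalated=False, thumbs_up=True),
    Conversation(completed=True, escalated=True, thumbs_up=None),
    Conversation(completed=False, escalated=False, thumbs_up=False),
    Conversation(completed=True, escalated=False, thumbs_up=True),
]
print(summarize(sample))
```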

Question

What four areas should you monitor after deploying an agent to production?

Answer

1) Conversation analytics — user engagement and drop-off points. 2) Application Insights — error rates, latency, token usage. 3) User feedback — satisfaction ratings and escalation rates. 4) Knowledge freshness — whether uploaded data sources are still current and accurate.

Phase 7: Iterate

Production feedback flows back into development:

New failure patterns discovered in production → add to regression test set
Users ask questions the agent can’t answer → add knowledge sources or topics
Business rules change → update instructions and environment variables
Performance issues → optimize topics, reduce unnecessary API calls

Each iteration follows the same cycle: build in dev → test → package → deploy through the pipeline. The lifecycle is a loop, not a line.
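
The first item in that list, feeding production failures into the regression test set, might look like the sketch below. It assumes each test case is a dictionary with a "question" key and skips failures already covered. The data shape and the sample questions are assumptions for illustration.

```python
def add_to_regression_set(regression_set: list[dict], failures: list[dict]) -> list[dict]:
    """Return a new regression set extended with failures not already covered."""
    seen = {case["question"].strip().lower() for case in regression_set}
    updated = list(regression_set)
    for failure in failures:
        key = failure["question"].strip().lower()
        if key not in seen:
            seen.add(key)
            updated.append(failure)
    return updated


regression_set = [{"question": "How do I file a claim?"}]
production_failures = [
    {"question": "Which adjuster team handles auto claims?"},
    {"question": "which adjuster team handles auto claims? "},  # same question, different casing
    {"question": "How do I file a claim?"},                     # already covered
]

regression_set = add_to_regression_set(regression_set, production_failures)
print(len(regression_set))  # 2
```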

AgentForge’s first iteration

Two weeks after launch, Pacific Mutual’s team reports that policyholders are asking about flood coverage — a topic the agent wasn’t designed for. The claims team provides a flood coverage FAQ document.

Kai adds the knowledge source in dev, Mira creates test cases for flood queries, the agent passes evaluation, Jordan deploys through the pipeline, and flood coverage is live within three days. No downtime, no manual file transfers, no broken deployments.

Environment Strategy

The three-environment model (Dev, UAT, Production) is the minimum for enterprise deployments. Some organizations add more:

Environment	Purpose	Who Uses It	Solution Type
Development	Build and unit test agents	Developers, AI engineers	Unmanaged
UAT / Staging	Stakeholder testing, automated test suites, approval gate	QA, business stakeholders	Managed
Production	Live end users	Policyholders, employees, customers	Managed
Sandbox (optional)	Experimentation, proof of concepts, training	Anyone exploring ideas	Unmanaged
Hotfix (optional)	Emergency production fixes that bypass normal UAT cycle	Senior developers with approval	Managed

The key rule: managed solutions flow forward (dev → UAT → prod). Never edit components directly in UAT or production. All changes originate in dev and travel through the pipeline.
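
If your team scripts any part of its release checklist, the table's Solution Type column can double as a guard. A small sketch, assuming the three core environments and a hypothetical validate_import helper:

```python
# Expected solution type per environment, taken from the table above.
EXPECTED_SOLUTION_TYPE = {
    "Development": "Unmanaged",
    "UAT": "Managed",
    "Production": "Managed",
}


def validate_import(environment: str, solution_type: str) -> None:
    """Raise ValueError if the solution type does not suit the target environment."""
    expected = EXPECTED_SOLUTION_TYPE.get(environment)
    if expected is None:
        raise ValueError(f"Unknown environment: {environment}")
    if solution_type != expected:
        raise ValueError(
            f"{environment} expects a {expected} solution, got {solution_type}"
        )


validate_import("UAT", "Managed")  # passes silently
```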

Question

Why should you never edit agent components directly in UAT or production?

Answer

Direct edits bypass the pipeline, test sets, and approval gates. They create drift between environments — production no longer matches what was tested in UAT. When the next pipeline deployment runs, the manual changes may be overwritten. All changes should originate in dev and flow forward through the pipeline.

Rollback Strategies

Things go wrong. Knowing how to roll back is as important as knowing how to deploy.

Pipeline-based rollback: The pipeline tracks deployment history. You can redeploy a previous solution version to roll back to a known-good state.

Version-based rollback: Because you increment the solution version before each export, you can re-import a specific older version.

Environment variable isolation: If the issue is configuration (not code), changing environment variable values can resolve problems without redeploying the solution.
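
Whichever rollback route you take, the first decision is which version to go back to. The sketch below picks the most recent known-good release from a deployment history. It assumes you keep your own record of whether each release was healthy. The Deployment type and its fields are hypothetical, not something the pipeline exposes in this form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    version: str
    healthy: bool  # whether this release was judged good in production


def rollback_target(history: list[Deployment]) -> Optional[Deployment]:
    """Pick the newest healthy release before the current one.

    `history` is ordered oldest to newest; the last entry is the current release.
    """
    for deployment in reversed(history[:-1]):
        if deployment.healthy:
            return deployment
    return None


history = [
    Deployment("1.0.0", healthy=True),
    Deployment("1.1.0", healthy=True),
    Deployment("1.2.0", healthy=False),  # current release, misbehaving
]
target = rollback_target(history)
print(target.version if target else "no known-good version")
```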

Priya’s rule: “If production breaks, roll back first, investigate second. A broken agent in production costs more per minute than any investigation saves.”

Knowledge Check

Pacific Mutual's claims team reports that policyholders are asking about flood coverage, which the agent doesn't handle. What is the correct lifecycle response?

After a pipeline deployment to production, the agent starts giving incorrect responses. What should Jordan do FIRST?

Which solution type should be deployed to UAT and production environments?

Key Takeaways

The agent lifecycle has seven phases: plan → build → test → package → deploy → monitor → iterate
Every component must be in the solution during build — stray components cause deployment failures
Testing happens in dev first — “If it doesn’t pass in dev, it doesn’t leave dev”
Managed solutions flow forward (dev → UAT → prod). Never edit directly in deployment environments
Rollback first, investigate second when production issues arise
The lifecycle is a loop, not a line — production feedback drives continuous improvement