Agent Lifecycle: From Dev to Production
The complete ALM journey — from building in dev, through testing and staging, to production deployment and ongoing maintenance.
Putting It All Together
Over the last four modules, you’ve learned the individual pieces: test sets, evaluation methods, solutions, environment variables, and Pipelines. This module synthesizes everything into a complete lifecycle — the end-to-end journey an agent takes from first idea to production deployment and beyond.
🏢 AgentForge scenario: Priya’s team has won a contract with Pacific Mutual Insurance. 🔧 Kai from TechBridge Consulting is the implementation partner. Together, they need to deliver a claims triage agent — from concept to production — following enterprise ALM practices.
Let’s walk through every phase.
Phase 1: Plan
Before writing a single topic or instruction, the team defines:
- Business requirements — What should the claims triage agent do? Handle first notice of loss, route claims to the correct adjuster team, and answer policyholder FAQs.
- Success metrics — 85% accuracy on test sets, average response time under 3 seconds, 90% user satisfaction.
- Environment strategy — Three environments: Dev (building), UAT (stakeholder testing), Production (policyholders).
- Data requirements — What knowledge sources are needed? Claims procedures manual, adjuster routing rules, policy FAQ document.
- Security and compliance — What data classification applies? PII handling for policyholder information.
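Writing the success metrics down as data makes them checkable later in the lifecycle. The following is a minimal sketch, assuming the three targets listed above; the class and function names are illustrative and not part of any Power Platform API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessMetrics:
    """Targets from the plan. Field names are hypothetical."""
    min_accuracy: float = 0.85        # 85% accuracy on test sets
    max_avg_response_s: float = 3.0   # average response time under 3 seconds
    min_satisfaction: float = 0.90    # 90% user satisfaction


def unmet_targets(accuracy, avg_response_s, satisfaction,
                  targets=SuccessMetrics()):
    """Return the names of the targets that the measured values miss."""
    misses = []
    if accuracy < targets.min_accuracy:
        misses.append("accuracy")
    if avg_response_s >= targets.max_avg_response_s:
        misses.append("response_time")
    if satisfaction < targets.min_satisfaction:
        misses.append("satisfaction")
    return misses
```

An empty list means every target is met; anything else names what still needs work.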
Kai emphasizes: “The plan phase saves more time than any other. Every hour spent planning saves three hours of rework.”
Phase 2: Build
Development happens in the dev environment inside an unmanaged solution:
- Create the solution with a custom publisher (pacmutual_)
- Build the agent: system instructions, knowledge sources, topics, actions
- Define environment variables for anything that will differ across environments (company name, API endpoints, support email, feature flags)
- Create connection references for external integrations
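An environment variable carries a default value in its definition and can be overridden by a current value in each environment. The sketch below models that lookup in plain Python; the schema names (beyond the pacmutual_ prefix) and the helper are assumptions made for illustration, not a real API.

```python
# Hypothetical definitions: schema name -> default value.
# None means "no default; each environment must set a current value".
DEFINITIONS = {
    "pacmutual_CompanyName": None,
    "pacmutual_ClaimsApiEndpoint": None,
    "pacmutual_SupportEmail": None,
    "pacmutual_EnableFaq": "true",
}


def resolve(definitions, current_values):
    """Current value wins, then the default; report variables with neither."""
    resolved, missing = {}, []
    for name, default in definitions.items():
        value = current_values.get(name, default)
        if value is None:
            missing.append(name)
        else:
            resolved[name] = value
    return resolved, missing
```

Running `resolve` with an environment's current values shows at a glance which variables still need a value before the agent can work there.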
The key discipline during build: everything goes in the solution. If you create a cloud flow outside the solution, it won’t travel with the agent when you deploy. Kai checks the solution’s component list daily — if anything is missing, he adds it before it becomes a deployment surprise.
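Kai's daily check amounts to a set difference between what the agent depends on and what the solution contains. A rough sketch, assuming both lists are maintained or exported by hand; the component names are made up for the example.

```python
def missing_from_solution(expected, in_solution):
    """Return expected components that are not yet in the solution, sorted."""
    return sorted(set(expected) - set(in_solution))


# Hypothetical (component type, name) pairs the agent depends on.
EXPECTED = [
    ("agent", "Claims Triage"),
    ("cloud flow", "Route Claim To Adjuster"),
    ("connection reference", "pacmutual_ClaimsApi"),
    ("environment variable", "pacmutual_SupportEmail"),
]
```

Anything the function returns is a component that would be left behind at deployment time.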
Phase 3: Test
With the agent built, Mira (QA lead) takes over:
- Create test sets — baseline (auto-generated from knowledge), edge cases (manual), regression (from pilot conversations)
- Run evaluations — accuracy, grounding quality, topic matching, response quality
- Review results — identify failure patterns, classify root causes
- Iterate — fix, re-evaluate, repeat until thresholds are met
- Document — record test results, known limitations, and test set composition for audit
Testing happens entirely in the dev environment. There’s no point deploying to UAT if the agent can’t pass its own test suite where it was built.
Mira’s rule: “If it doesn’t pass in dev, it doesn’t leave dev.”
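Mira's rule can be expressed as a gate over evaluation results. The sketch below assumes each result is a record with a test set name, a pass flag, and a root-cause label for failures, and it applies the 85% threshold to every test set separately; both the record shape and the per-set threshold are assumptions.

```python
from collections import Counter, defaultdict


def pass_rates(results):
    """Pass rate per test set (baseline, edge cases, regression, ...)."""
    totals, passes = defaultdict(int), defaultdict(int)
    for r in results:
        totals[r["test_set"]] += 1
        passes[r["test_set"]] += 1 if r["passed"] else 0
    return {name: passes[name] / totals[name] for name in totals}


def failure_patterns(results):
    """Count failed cases by root cause to show where to fix first."""
    return Counter(r["root_cause"] for r in results if not r["passed"])


def may_leave_dev(results, threshold=0.85):
    """True only if every test set meets the threshold."""
    return all(rate >= threshold for rate in pass_rates(results).values())
```

The failure counts feed the "review results" step, and the gate decides whether another fix-and-re-evaluate round is needed.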
Phase 4: Package
Once tests pass in dev:
- Increment the solution version — e.g., from 1.0.0 to 1.1.0
- Export as managed solution — this creates the locked, deployable package
- Document the release — what changed, which test sets passed, known limitations
Version numbering matters. Kai uses semantic versioning:
- Major (2.0.0) — breaking changes, new major features
- Minor (1.1.0) — new features, non-breaking
- Patch (1.1.1) — bug fixes only
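The versioning rules above can be sketched as a small helper. This follows the three-part scheme used in this module; the function name is illustrative.

```python
def bump(version: str, part: str) -> str:
    """Increment one part of a MAJOR.MINOR.PATCH version, resetting lower parts."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")
```

For example, `bump("1.0.0", "minor")` gives `"1.1.0"`, matching the increment in the Package step.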
Phase 5: Deploy
Jordan (Power Platform admin) runs the Pipeline:
- Dev → UAT — Pipeline auto-deploys the managed solution to UAT
- Post-deployment flow runs: triggers Mira’s test sets against the UAT agent, validates scores
- If tests pass, Priya receives an approval request for production
- UAT → Production — After Priya approves, Pipeline deploys to production
- Jordan sets environment variable current values in production (Pacific Mutual’s company name, API endpoint, support email)
- Post-deployment validation — quick smoke test to confirm the agent responds correctly in the production environment
The pipeline handles the heavy lifting. Jordan’s manual work is limited to setting environment variable values and verifying the smoke test.
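The order of gates in this phase can be sketched as a short function. It is a plain-Python model of the steps listed above, not the Pipelines API; the function name, parameters, and log messages are all assumptions for illustration.

```python
def run_release(uat_tests_pass: bool, approved: bool, smoke_test_ok: bool) -> list:
    """Walk the deploy phase in order and stop at the first failed gate."""
    log = [
        "deploy managed solution: dev -> UAT",
        "run test sets against UAT agent",
    ]
    if not uat_tests_pass:
        return log + ["stop: UAT scores below threshold"]

    log.append("request production approval")
    if not approved:
        return log + ["stop: approval rejected"]

    log += [
        "deploy managed solution: UAT -> production",
        "set environment variable current values",
        "run smoke test",
    ]
    log.append("release complete" if smoke_test_ok
               else "stop: smoke test failed, roll back")
    return log
```

Each gate protects the next environment: nothing reaches production without passing UAT tests and an explicit approval.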
Phase 6: Monitor
Deployment is not the finish line. In production, the team monitors:
- Conversation analytics — Are users engaging? Where do they drop off?
- Application Insights — Error rates, response latency, token consumption
- User feedback — Thumbs up/down, escalation rates, support tickets about the agent
- Knowledge freshness — Are the claims procedures and routing rules still current?
Kai sets up monthly review meetings with Pacific Mutual’s claims team lead. They review analytics, discuss new requirements, and prioritize improvements.
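For the monthly review, the user feedback signals can be reduced to a couple of rates. This sketch assumes conversation records have already been exported as dictionaries with a `thumbs` field ("up", "down", or None when unrated) and an `escalated` flag; that record shape is hypothetical.

```python
def feedback_summary(conversations):
    """Thumbs-up rate among rated conversations and overall escalation rate."""
    total = len(conversations)
    rated = [c["thumbs"] for c in conversations if c["thumbs"] is not None]
    escalated = sum(1 for c in conversations if c["escalated"])
    return {
        "thumbs_up_rate": rated.count("up") / len(rated) if rated else None,
        "escalation_rate": escalated / total if total else None,
    }
```

A rate of None signals that there is not enough data yet, which is different from a rate of zero.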
Phase 7: Iterate
Production feedback flows back into development:
- New failure patterns discovered in production → add to regression test set
- Users ask questions the agent can’t answer → add knowledge sources or topics
- Business rules change → update instructions and environment variables
- Performance issues → optimize topics, reduce unnecessary API calls
Each iteration follows the same cycle: build in dev → test → package → deploy through the pipeline. The lifecycle is a loop, not a line.
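The first item in the list, turning production failures into regression cases, might look like the sketch below. It assumes test cases are dictionaries with an `utterance` and an `expected` answer, and it skips duplicates by comparing utterances case-insensitively; the field names are assumptions.

```python
def add_to_regression(regression_set, production_failures):
    """Return a new regression set extended with unseen production failures."""
    seen = {case["utterance"].strip().lower() for case in regression_set}
    merged = list(regression_set)
    for failure in production_failures:
        key = failure["utterance"].strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append({
                "utterance": failure["utterance"],
                "expected": failure["expected"],
            })
    return merged
```

The original set is left unchanged, so the previous regression set can still be kept with the release it was used for.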
AgentForge’s first iteration
Two weeks after launch, Pacific Mutual’s team reports that policyholders are asking about flood coverage — a topic the agent wasn’t designed for. The claims team provides a flood coverage FAQ document.
Kai adds the knowledge source in dev, Mira creates test cases for flood queries, the agent passes evaluation, Jordan deploys through the pipeline, and flood coverage is live within three days. No downtime, no manual file transfers, no broken deployments.
Environment Strategy
The three-environment model (Dev, UAT, Production) is the minimum for enterprise deployments. Some organizations add more:
| Environment | Purpose | Who Uses It | Solution Type |
|---|---|---|---|
| Development | Build and unit test agents | Developers, AI engineers | Unmanaged |
| UAT / Staging | Stakeholder testing, automated test suites, approval gate | QA, business stakeholders | Managed |
| Production | Live end users | Policyholders, employees, customers | Managed |
| Sandbox (optional) | Experimentation, proof of concepts, training | Anyone exploring ideas | Unmanaged |
| Hotfix (optional) | Emergency production fixes that bypass normal UAT cycle | Senior developers with approval | Managed |
The key rule: managed solutions flow forward (dev → UAT → prod). Never edit components directly in UAT or production. All changes originate in dev and travel through the pipeline.
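The table and the forward-flow rule can be captured as two small checks. This is a sketch of the policy as described here, with hypothetical function names; it does not call any platform API.

```python
# Expected solution type per environment, taken from the table above.
EXPECTED_SOLUTION_TYPE = {
    "Development": "Unmanaged",
    "UAT / Staging": "Managed",
    "Production": "Managed",
    "Sandbox": "Unmanaged",
    "Hotfix": "Managed",
}

# The promotion path for the three core environments.
FORWARD_PATH = ["Development", "UAT / Staging", "Production"]


def import_allowed(environment, solution_type):
    """True if the solution type matches what the environment expects."""
    return EXPECTED_SOLUTION_TYPE.get(environment) == solution_type


def is_forward_step(source, target):
    """True only for a promotion to the next environment on the path."""
    if source not in FORWARD_PATH or target not in FORWARD_PATH:
        return False
    return FORWARD_PATH.index(target) - FORWARD_PATH.index(source) == 1
```

Skipping UAT or moving a solution backwards both fail the `is_forward_step` check.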
Rollback Strategies
Things go wrong. Knowing how to roll back is as important as knowing how to deploy.
Pipeline-based rollback: The pipeline tracks deployment history. You can redeploy a previous solution version to roll back to a known-good state.
Version-based rollback: Because you increment the solution version before each export, you can re-import a specific older version.
Environment variable isolation: If the issue is configuration (not code), changing environment variable values can resolve problems without redeploying the solution.
Priya’s rule: “If production breaks, roll back first, investigate second. A broken agent in production costs more per minute than any investigation saves.”
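Choosing what to roll back to is a lookup over deployment history. The sketch below assumes the history is available as a list of (version, healthy) pairs in deployment order, with the broken release last; how that history is obtained, and the `healthy` flag itself, are assumptions.

```python
def rollback_target(history):
    """Return the most recent healthy version deployed before the current one."""
    previous = history[:-1]  # everything before the current (broken) release
    for version, healthy in reversed(previous):
        if healthy:
            return version
    return None  # no known-good version to fall back to
```

Given `[("1.0.0", True), ("1.1.0", True), ("1.2.0", False)]`, the function returns `"1.1.0"`, the last known-good state.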
Pacific Mutual's claims team reports that policyholders are asking about flood coverage, which the agent doesn't handle. What is the correct lifecycle response?
After a pipeline deployment to production, the agent starts giving incorrect responses. What should Jordan do FIRST?
Which solution type should be deployed to UAT and production environments?
Key Takeaways
- The agent lifecycle has seven phases: plan → build → test → package → deploy → monitor → iterate
- Every component must be in the solution during build — stray components cause deployment failures
- Testing happens in dev first — “If it doesn’t pass in dev, it doesn’t leave dev”
- Managed solutions flow forward (dev → UAT → prod). Never edit directly in deployment environments
- Rollback first, investigate second when production issues arise
- The lifecycle is a loop, not a line — production feedback drives continuous improvement