AI Cost Drivers and ROI: Tokens, Pricing, and Business Cases
Every AI interaction has a cost. Understanding tokens, pricing models, and how to build a compelling ROI case is essential for any leader making AI investment decisions.
What drives the cost of AI?
Think of AI costs like a phone plan. Every text you send (every “token”) costs something.
When you use generative AI, every word you send AND every word you receive gets broken into small pieces called tokens. Each token costs a tiny amount — fractions of a cent. But those fractions add up fast when thousands of employees use AI hundreds of times a day.
The big cost decisions for leaders: How many people use it? Which model do they use (bigger = more expensive)? How often? And can you measure the value it creates — time saved, revenue gained, errors avoided?
Understanding tokens
Tokens are the currency of generative AI. Every interaction is measured in tokens.
| Concept | Explanation | Example |
|---|---|---|
| What is a token? | A chunk of text — roughly 3/4 of a word in English | “Microsoft” = 1 token. “Copilot” = 2 tokens. A 500-word email is about 670 tokens. |
| Input tokens | The prompt you send to the AI (including any context or documents) | “Summarise this 10-page report” + the report text = thousands of input tokens |
| Output tokens | The response the AI generates | A 200-word summary = about 270 output tokens |
| Context window | The maximum number of tokens the model can process in a single interaction | GPT-4o can handle 128,000 tokens — about 96,000 words |
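The rough conversions in the table can be turned into a back-of-envelope estimator. This is a sketch using the ~3/4-word-per-token heuristic from the table; the function names and the prices passed in are hypothetical, not real rates.

```python
# Back-of-envelope token and cost estimator.
# Heuristic from the table above: ~4 tokens for every 3 English words.

def estimate_tokens(word_count: int) -> int:
    """Approximate token count for English text."""
    return round(word_count * 4 / 3)

def estimate_cost(input_words: int, output_words: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimated dollar cost of one interaction.

    Prices are illustrative placeholders: input and output tokens
    are usually billed at different per-1,000-token rates.
    """
    cost_in = estimate_tokens(input_words) / 1000 * price_in_per_1k
    cost_out = estimate_tokens(output_words) / 1000 * price_out_per_1k
    return cost_in + cost_out

# A 500-word email is about 667 tokens:
print(estimate_tokens(500))  # -> 667
```

Per interaction the cost is a fraction of a cent; the point of modelling it is to see what happens when you multiply by users, requests per day, and working days.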
Why token costs matter for leaders
As a leader, you don’t need to count tokens yourself. But you need to understand the cost levers:
- Model choice is the biggest lever. GPT-4o costs roughly 10x more per token than GPT-4o mini. Using the cheapest model that meets quality requirements saves enormously at scale.
- Prompt design matters. Bloated prompts with unnecessary context burn input tokens. Well-crafted prompts are cheaper AND produce better results.
- Output length is controllable. Setting maximum response lengths prevents runaway token costs.
- RAG efficiency affects cost. Retrieving and sending irrelevant documents to the model wastes input tokens.
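The "model choice is the biggest lever" point is easy to see with a quick sketch. The 10x price ratio mirrors the text; the absolute prices, model names, and usage figures below are made-up placeholders.

```python
# Illustrative only: comparing the monthly cost of two models at scale.
# Prices and model names are hypothetical; the 10x ratio is the point.

PRICE_PER_1K_TOKENS = {
    "large-model": 0.005,   # hypothetical rate, 10x the small model
    "small-model": 0.0005,  # hypothetical rate
}

def monthly_cost(model: str, users: int, requests_per_user_per_day: int,
                 tokens_per_request: int, workdays: int = 22) -> float:
    """Total monthly spend for one model under the assumed usage pattern."""
    total_tokens = users * requests_per_user_per_day * tokens_per_request * workdays
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS[model]

# 1,000 employees, 50 requests/day, 2,000 tokens per request:
for model in PRICE_PER_1K_TOKENS:
    print(model, round(monthly_cost(model, 1000, 50, 2000)))
```

Under these assumptions the same workload costs roughly $11,000/month on the large model versus $1,100/month on the small one, which is why "cheapest model that meets quality requirements" is the first question to ask.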
Pricing models for AI services
Different AI products charge in different ways:
| Feature | How you pay | Predictability | Best for |
|---|---|---|---|
| Per-user licence (Copilot) | $30/user/month flat fee | Highly predictable — fixed cost regardless of usage | Knowledge worker productivity across the organisation |
| Pay-as-you-go (Azure OpenAI) | Per 1,000 tokens consumed | Variable — depends on usage volume | Custom applications where usage varies or is hard to predict |
| Commitment tier (Azure AI) | Pre-purchase a volume of tokens at a discount | Predictable — committed spend with lower unit cost | High-volume applications with steady, predictable usage patterns |
| Copilot Studio (consumption) | Metered usage — consumption-based billing | Semi-predictable — pay based on agent activity | Custom agents where usage is moderate and grows over time |
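Choosing between a flat per-user licence and pay-as-you-go billing is ultimately a break-even calculation. A minimal sketch, assuming a $30/user/month flat fee and a hypothetical blended token price; the monthly token volume is an invented example.

```python
# Break-even sketch: flat per-user licence vs pay-as-you-go token billing.
# The token price and volume are hypothetical assumptions for illustration.

def flat_cost(users: int, fee_per_user: float = 30.0) -> float:
    """Monthly cost of a per-user licence."""
    return users * fee_per_user

def payg_cost(monthly_tokens: int, price_per_1k: float = 0.002) -> float:
    """Monthly cost of pay-as-you-go billing at a blended per-1k rate."""
    return monthly_tokens / 1000 * price_per_1k

users = 200
tokens = 500_000_000  # assumed 500M tokens/month across the organisation

print(flat_cost(users))   # flat licence spend
print(payg_cost(tokens))  # usage-based spend for the same month
```

If usage is low or spiky, pay-as-you-go wins; once per-user consumption is high and steady, the flat fee (or a commitment tier) becomes the cheaper and more predictable option.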
Building the ROI case
Every AI investment needs a business case. Here’s a framework:
The ROI equation
ROI = (Value created - Cost of AI) / Cost of AI x 100%
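The equation above is one line of code. A minimal helper, with an invented example to show the arithmetic:

```python
def roi_percent(value_created: float, cost: float) -> float:
    """ROI = (Value created - Cost of AI) / Cost of AI x 100%."""
    return (value_created - cost) / cost * 100

# e.g. $150,000 of value on a $50,000 spend:
print(roi_percent(150_000, 50_000))  # -> 200.0
```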
Where the value comes from
| Value Category | How to Measure | Example |
|---|---|---|
| Time savings | Hours saved x average hourly cost | Copilot saves each consultant 5 hours/week = 200 staff x 5 hrs x $75/hr = $75,000/week |
| Quality improvement | Error reduction, rework avoided | AI-drafted proposals have 40% fewer review cycles |
| Revenue acceleration | Faster deal cycles, more proposals sent | 20% more proposals submitted per quarter → $X in additional revenue |
| Employee satisfaction | Reduced tedious work, better engagement | Less time on admin → more time on high-value client work |
| Innovation | New products/services enabled by AI | AI-powered research portal (new service line for clients) |
Scenario: Elena builds the Copilot business case
Elena wants to deploy Microsoft 365 Copilot to all 200 consultants at Meridian.
Costs:
- Licences: 200 users x $30/month = $6,000/month = $72,000/year
- Training and change management: $15,000 (one-time)
- Total first-year cost: $87,000
Value (realistic estimates assuming 80% adoption):
- Time saved: 3 hours/user/week x 160 active users x 48 weeks x $75/hr = $1,728,000 theoretical
- Capturable value (~60% of theoretical): **$1,037,000**. Not all saved time converts to revenue or cost reduction.
- Faster proposal turnaround: 10% more proposals = $200,000 additional revenue
- Total first-year capturable value: ~$1,237,000
ROI: ($1,237,000 - $87,000) / $87,000 x 100% ≈ 1,322%
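Elena's figures can be reproduced end to end, which also makes the assumptions explicit and easy to challenge. This sketch uses only the numbers stated in the scenario above.

```python
# Reproducing Elena's first-year business case from the scenario above.

users = 200
adoption = 0.80          # assumed active-usage rate
hours_saved = 3          # per active user per week
weeks = 48
hourly_cost = 75
capture_rate = 0.60      # share of saved time that becomes real value

active_users = int(users * adoption)                        # 160
theoretical = hours_saved * active_users * weeks * hourly_cost
capturable = theoretical * capture_rate
total_value = capturable + 200_000                          # + proposal revenue
cost = users * 30 * 12 + 15_000                             # licences + training

roi = (total_value - cost) / cost * 100
print(round(theoretical))  # -> 1728000
print(round(roi))          # -> 1322
```

Writing the case this way means the board can ask "what if adoption is 60%?" and get an answer by changing one variable rather than rebuilding a spreadsheet.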
Why theoretical vs capturable value matters
Not every hour “saved” by AI converts to measurable business value:
- Not everyone adopts equally. Even well-run rollouts see 70-85% active usage, not 100%. Budget for 80% as a realistic ceiling.
- Saved time ≠ revenue. A consultant saving 3 hours/week might spend that time on higher-value work — or on coffee breaks. Only a portion (~60%) of theoretical time savings typically converts to capturable value (more billable hours, fewer contractors, avoided hires).
- Pilot before scaling. Run a 90-day pilot with 50 users to measure ACTUAL time savings before projecting to the full organisation.
- Sensitivity analysis matters. If adoption is 60% instead of 80%, or savings are 2 hours instead of 3, ROI drops to ~500-700%. Still strong — but very different from 1,300%.
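The sensitivity point in the last bullet can be sketched as a small grid. Same assumptions as the Elena scenario ($87K cost, 60% capture rate); the $200K proposal revenue is omitted here for simplicity, so these figures are slightly conservative.

```python
# Sensitivity sketch: how ROI moves as adoption and hours-saved vary.
# Assumptions match the Elena scenario above; proposal revenue omitted.

def roi(adoption: float, hours: float, users: int = 200, weeks: int = 48,
        rate: float = 75, capture: float = 0.60, cost: float = 87_000) -> float:
    value = hours * int(users * adoption) * weeks * rate * capture
    return (value - cost) / cost * 100

for adoption in (0.6, 0.8):
    for hours in (2, 3):
        print(f"adoption={adoption:.0%} hours={hours}: {roi(adoption, hours):.0f}%")
```

At 60% adoption and 2 hours/week the ROI is roughly 500%, not 1,300% — still a strong case, but presenting the range rather than the single best-case number is what makes the analysis credible.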
The exam rewards answers that show awareness of adoption risk. A perfect ROI model with 100% adoption assumptions is a red flag, not a strength.
Exam tip: ROI questions always include hidden costs
The exam likes to test whether you consider ALL costs, not just licence fees:
- Training costs — users need to learn how to prompt effectively
- Change management — adoption doesn’t happen automatically
- Infrastructure costs — for custom solutions (compute, storage, networking)
- Ongoing maintenance — model updates, data refreshes, monitoring
- Opportunity cost — what else could you invest this budget in?
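One way to internalise the hidden-costs point is to model first-year cost as a line-item dictionary rather than a single licence figure. All numbers below beyond the licence and training figures from the scenario are hypothetical placeholders.

```python
# First-year total-cost sketch. Licences are only one line item;
# the other figures are hypothetical examples of hidden costs.

first_year_costs = {
    "licences": 72_000,           # from the Elena scenario
    "training": 15_000,           # from the Elena scenario
    "change_management": 10_000,  # assumed
    "infrastructure": 8_000,      # assumed (custom solutions only)
    "maintenance": 5_000,         # assumed (monitoring, data refreshes)
}

total = sum(first_year_costs.values())
print(total)  # -> 110000
```

Note how the hidden items add roughly a third on top of the licence fee in this example; an ROI case built on licences alone will overstate the return.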
Also watch for the trap: high ROI percentage means nothing if the absolute value is small or the risk is high.
Common cost pitfalls
| Pitfall | What Goes Wrong | How to Avoid It |
|---|---|---|
| Over-provisioning licences | Buying Copilot for everyone when only 60% use it regularly | Start with a pilot group, measure adoption, then expand |
| Wrong model for the job | Using GPT-4o for simple tasks that GPT-4o mini handles fine | Match model capability to task complexity |
| Ignoring training | Deploying AI without teaching users how to prompt | Budget 10-15% for training and adoption support |
| Unmeasured value | Can’t justify renewal because nobody tracked the impact | Define success metrics BEFORE deployment |
| Token sprawl | Custom app sends entire documents when a summary would suffice | Design efficient prompts and retrieval strategies |
Knowledge check
Tomás is deploying Copilot to 5,000 manufacturing workers. The CFO asks about predictable budgeting. Which pricing model should Tomás recommend?
Elena's firm calculated Copilot would save ~$1.2M (capturable value) and cost $87K in year one. The board asks what's missing from this analysis. What should Elena add?
Next up: Challenges of Generative AI — fabrications, bias, reliability, and what leaders must know before deploying AI.