AI Cost Drivers and ROI: Tokens, Pricing, and Business Cases
Every AI interaction has a cost. Understanding tokens, pricing models, and how to build a compelling ROI case is essential for any leader making AI investment decisions.
What drives the cost of AI?
Think of AI costs like a phone plan. Every text you send (every “token”) costs something.
When you use generative AI, every word you send AND every word you receive gets broken into small pieces called tokens. Each token costs a tiny amount — fractions of a cent. But those fractions add up fast when thousands of employees use AI hundreds of times a day.
The big cost decisions for leaders: How many people use it? Which model do they use (bigger = more expensive)? How often? And can you measure the value it creates — time saved, revenue gained, errors avoided?
Understanding tokens
Tokens are the currency of generative AI. Every interaction is measured in tokens.
| Concept | Explanation | Example |
|---|---|---|
| What is a token? | A chunk of text — roughly 3/4 of a word in English | “Microsoft” = 1 token. “Copilot” = 2 tokens. A 500-word email is about 670 tokens. |
| Input tokens | The prompt you send to the AI (including any context or documents) | “Summarise this 10-page report” + the report text = thousands of input tokens |
| Output tokens | The response the AI generates | A 200-word summary = about 270 output tokens |
| Context window | The maximum number of tokens the model can process in a single interaction | GPT-4o can handle 128,000 tokens — about 96,000 words |
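The rough conversions in the table can be turned into a back-of-envelope estimator. This is a sketch using the ~3/4-word-per-token heuristic from the table; the function names and the prices passed in are hypothetical, not real rates.

```python
# Back-of-envelope token and cost estimator.
# Heuristic from the table above: ~4 tokens for every 3 English words.

def estimate_tokens(word_count: int) -> int:
    """Approximate token count for English text."""
    return round(word_count * 4 / 3)

def estimate_cost(input_words: int, output_words: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimated dollar cost of one interaction.

    Prices are illustrative placeholders: input and output tokens
    are usually billed at different per-1,000-token rates.
    """
    cost_in = estimate_tokens(input_words) / 1000 * price_in_per_1k
    cost_out = estimate_tokens(output_words) / 1000 * price_out_per_1k
    return cost_in + cost_out

# A 500-word email is about 667 tokens:
print(estimate_tokens(500))  # -> 667
```

Per interaction the cost is a fraction of a cent; the point of modelling it is to see what happens when you multiply by users, requests per day, and working days.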
Why token costs matter for leaders
As a leader, you don’t need to count tokens yourself. But you need to understand the cost levers:
- Model choice is the biggest lever. GPT-4o costs roughly 10x more per token than GPT-4o mini. Using the cheapest model that meets quality requirements saves enormously at scale.
- Prompt design matters. Bloated prompts with unnecessary context burn input tokens. Well-crafted prompts are cheaper AND produce better results.
- Output length is controllable. Setting maximum response lengths prevents runaway token costs.
- RAG efficiency affects cost. Retrieving and sending irrelevant documents to the model wastes input tokens.
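The "model choice is the biggest lever" point is easy to see with a quick sketch. The 10x price ratio mirrors the text; the absolute prices, model names, and usage figures below are made-up placeholders.

```python
# Illustrative only: comparing the monthly cost of two models at scale.
# Prices and model names are hypothetical; the 10x ratio is the point.

PRICE_PER_1K_TOKENS = {
    "large-model": 0.005,   # hypothetical rate, 10x the small model
    "small-model": 0.0005,  # hypothetical rate
}

def monthly_cost(model: str, users: int, requests_per_user_per_day: int,
                 tokens_per_request: int, workdays: int = 22) -> float:
    """Total monthly spend for one model under the assumed usage pattern."""
    total_tokens = users * requests_per_user_per_day * tokens_per_request * workdays
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS[model]

# 1,000 employees, 50 requests/day, 2,000 tokens per request:
for model in PRICE_PER_1K_TOKENS:
    print(model, round(monthly_cost(model, 1000, 50, 2000)))
```

Under these assumptions the same workload costs roughly $11,000/month on the large model versus $1,100/month on the small one, which is why "cheapest model that meets quality requirements" is the first question to ask.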
Pricing models for AI services
Different AI products charge in different ways:
| Feature | How you pay | Predictability | Best for |
|---|---|---|---|
| Per-user licence (Copilot) | $30/user/month flat fee | Highly predictable — fixed cost regardless of usage | Knowledge worker productivity across the organisation |
| Pay-as-you-go (Azure OpenAI) | Per 1,000 tokens consumed | Variable — depends on usage volume | Custom applications where usage varies or is hard to predict |
| Commitment tier (Azure AI) | Pre-purchase a volume of tokens at a discount | Predictable — committed spend with lower unit cost | High-volume applications with steady, predictable usage patterns |
| Copilot Studio (consumption) | Metered usage — consumption-based billing | Semi-predictable — pay based on agent activity | Custom agents where usage is moderate and grows over time |
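Choosing between a flat per-user licence and pay-as-you-go billing is ultimately a break-even calculation. A minimal sketch, assuming a $30/user/month flat fee and a hypothetical blended token price; the monthly token volume is an invented example.

```python
# Break-even sketch: flat per-user licence vs pay-as-you-go token billing.
# The token price and volume are hypothetical assumptions for illustration.

def flat_cost(users: int, fee_per_user: float = 30.0) -> float:
    """Monthly cost of a per-user licence."""
    return users * fee_per_user

def payg_cost(monthly_tokens: int, price_per_1k: float = 0.002) -> float:
    """Monthly cost of pay-as-you-go billing at a blended per-1k rate."""
    return monthly_tokens / 1000 * price_per_1k

users = 200
tokens = 500_000_000  # assumed 500M tokens/month across the organisation

print(flat_cost(users))   # flat licence spend
print(payg_cost(tokens))  # usage-based spend for the same month
```

If usage is low or spiky, pay-as-you-go wins; once per-user consumption is high and steady, the flat fee (or a commitment tier) becomes the cheaper and more predictable option.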
Building the ROI case
Every AI investment needs a business case. Here’s a framework:
The ROI equation
ROI = (Value created - Cost of AI) / Cost of AI x 100%
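The equation above is one line of code. A minimal helper, with an invented example to show the arithmetic:

```python
def roi_percent(value_created: float, cost: float) -> float:
    """ROI = (Value created - Cost of AI) / Cost of AI x 100%."""
    return (value_created - cost) / cost * 100

# e.g. $150,000 of value on a $50,000 spend:
print(roi_percent(150_000, 50_000))  # -> 200.0
```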
Where the value comes from
| Value Category | How to Measure | Example |
|---|---|---|
| Time savings | Hours saved x average hourly cost | Copilot saves each consultant 5 hours/week = 200 staff x 5 hrs x $75/hr = $75,000/week |
| Quality improvement | Error reduction, rework avoided | AI-drafted proposals have 40% fewer review cycles |
| Revenue acceleration | Faster deal cycles, more proposals sent | 20% more proposals submitted per quarter → $X in additional revenue |
| Employee satisfaction | Reduced tedious work, better engagement | Less time on admin → more time on high-value client work |
| Innovation | New products/services enabled by AI | AI-powered research portal (new service line for clients) |
Scenario: Elena builds the Copilot business case
Elena wants to deploy Microsoft 365 Copilot to all 200 consultants at Meridian.
Costs:
- Licences: 200 users x $30/month = $6,000/month = $72,000/year
- Training and change management: $15,000 (one-time)
- Total first-year cost: $87,000
Value (realistic estimates assuming 80% adoption):
- Time saved: 3 hours/user/week x 160 active users x 48 weeks x $75/hr = $1,728,000 theoretical
- Capturable value (~60% of theoretical): **$1,037,000**. Not all saved time converts to revenue or cost reduction.
- Faster proposal turnaround: 10% more proposals = $200,000 additional revenue
- Total first-year capturable value: ~$1,237,000
ROI: ($1,237,000 - $87,000) / $87,000 x 100% ≈ 1,322%
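Elena's figures can be reproduced end to end, which also makes the assumptions explicit and easy to challenge. This sketch uses only the numbers stated in the scenario above.

```python
# Reproducing Elena's first-year business case from the scenario above.

users = 200
adoption = 0.80          # assumed active-usage rate
hours_saved = 3          # per active user per week
weeks = 48
hourly_cost = 75
capture_rate = 0.60      # share of saved time that becomes real value

active_users = int(users * adoption)                        # 160
theoretical = hours_saved * active_users * weeks * hourly_cost
capturable = theoretical * capture_rate
total_value = capturable + 200_000                          # + proposal revenue
cost = users * 30 * 12 + 15_000                             # licences + training

roi = (total_value - cost) / cost * 100
print(round(theoretical))  # -> 1728000
print(round(roi))          # -> 1322
```

Writing the case this way means the board can ask "what if adoption is 60%?" and get an answer by changing one variable rather than rebuilding a spreadsheet.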
Why theoretical vs capturable value matters
Not every hour “saved” by AI converts to measurable business value:
- Not everyone adopts equally. Even well-run rollouts see 70-85% active usage, not 100%. Budget for 80% as a realistic ceiling.
- Saved time ≠ revenue. A consultant saving 3 hours/week might spend that time on higher-value work — or on coffee breaks. Only a portion (~60%) of theoretical time savings typically converts to capturable value (more billable hours, fewer contractors, avoided hires).
- Pilot before scaling. Run a 90-day pilot with 50 users to measure ACTUAL time savings before projecting to the full organisation.
- Sensitivity analysis matters. If adoption is 60% instead of 80%, or savings are 2 hours instead of 3, ROI drops to ~500-700%. Still strong — but very different from 1,300%.
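The sensitivity point in the last bullet can be sketched as a small grid. Same assumptions as the Elena scenario ($87K cost, 60% capture rate); the $200K proposal revenue is omitted here for simplicity, so these figures are slightly conservative.

```python
# Sensitivity sketch: how ROI moves as adoption and hours-saved vary.
# Assumptions match the Elena scenario above; proposal revenue omitted.

def roi(adoption: float, hours: float, users: int = 200, weeks: int = 48,
        rate: float = 75, capture: float = 0.60, cost: float = 87_000) -> float:
    value = hours * int(users * adoption) * weeks * rate * capture
    return (value - cost) / cost * 100

for adoption in (0.6, 0.8):
    for hours in (2, 3):
        print(f"adoption={adoption:.0%} hours={hours}: {roi(adoption, hours):.0f}%")
```

At 60% adoption and 2 hours/week the ROI is roughly 500%, not 1,300% — still a strong case, but presenting the range rather than the single best-case number is what makes the analysis credible.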
The exam rewards answers that show awareness of adoption risk. A perfect ROI model with 100% adoption assumptions is a red flag, not a strength.
Exam tip: ROI questions always include hidden costs
The exam likes to test whether you consider ALL costs, not just licence fees:
- Training costs — users need to learn how to prompt effectively
- Change management — adoption doesn’t happen automatically
- Infrastructure costs — for custom solutions (compute, storage, networking)
- Ongoing maintenance — model updates, data refreshes, monitoring
- Opportunity cost — what else could you invest this budget in?
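One way to internalise the hidden-costs point is to model first-year cost as a line-item dictionary rather than a single licence figure. All numbers below beyond the licence and training figures from the scenario are hypothetical placeholders.

```python
# First-year total-cost sketch. Licences are only one line item;
# the other figures are hypothetical examples of hidden costs.

first_year_costs = {
    "licences": 72_000,           # from the Elena scenario
    "training": 15_000,           # from the Elena scenario
    "change_management": 10_000,  # assumed
    "infrastructure": 8_000,      # assumed (custom solutions only)
    "maintenance": 5_000,         # assumed (monitoring, data refreshes)
}

total = sum(first_year_costs.values())
print(total)  # -> 110000
```

Note how the hidden items add roughly a third on top of the licence fee in this example; an ROI case built on licences alone will overstate the return.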
Also watch for the trap: high ROI percentage means nothing if the absolute value is small or the risk is high.
Common cost pitfalls
| Pitfall | What Goes Wrong | How to Avoid It |
|---|---|---|
| Over-provisioning licences | Buying Copilot for everyone when only 60% use it regularly | Start with a pilot group, measure adoption, then expand |
| Wrong model for the job | Using GPT-4o for simple tasks that GPT-4o mini handles fine | Match model capability to task complexity |
| Ignoring training | Deploying AI without teaching users how to prompt | Budget 10-15% for training and adoption support |
| Unmeasured value | Can’t justify renewal because nobody tracked the impact | Define success metrics BEFORE deployment |
| Token sprawl | Custom app sends entire documents when a summary would suffice | Design efficient prompts and retrieval strategies |
Knowledge check
Tomás is deploying Copilot to 5,000 manufacturing workers. The CFO asks about predictable budgeting. Which pricing model should Tomás recommend?
Elena's firm calculated Copilot would save ~$1.2M (capturable value) and cost $87K in year one. The board asks what's missing from this analysis. What should Elena add?
Next up: Challenges of Generative AI — fabrications, bias, reliability, and what leaders must know before deploying AI.