Guided

Guided by A Guide to Cloud

AI-901 Study Guide

Domain 1: AI Concepts and Capabilities

  • What is AI? Your First 10 Minutes Free
  • Responsible AI: The Six Principles Free
  • How Generative AI Actually Works Free
  • Choosing the Right AI Model Free
  • Deploying AI Models: Options & Settings
  • AI Workloads at a Glance
  • Text Analysis: Keywords, Entities & Sentiment
  • Speech: Recognition & Synthesis
  • Computer Vision: Seeing the World
  • Image Generation: Creating with AI
  • Information Extraction: From Chaos to Structure

Domain 2: Implement AI Solutions Using Foundry

  • Prompting Fundamentals: System & User Prompts
  • Microsoft Foundry: Your AI Command Center Free
  • Building a Chat App with the Foundry SDK
  • Agents in Foundry: Create & Test
  • Building an Agent Client App
  • Building a Text Analysis App
  • Multimodal: Responding to Speech
  • Azure Speech in Foundry Tools
  • Visual Prompts: Images as Input
  • Generating Images with AI
  • Building a Vision App
  • Content Understanding: Documents & Forms
  • Multimodal Extraction: Images, Audio & Video
  • Building an Extraction App
  • Exam Prep: Putting It All Together

Domain 2: Implement AI Solutions Using Foundry (Premium, ~14 min read)

Content Understanding: Documents & Forms

Turn messy documents into clean, structured data. Azure Content Understanding reads invoices, receipts, ID cards, and forms, extracting exactly the fields you need.

Extracting data from documents

☕ Simple explanation

Content Understanding is like a super-powered data entry clerk: it reads any document and pulls out exactly the data you need.

Hand it a stack of invoices → it gives you a spreadsheet of vendor names, amounts, and dates. Hand it receipts → it gives you items and totals. Hand it ID cards → it gives you names and ID numbers. All automatically and in seconds, with a confidence score on every field so you know what to double-check.

The magic is that it doesn’t just read text (that’s OCR). It understands the document structure: it knows that the number next to “Total:” is the total amount, not just a random number.

Azure Content Understanding (part of Foundry Tools) is a multimodal extraction service that processes documents, images, audio, and video to extract structured data. For documents, it combines OCR, layout analysis, and field extraction to identify and extract specific data fields with their values, going beyond raw text extraction to understand document structure and semantics.

What Content Understanding extracts

Prebuilt document models

Content Understanding includes prebuilt analyzers for common document types:

| Document Type | Fields Extracted |
| --- | --- |
| Invoices | Vendor name, invoice number, date, line items, subtotal, tax, total, payment terms |
| Receipts | Merchant name, date, items, prices, total, tax, tip |
| ID documents | Name, date of birth, document number, nationality, expiry date |
| Business cards | Name, title, company, phone, email, address |
| Tax forms (W-2, 1099) | Employee/employer info, wages, tax withheld |
| Health insurance cards | Member name, ID, group number, plan type |

Custom analyzers

For documents specific to your business, you can train custom analyzers:

  1. Upload sample documents
  2. Label the fields you want to extract
  3. Train the model
  4. Deploy and use in your application

GreenLeaf scenario: GreenLeaf receives supplier invoices in 20 different formats. They use the prebuilt invoice model to extract vendor, amount, and due date, with no training needed. For their custom crop inspection reports, they train a custom model to extract field location, crop type, and health rating.
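The labelling step in the custom analyzer workflow comes down to deciding on a field schema. As a rough sketch only, the three fields GreenLeaf wants from its crop inspection reports could be written down as plain Python data before any training happens. The names `FieldSpec`, `CROP_REPORT_FIELDS`, and `missing_fields`, the field kinds, and the sample values are all hypothetical; none of them comes from an Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One field to label in the sample documents (hypothetical structure)."""
    name: str
    kind: str          # e.g. "string" or "number"; illustrative only
    description: str


# Hypothetical schema for GreenLeaf's crop inspection reports.
CROP_REPORT_FIELDS = [
    FieldSpec("FieldLocation", "string", "Where the inspected field is"),
    FieldSpec("CropType", "string", "Crop grown in the field"),
    FieldSpec("HealthRating", "string", "Inspector's health rating"),
]


def missing_fields(extracted: dict, schema=CROP_REPORT_FIELDS) -> list:
    """Return the schema field names that are absent or empty in `extracted`."""
    return [spec.name for spec in schema if not extracted.get(spec.name)]


if __name__ == "__main__":
    # Made-up extraction result with one field missing.
    sample = {"FieldLocation": "North paddock", "CropType": "Wheat"}
    print(missing_fields(sample))  # ['HealthRating']
```

Writing the schema down first makes it easy to check, after deployment, whether a given report came back with every labelled field filled in.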

Building a document extraction app

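from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.core.credentials import AzureKeyCredential

# Endpoint and key are placeholders; replace them with your resource's values.
client = DocumentIntelligenceClient(
    endpoint="https://your-resource.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("your-key")
)

# Analyse an invoice with the prebuilt invoice model.
# This client identifies the model with `model_id` (first argument), not
# `analyzer_id`, and accepts the open binary file as `body`.
with open("invoice.pdf", "rb") as f:
    poller = client.begin_analyze_document("prebuilt-invoice", body=f)
    result = poller.result()  # wait for the long-running analysis to finish

# Extract fields; `documents` and `fields` may be empty if nothing was recognised
for document in result.documents or []:
    fields = document.fields or {}
    vendor = fields.get("VendorName")
    total = fields.get("InvoiceTotal")
    date = fields.get("InvoiceDate")

    # `content` is the field's text as it appears in the document
    print(f"Vendor: {vendor.content if vendor else 'N/A'}")
    print(f"Total: {total.content if total else 'N/A'}")
    print(f"Date: {date.content if date else 'N/A'}")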
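To get from extracted fields to the "spreadsheet" described earlier, the fields can be flattened into rows. The sketch below assumes only that each field object exposes a `content` attribute, as in the example above; `fields_to_row`, `rows_to_csv`, and the sample values are hypothetical helpers and data, and `SimpleNamespace` stands in for real SDK field objects so the snippet runs without an Azure resource.

```python
import csv
import io
from types import SimpleNamespace

INVOICE_COLUMNS = ["VendorName", "InvoiceTotal", "InvoiceDate"]


def fields_to_row(fields, columns=INVOICE_COLUMNS):
    """Flatten a name -> field mapping into a dict of plain strings."""
    row = {}
    for name in columns:
        field = fields.get(name)
        # Missing fields and fields without text both become empty cells.
        row[name] = (field.content or "") if field is not None else ""
    return row


def rows_to_csv(rows, columns=INVOICE_COLUMNS):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Stand-in for document.fields; the values are made up for illustration.
    fake_fields = {
        "VendorName": SimpleNamespace(content="Contoso Seeds"),
        "InvoiceTotal": SimpleNamespace(content="$1,250.00"),
    }
    print(rows_to_csv([fields_to_row(fake_fields)]))
```

In a real application, `fake_fields` would be replaced by `document.fields` for each analysed invoice, giving one row per document.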

How it works under the hood

Content Understanding processes documents in layers:

| Layer | What Happens |
| --- | --- |
| 1. OCR | Reads all text from the document (printed and handwritten) |
| 2. Layout analysis | Identifies tables, headers, paragraphs, sections, and page structure |
| 3. Field mapping | Maps specific text regions to named fields based on the model |
| 4. Confidence scoring | Each extracted field includes a confidence score (0.0 to 1.0) |
| 5. Validation | Checks formats: dates look like dates, amounts look like amounts |
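The validation layer can be pictured as simple format checks on the extracted text. The following is an illustrative sketch of that idea for your own post-processing, not the service's internal implementation; the accepted date formats, the currency symbols, and the function names `looks_like_date` and `parse_amount` are assumptions chosen for the example.

```python
import re
from datetime import datetime
from decimal import Decimal

# Date layouts to try; an assumed, deliberately short list.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %B %Y")

# Optional currency symbol, then either a comma-grouped or a plain number.
AMOUNT_PATTERN = re.compile(
    r"[$€£]?\s*(-?\d{1,3}(?:,\d{3})*(?:\.\d+)?|-?\d+(?:\.\d+)?)"
)


def looks_like_date(text: str) -> bool:
    """True if `text` parses under one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text.strip(), fmt)
            return True
        except ValueError:
            continue
    return False


def parse_amount(text: str):
    """Return a Decimal for strings like '$1,250.00', or None otherwise."""
    match = AMOUNT_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


if __name__ == "__main__":
    print(looks_like_date("2025-03-14"), looks_like_date("Total"))
    print(parse_amount("$1,250.00"), parse_amount("N/A"))
```

A field that fails a check like this is a natural candidate for human review, even when its confidence score is high.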
💡 Confidence scores and handling uncertainty

Every extracted field includes a confidence score:

  • 0.90-1.00: High confidence, likely correct
  • 0.70-0.89: Medium confidence, may need review
  • Below 0.70: Low confidence, likely needs human verification

Best practice: Set a threshold (e.g., 0.85) and flag documents below it for human review. This gives you automation speed with human accuracy.
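The threshold rule above can be sketched in a few lines. This is a minimal illustration that assumes each extracted field exposes a `confidence` attribute between 0.0 and 1.0; `fields_needing_review`, `route_document`, and the sample scores are hypothetical, and `SimpleNamespace` stands in for real field objects.

```python
from types import SimpleNamespace

REVIEW_THRESHOLD = 0.85  # the example threshold from the text


def fields_needing_review(fields, threshold=REVIEW_THRESHOLD):
    """Return names of fields whose confidence is missing or below threshold."""
    flagged = []
    for name, field in fields.items():
        confidence = getattr(field, "confidence", None)
        if confidence is None or confidence < threshold:
            flagged.append(name)
    return flagged


def route_document(fields, threshold=REVIEW_THRESHOLD):
    """'auto' if every field clears the threshold, otherwise 'human_review'."""
    return "human_review" if fields_needing_review(fields, threshold) else "auto"


if __name__ == "__main__":
    # Made-up scores: the total falls below the threshold.
    fields = {
        "VendorName": SimpleNamespace(confidence=0.97),
        "InvoiceTotal": SimpleNamespace(confidence=0.65),
    }
    print(fields_needing_review(fields))  # ['InvoiceTotal']
    print(route_document(fields))         # human_review
```

Treating a missing score the same as a low one is a deliberately cautious choice: when in doubt, the document goes to a person.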

Exam relevance: The exam may test your understanding of confidence thresholds and when to involve human review. This connects to the reliability and safety responsible AI principle.

Flashcards

Question: What is the difference between OCR and Content Understanding?
Answer: OCR extracts raw text from images. Content Understanding goes further: it understands document structure (tables, headers) and extracts specific named fields with their values. OCR gives you text; Content Understanding gives you structured data.

Question: What prebuilt analyzers does Content Understanding include?
Answer: Invoices, receipts, ID documents, business cards, tax forms (W-2, 1099), and health insurance cards. Each extracts specific fields relevant to that document type.

Question: What is a confidence score in Content Understanding?
Answer: A number from 0.0 to 1.0 indicating how confident the model is in each extracted field value. High (0.90+) = likely correct, medium (0.70-0.89) = may need review, low (below 0.70) = needs human verification.

Question: When would you train a custom Content Understanding model?
Answer: When your documents are specific to your business and not covered by prebuilt analyzers. Upload sample documents, label the fields you want to extract, train, then deploy.

Knowledge Check

GreenLeaf processes invoices from 20 different suppliers, each with a different format. They want to extract vendor name, total amount, and due date from each. What's the best approach?

Content Understanding extracts an invoice total with a confidence score of 0.65. What should the application do?


Next up: Multimodal Extraction, pulling data from images, audio, and video using Content Understanding.

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.