Embeddings: Design, Chunk, and Generate
Turn your data into vectors that AI can search and compare: choose which columns to embed, design chunking strategies, select maintenance methods, and generate embeddings at scale.
Turning text into searchable numbers
Imagine every product in a store as a point on a map.
Similar products are close together: all the red shoes are in one area, all the running shoes in another, and red running shoes are where those areas overlap. That map is what embeddings create, except that instead of a physical location, each product is represented as a list of numbers (a vector), where similar items have similar numbers.
To create this map, you need to: (1) decide what information matters (product name? description? reviews?), (2) break long text into manageable pieces (chunks), and (3) send each piece to an AI model that converts it into a vector. This module covers all three steps.
Which columns to embed
Not every column should be embedded. Choose columns that contain semantic meaning users would search for:
| Column Type | Embed? | Why |
|---|---|---|
| Product description | Yes | Rich semantic content users search with natural language |
| Customer review text | Yes | Contains opinions and experiences users want to find |
| Technical documentation | Yes | Long-form content with searchable concepts |
| Product name | Maybe | Short text; useful if combined with description |
| Price, quantity | No | Numeric; use standard SQL comparison operators |
| Status codes, IDs | No | Categorical; use exact match, not semantic search |
| Timestamps | No | Temporal; use date range queries |
Combine related columns for richer embeddings:
-- Concatenate meaningful columns before embedding
SELECT
ProductId,
Name + ' | ' + Category + ' | ' + Description AS EmbeddingInput
FROM Products;
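One caveat with `+` concatenation: by default, if any column is NULL the entire result is NULL, and that row gets no usable embedding input. A NULL-safe variant uses `CONCAT_WS` (SQL Server 2017 and later), which skips NULL arguments and places the separator only between non-NULL values:

```sql
-- NULL-safe concatenation: CONCAT_WS skips NULL columns entirely
SELECT
    ProductId,
    CONCAT_WS(' | ', Name, Category, Description) AS EmbeddingInput
FROM Products;
```

A product with no Category simply yields `Name | Description` instead of NULL.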
Chunking: breaking long text into pieces
Embedding models have token limits (typically 512-8,192 tokens). Long documents must be split into chunks.
Chunking strategies
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Fixed-size | Split every N characters/tokens | Simple, predictable | May split mid-sentence |
| Sentence-based | Split on sentence boundaries | Preserves meaning | Variable chunk sizes |
| Paragraph-based | Split on paragraph breaks | Natural boundaries | Paragraphs vary in length |
| Semantic | Split where topics change | Best meaning preservation | Most complex to implement |
| Overlapping | Each chunk includes some text from adjacent chunks | Preserves cross-boundary context | More chunks to store and search |
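To make the fixed-size and overlapping strategies concrete, here is a minimal T-SQL sketch (variable names and sizes are illustrative) that splits a string into 512-character chunks where each chunk starts 448 characters after the previous one, so adjacent chunks share 64 characters:

```sql
-- Fixed-size chunking with overlap via a recursive CTE.
-- Each chunk starts (chunk_size - overlap) characters after the previous one.
DECLARE @text NVARCHAR(MAX) = N'...long document text...';
DECLARE @chunk_size INT = 512, @overlap INT = 64;

WITH Chunks AS (
    SELECT 1 AS chunk_id, 1 AS start_pos
    UNION ALL
    SELECT chunk_id + 1, start_pos + @chunk_size - @overlap
    FROM Chunks
    WHERE start_pos + @chunk_size - @overlap <= LEN(@text)
)
SELECT chunk_id,
       SUBSTRING(@text, start_pos, @chunk_size) AS chunk_text
FROM Chunks
OPTION (MAXRECURSION 10000);
```

This is the hand-rolled version of what the built-in chunking function automates; it is useful on builds where that function is not available.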
SQL Server 2025: AI_GENERATE_CHUNKS
-- Built-in chunking function (parameter names follow the SQL Server 2025
-- documentation; verify the exact signature and output columns on your build)
SELECT c.*
FROM AI_GENERATE_CHUNKS(
    source = @text,
    chunk_type = FIXED,   -- fixed-size chunks
    chunk_size = 512,     -- maximum chunk size
    overlap = 50          -- characters shared between adjacent chunks
) AS c;
Exam tip: Chunk size affects search quality
Smaller chunks (256-512 tokens) give more precise search results but lose broader context. Larger chunks (1024-2048 tokens) preserve context but may dilute the relevance signal. The sweet spot depends on your content:
- FAQ entries, short descriptions: no chunking needed (already small)
- Product documentation: 512-token chunks with overlap
- Legal documents: paragraph-based chunking to preserve clause boundaries
- Conversations: message-level chunking (each message = one chunk)
The exam may present a scenario where search results are too broad; the fix is usually smaller chunks.
Generating embeddings
Using AI_GENERATE_EMBEDDINGS (SQL Server 2025)
-- Generate embeddings using a registered model
SELECT
ProductId,
AI_GENERATE_EMBEDDINGS(Description USE MODEL EmbeddingModel) AS DescriptionVector
FROM Products;
-- Store embeddings in a vector column
ALTER TABLE Products ADD DescriptionEmbedding VECTOR(1536);
UPDATE Products
SET DescriptionEmbedding = AI_GENERATE_EMBEDDINGS(Description USE MODEL EmbeddingModel);
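AI_GENERATE_EMBEDDINGS requires an external model to be registered first. A sketch of registering an Azure OpenAI embeddings deployment might look like the following; the endpoint URL, deployment name, and credential are placeholders, and the available options may differ on your build, so treat this as an outline rather than exact syntax:

```sql
-- Register an Azure OpenAI embeddings deployment as an external model.
-- The URL, model name, and credential below are placeholders.
CREATE EXTERNAL MODEL EmbeddingModel
WITH (
    LOCATION = 'https://myresource.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'text-embedding-3-small',
    CREDENTIAL = [https://myresource.openai.azure.com]
);
```

Note that the model's output dimensions must match the vector column; a 1536-dimension model pairs with VECTOR(1536).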
Maintaining embeddings as data changes
When source data changes, embeddings become stale. Choose a maintenance method based on your data change patterns:
| Method | How It Detects Changes | Latency | Best For |
|---|---|---|---|
| Table triggers | AFTER INSERT/UPDATE trigger regenerates embedding | Immediate (synchronous) | Low-volume tables where latency matters |
| Change Tracking | Lightweight change flags on rows | Near-real-time (polling) | Medium-volume, polling-based sync |
| Azure Functions SQL trigger | Serverless function triggered by row changes | Seconds to minutes | Serverless architecture, event-driven |
| CDC (Change Data Capture) | Full change history in change tables | Near-real-time | When you need the full change history, not just current state |
| CES (Change Event Streaming) | Streams changes to Event Hubs | Real-time streaming | High-volume, event-driven architectures |
| Azure Logic Apps | Workflow triggered by schedule or event | Minutes to hours | Low-code workflow, scheduled batch updates |
| Microsoft Foundry | AI pipeline orchestration | Configurable | Complex AI pipelines with multiple models |
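To illustrate the Change Tracking row above, a polling-based sync boils down to three pieces: one-time setup, a query that finds only the rows changed since the last sync version, and saving the new version for the next poll. A sketch (table and column names carried over from earlier examples):

```sql
-- One-time setup: enable change tracking at database and table level
ALTER DATABASE CURRENT SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
ALTER TABLE Products ENABLE CHANGE_TRACKING;

-- Polling job: find only the rows changed since the last sync
DECLARE @last_sync BIGINT = 0;  -- persisted from the previous run

SELECT p.ProductId, p.Description
FROM Products AS p
JOIN CHANGETABLE(CHANGES Products, @last_sync) AS ct
    ON p.ProductId = ct.ProductId
WHERE ct.SYS_CHANGE_OPERATION IN ('I', 'U');
-- ...regenerate embeddings for just these rows...

-- Save the current version so the next poll picks up where this one left off
SELECT CHANGE_TRACKING_CURRENT_VERSION();
```

Compared with CDC, Change Tracking records only *that* a row changed, not the old values, which is exactly enough for re-embedding.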
Scenario: Leo's embedding pipeline at SearchWave
SearchWave's product catalog has 500,000 products. About 1,000 products change daily (new listings, updated descriptions). Leo designs the embedding maintenance strategy:
- Initial load: batch process all 500K products overnight using a SQL Agent job
- Ongoing updates: Azure Functions with SQL trigger binding; when a product's description changes, the function automatically regenerates its embedding
- Why not triggers? Calling an external AI model synchronously inside a trigger would block the INSERT/UPDATE transaction. Azure Functions processes the embedding asynchronously.
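If a trigger is unavoidable, the standard workaround is to keep it lightweight: inside the transaction, record only *which* row changed, and let a separate process call the AI model asynchronously. A minimal sketch (the queue table and trigger names are illustrative):

```sql
-- Lightweight trigger: record changed rows, nothing more
CREATE TABLE EmbeddingQueue (
    ProductId INT NOT NULL,
    QueuedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER trg_Products_QueueEmbedding
ON Products
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO EmbeddingQueue (ProductId)
    SELECT ProductId FROM inserted;
END;
-- A separate SQL Agent job or Azure Function drains EmbeddingQueue and
-- regenerates embeddings outside the original INSERT/UPDATE transaction.
```

The INSERT into the queue is fast and local, so the user's transaction is never blocked on an external API call.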
Priya at Vault Bank has a customer support database with 10 million tickets. Each ticket has a subject line and a 2,000-word description. She wants to enable semantic search. How should she prepare the data for embeddings?
🎬 Video coming soon
Next up: Choose Your Search Strategy. Compare full-text, vector, and hybrid search to pick the right approach for your data.