External AI Models: Choose, Create, Manage
Evaluate and integrate external AI models into your SQL database — understand model types, sizes, languages, structured output, and how to create and manage model definitions.
Bringing AI into your database
Think of hiring a specialist consultant.
Your database is great at storing and querying data. But it cannot understand natural language, recognise images, or generate text — that is not its job. External AI models are specialists you hire: one understands English and Japanese, another can read images, and a third generates structured JSON from questions. You register them in your database so your T-SQL code can call them directly.
The key decision: which specialist do you hire for each task? A cheap, fast one for simple jobs? A powerful, expensive one for complex analysis? This module teaches you how to choose.
Evaluating external models
When choosing a model for your SQL-based AI solution, evaluate across four dimensions:
| Dimension | What to Evaluate | Trade-off |
|---|---|---|
| Modality | Text-only, multimodal (text+image+audio), or embedding models | Multimodal models are more versatile but larger and more expensive |
| Language | Monolingual (English) or multilingual (100+ languages) | Multilingual models handle global data but may be less precise in each language |
| Size | Small (fast, cheap), medium (balanced), large (most capable) | Larger models produce better results but cost more per token and have higher latency |
| Structured output | Can the model return JSON, XML, or structured formats reliably? | Essential for database integration — unstructured text is hard to INSERT into tables |
Model size guide
| Size Category | Examples | Tokens/sec | Best For |
|---|---|---|---|
| Small | GPT-4o-mini, Phi-3 | Very fast | Classification, simple extraction, embeddings |
| Medium | GPT-4o | Fast | General Q&A, summarisation, code generation |
| Large | GPT-4, o1-series | Slower | Complex reasoning, multi-step analysis |
| Embedding | text-embedding-3-small/large | Very fast | Vector search, semantic similarity |
Exam tip: Structured output is critical for SQL integration
When calling an AI model from T-SQL (via sp_invoke_external_rest_endpoint), you need the response in a parseable format — typically JSON. Models that reliably produce structured JSON output are essential. If the model returns free-form text, you cannot reliably extract values into table columns.
Look for models that support JSON mode or structured output schemas in their API. The exam may present a scenario where the model returns inconsistent formats — the fix is configuring structured output in the API call.
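As a sketch of what this looks like in practice (the endpoint URL, deployment name, and credential name below are illustrative placeholders, not values from this module), a call that requests JSON mode and then extracts a value from the wrapped response:

```sql
-- Sketch: ask the model for JSON output so the response parses reliably.
-- Endpoint, deployment, and credential name are hypothetical.
DECLARE @response NVARCHAR(MAX);

EXEC sp_invoke_external_rest_endpoint
    @url = N'https://vaultbank-ai.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01',
    @method = N'POST',
    @credential = [https://vaultbank-ai.openai.azure.com],
    @payload = N'{
        "messages": [{"role": "user", "content": "Classify this transaction as suspicious or legitimate. Reply as JSON."}],
        "response_format": {"type": "json_object"}
    }',
    @response = @response OUTPUT;

-- sp_invoke_external_rest_endpoint wraps the endpoint body under $.result,
-- and JSON mode guarantees the content field is itself parseable JSON.
SELECT JSON_VALUE(@response, '$.result.choices[0].message.content');
```

Because the content is guaranteed JSON, a second `JSON_VALUE` over it can pull individual fields straight into table columns.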
Creating external models in SQL
External models are database objects that register an AI endpoint:
-- Create an external model definition for embeddings
CREATE EXTERNAL MODEL EmbeddingModel
WITH (
    LOCATION = 'https://vaultbank-ai.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-02-01',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'text-embedding-3-small',
    CREDENTIAL = [https://vaultbank-ai.openai.azure.com]
);
-- Create a model for text generation
CREATE EXTERNAL MODEL ChatModel
WITH (
    LOCATION = 'https://vaultbank-ai.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = COMPLETIONS,
    MODEL = 'gpt-4o',
    CREDENTIAL = [https://vaultbank-ai.openai.azure.com]
);
Managing models
-- View all registered models
SELECT * FROM sys.external_models;
-- Alter model (e.g., point at a newer deployment)
ALTER EXTERNAL MODEL ChatModel
WITH (MODEL = 'gpt-4o-2024-11-20');
-- Drop model
DROP EXTERNAL MODEL EmbeddingModel;
Security for model endpoints
Model endpoints require authentication. Options:
- API keys stored as database-scoped credentials
- Managed Identity (preferred for Azure-hosted databases)
- Key Vault references for centralised secret management
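For the API-key option, the key lives in a database-scoped credential. A sketch, assuming the convention of naming the credential after the endpoint URL (as `sp_invoke_external_rest_endpoint` expects) and a placeholder key:

```sql
-- Sketch: store an API key as a database-scoped credential.
-- A database master key must exist before credentials can be created.
IF NOT EXISTS (SELECT 1 FROM sys.symmetric_keys WHERE name = '##MS_DatabaseMasterKey##')
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Credential named after the endpoint URL; the secret carries the
-- api-key header as JSON. Key value is a placeholder.
CREATE DATABASE SCOPED CREDENTIAL [https://vaultbank-ai.openai.azure.com]
WITH IDENTITY = 'HTTPEndpointHeaders',
     SECRET = '{"api-key": "<your Azure OpenAI key>"}';
```

The key is encrypted at rest in the database, but it still has to be rotated manually, which is why managed identity is preferred where available.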
-- Using managed identity (no secrets needed)
CREATE EXTERNAL MODEL SecureModel
WITH (
    LOCATION = 'https://vaultbank-ai.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01',
    API_FORMAT = 'Azure OpenAI',
    MODEL_TYPE = COMPLETIONS,
    MODEL = 'gpt-4o',
    CREDENTIAL = [ManagedIdentityCredential]
);
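The credential a managed-identity model references is itself created without any secret key. A sketch, assuming Azure OpenAI as the target (the `resourceid` value is the Cognitive Services audience):

```sql
-- Sketch: a database-scoped credential that uses the server's
-- managed identity instead of a stored API key.
CREATE DATABASE SCOPED CREDENTIAL [ManagedIdentityCredential]
WITH IDENTITY = 'Managed Identity',
     SECRET = '{"resourceid": "https://cognitiveservices.azure.com"}';
```

Note there is no key anywhere in the definition: Azure issues a token for the database's identity at call time, so there is nothing to leak or rotate.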
Scenario: Priya's model selection at Vault Bank
Priya needs three AI capabilities for the banking platform:
- Fraud detection — classify transactions as suspicious or legitimate. She chooses a small model (GPT-4o-mini) because it is fast and cheap for binary classification at high volume.
- Customer support summarisation — summarise complaint emails in English and Japanese. She chooses a multilingual medium model (GPT-4o) for balanced quality and language support.
- Semantic search — find similar customer queries. She chooses an embedding model (text-embedding-3-small) for generating vector representations.
Three different models, three different use cases — each optimised for its task.
Leo at SearchWave needs to add semantic product search. The system must generate vector embeddings for 2 million product descriptions nightly, and latency per embedding must be under 50ms. Which model choice is best?
🎬 Video coming soon
Next up: Embeddings: Design, Chunk, and Generate — turn your data into vectors that AI can search and compare.