JSON in SQL: Store, Query, and Index
Store semi-structured data as native JSON, query it with built-in functions, and index JSON properties for fast lookups β bridging the gap between relational and document models.
Why JSON in a relational database?
Think of a restaurant that serves both set menus and custom orders.
Your regular tables are like set menus β every dish (column) is predefined. But sometimes customers want something custom β extra toppings, special requests, dietary notes. Instead of adding a new column for every possible request, you store the custom details as a flexible note (JSON) alongside the set menu items.
JSON in SQL gives you the best of both worlds: strict relational structure for your core data, and flexible document storage for data that varies between rows β product attributes, user preferences, API responses, or configuration settings.
The JSON data type (SQL Server 2025)
SQL Server 2025 adds a native JSON data type that stores JSON in binary format β smaller and faster than NVARCHAR.
CREATE TABLE Products (
ProductId INT NOT NULL PRIMARY KEY,
Name NVARCHAR(200) NOT NULL,
Price DECIMAL(10,2) NOT NULL,
Attributes JSON NULL -- Flexible properties that vary by product
);
INSERT INTO Products (ProductId, Name, Price, Attributes)
VALUES (1, 'Wireless Mouse', 29.99,
'{"color": "black", "dpi": 1600, "wireless": true, "battery": "AA"}');
Reading JSON: JSON_VALUE and OPENJSON
JSON_VALUE β extract a single scalar value
-- Get the color of product 1
SELECT Name, JSON_VALUE(Attributes, '$.color') AS Color
FROM Products
WHERE ProductId = 1;
-- Returns: Wireless Mouse, black
The $ represents the root. Navigate nested objects with dots: $.address.city.
OPENJSON β shred JSON into rows and columns
-- Expand a JSON array into rows
DECLARE @json NVARCHAR(MAX) = '[
{"name": "Alice", "role": "Developer"},
{"name": "Bob", "role": "DBA"},
{"name": "Carol", "role": "Architect"}
]';
SELECT name, role
FROM OPENJSON(@json)
WITH (
name NVARCHAR(100) '$.name',
role NVARCHAR(100) '$.role'
);
OPENJSON is essential for converting JSON arrays (from API responses, for example) into relational rows you can join, filter, and aggregate.
Building JSON: JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG
These functions construct JSON from relational data β the reverse of OPENJSON.
-- JSON_OBJECT: build a JSON object from key-value pairs
SELECT JSON_OBJECT('id': ProductId, 'name': Name, 'price': Price) AS ProductJson
FROM Products;
-- Returns: {"id":1,"name":"Wireless Mouse","price":29.99}
-- JSON_ARRAY: build a JSON array from values
SELECT JSON_ARRAY(1, 'hello', 3.14, NULL ABSENT ON NULL) AS ArrayResult;
-- Returns: [1,"hello",3.14]
-- JSON_ARRAYAGG: aggregate rows into a JSON array
SELECT JSON_ARRAYAGG(JSON_OBJECT('name': Name, 'price': Price)) AS AllProducts
FROM Products;
-- Returns: [{"name":"Wireless Mouse","price":29.99}, ...]
Exam tip: JSON_ARRAYAGG is an aggregate function
JSON_ARRAYAGG works like STRING_AGG or SUM β it collapses multiple rows into a single JSON array value. You can use it with GROUP BY to create nested JSON structures from relational data. This is commonly needed when preparing data for AI model input (Domain 3 β converting relational data to JSON for language model processing).
Searching JSON: JSON_CONTAINS
JSON_CONTAINS checks whether a JSON document contains a specified value or fragment:
-- Find all products where Attributes contains "wireless": true
SELECT Name, Price
FROM Products
WHERE JSON_CONTAINS(Attributes, 'true', '$.wireless') = 1;
Indexing JSON columns
JSON columns cannot be indexed directly. Instead, create computed columns that extract specific JSON values, then index those computed columns:
-- Add a computed column for the color property
ALTER TABLE Products
ADD Color AS JSON_VALUE(Attributes, '$.color');
-- Index it
CREATE NONCLUSTERED INDEX IX_Products_Color
ON Products (Color);
-- Now this query uses the index
SELECT Name, Price FROM Products WHERE Color = 'black';
For the native JSON type in SQL Server 2025, you can also use JSON indexes directly on JSON path expressions.
| Approach | JSON in NVARCHAR | Native JSON Type (2025) |
|---|---|---|
| Storage format | Text (UTF-16) | Optimised binary |
| Storage size | Larger (text overhead) | Smaller (compressed binary) |
| Validation | ISJSON() check constraint | Automatic β rejects invalid JSON |
| Performance | Parse on every read | Pre-parsed, faster queries |
| Indexing | Computed columns + indexes | JSON indexes on path expressions |
| Compatibility | All SQL Server versions | SQL Server 2025+ and Azure SQL |
Scenario: Leo's product catalog at SearchWave
Leo Torres at SearchWave is building a product catalog where every product category has different attributes. Electronics have βbattery lifeβ and βweight.β Clothing has βsizeβ and βmaterial.β Food has βallergensβ and βexpiry.β
Instead of creating columns for every possible attribute (most would be NULL), Leo stores the variable attributes as JSON:
CREATE TABLE Catalog (
ProductId INT NOT NULL PRIMARY KEY,
CategoryId INT NOT NULL,
Name NVARCHAR(200) NOT NULL,
BasePrice DECIMAL(10,2) NOT NULL,
Specs JSON NULL -- Category-specific attributes
);He creates computed columns and indexes for the most-queried paths (like $.brand and $.rating), and uses OPENJSON for ad-hoc exploration of less common attributes.
This pattern β relational for core data, JSON for variable attributes β is a common exam scenario.
Priya at Vault Bank receives customer KYC (Know Your Customer) data from an external API as JSON arrays. She needs to insert this data into a relational Customers table. Which function should she use to convert the JSON array into rows?
Leo at SearchWave needs to generate a JSON API response that includes each product with its reviews nested as a JSON array. Which approach produces the correct nested JSON?
π¬ Video coming soon
Next up: Programmability: Views, Functions, Procedures, and Triggers β create reusable database objects that encapsulate logic.