🔒 Guided

Pre-launch preview. Authorised access only.

Incorrect code

Guided by A Guide to Cloud
Explore AB-900 AI-901
Guided DP-800 Domain 1
Domain 1 — Module 7 of 10 70%
7 of 28 overall

DP-800 Study Guide

Domain 1: Design and Develop Database Solutions

  • Table Design: Columns, Types, and Indexes Free
  • Constraints: Protecting Your Data Free
  • Specialised Tables and Graph Queries Free
  • JSON in SQL: Store, Query, and Index
  • Programmability: Views, Functions, Procedures, and Triggers
  • Advanced T-SQL: CTEs, Windows, and Correlated Queries
  • Pattern Power: Regular Expressions in T-SQL
  • Fuzzy String Matching: Finding Similar Text
  • AI-Assisted SQL with GitHub Copilot Free
  • MCP: Connecting AI to Your Database

Domain 2: Secure, Optimize, and Deploy Database Solutions

  • Encryption: Always Encrypted and Column-Level
  • Dynamic Data Masking and Row-Level Security
  • Permissions, Auditing, and Passwordless Access
  • Transaction Isolation and Concurrency
  • Query Performance: Plans, DMVs, and Query Store
  • SQL Database Projects: Build and Validate
  • CI/CD Pipelines for SQL Databases
  • Data API Builder: REST and GraphQL from SQL
  • Securing AI and API Endpoints
  • Change Detection: CES, CDC, and Change Tracking
  • Azure Integration and Monitoring

Domain 3: Implement AI Capabilities in Database Solutions

  • External AI Models: Choose, Create, Manage
  • Embeddings: Design, Chunk, and Generate
  • Choose Your Search Strategy
  • Vector Data: Types, Indexes, and Storage
  • Vector Search: Distance, ANN, and ENN
  • Hybrid Search and Reciprocal Rank Fusion
  • RAG with SQL: Prompt, Process, Respond

DP-800 Study Guide

Domain 1: Design and Develop Database Solutions

  • Table Design: Columns, Types, and Indexes Free
  • Constraints: Protecting Your Data Free
  • Specialised Tables and Graph Queries Free
  • JSON in SQL: Store, Query, and Index
  • Programmability: Views, Functions, Procedures, and Triggers
  • Advanced T-SQL: CTEs, Windows, and Correlated Queries
  • Pattern Power: Regular Expressions in T-SQL
  • Fuzzy String Matching: Finding Similar Text
  • AI-Assisted SQL with GitHub Copilot Free
  • MCP: Connecting AI to Your Database

Domain 2: Secure, Optimize, and Deploy Database Solutions

  • Encryption: Always Encrypted and Column-Level
  • Dynamic Data Masking and Row-Level Security
  • Permissions, Auditing, and Passwordless Access
  • Transaction Isolation and Concurrency
  • Query Performance: Plans, DMVs, and Query Store
  • SQL Database Projects: Build and Validate
  • CI/CD Pipelines for SQL Databases
  • Data API Builder: REST and GraphQL from SQL
  • Securing AI and API Endpoints
  • Change Detection: CES, CDC, and Change Tracking
  • Azure Integration and Monitoring

Domain 3: Implement AI Capabilities in Database Solutions

  • External AI Models: Choose, Create, Manage
  • Embeddings: Design, Chunk, and Generate
  • Choose Your Search Strategy
  • Vector Data: Types, Indexes, and Storage
  • Vector Search: Distance, ANN, and ENN
  • Hybrid Search and Reciprocal Rank Fusion
  • RAG with SQL: Prompt, Process, Respond
Domain 1: Design and Develop Database Solutions Premium ⏱ ~13 min read

Pattern Power: Regular Expressions in T-SQL

Use the brand-new REGEXP functions in SQL Server 2025 to find, replace, extract, count, and split text with powerful pattern matching — directly in your T-SQL queries.

Pattern matching finally comes to T-SQL

☕ Simple explanation

Think of LIKE as a basic search, and regex as a super-powered search.

LIKE can find “starts with A” or “contains ‘bank’.” But what if you need: “find all phone numbers in any format,” “extract the domain from an email address,” or “replace all credit card numbers with asterisks”? That is what regular expressions (regex) do — they describe patterns, not specific text.

SQL Server 2025 adds seven REGEXP functions. Before this, you had to use CLR assemblies or export data to another language. Now it is all native T-SQL.

SQL Server 2025 introduces seven regular expression functions — a long-awaited feature that brings T-SQL in line with PostgreSQL, Oracle, and MySQL. These functions support full POSIX-style regex patterns with flags for case sensitivity, multiline mode, and dot-matches-newline.

The functions require database compatibility level 170 (for REGEXP_LIKE) or work at all compatibility levels (for the other six). They apply to CHAR, NCHAR, VARCHAR, and NVARCHAR data types, with some supporting VARCHAR(MAX) up to 2 MB.

The seven REGEXP functions

Seven REGEXP functions — each returns a different result type
FunctionReturnsWhat It DoesEquivalent To
REGEXP_LIKEBooleanDoes the pattern match?LIKE but with full regex
REGEXP_REPLACEStringReplace matched textREPLACE but pattern-based
REGEXP_SUBSTRStringExtract matched textSUBSTRING with pattern matching
REGEXP_INSTRIntegerPosition of matchCHARINDEX with patterns
REGEXP_COUNTIntegerCount of matchesNo simple equivalent
REGEXP_MATCHESTableAll captured groups as rowsNo simple equivalent
REGEXP_SPLIT_TO_TABLETableSplit string by pattern into rowsSTRING_SPLIT with regex delimiters

REGEXP_LIKE: does it match?

The simplest function — returns TRUE or FALSE.

-- Find customers with valid email format
SELECT FullName, Email
FROM Customers
WHERE REGEXP_LIKE(Email, '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$');

-- Find phone numbers in any format (with or without dashes/spaces)
SELECT * FROM Contacts
WHERE REGEXP_LIKE(Phone, '^\+?\d[\d\s\-]{7,15}$');

Flags

All REGEXP functions accept an optional flags parameter:

FlagMeaningDefault
cCase-sensitiveYes (default)
iCase-insensitiveNo
mMultiline (^ and $ match line boundaries)No
sDot matches newlineNo
-- Case-insensitive search
SELECT * FROM Products
WHERE REGEXP_LIKE(Description, 'wireless|bluetooth', 'i');

REGEXP_REPLACE: find and transform

-- Mask credit card numbers: 4111-2222-3333-4444 -> ****-****-****-4444
SELECT REGEXP_REPLACE(CardNumber, '\d{4}-\d{4}-\d{4}-', '****-****-****-') AS Masked
FROM Payments;

-- Standardise phone numbers: remove all non-digits
SELECT REGEXP_REPLACE(Phone, '[^\d+]', '') AS CleanPhone
FROM Contacts;

-- Clean up extra whitespace
SELECT REGEXP_REPLACE(Description, '\s+', ' ') AS Cleaned
FROM Products;

Backreferences in replacements

REGEXP_REPLACE supports captured groups with \1, \2, etc.:

-- Reformat dates: 2026-04-21 -> 21/04/2026
SELECT REGEXP_REPLACE('2026-04-21', '(\d{4})-(\d{2})-(\d{2})', '\3/\2/\1');
-- Returns: 21/04/2026

REGEXP_SUBSTR: extract matching text

-- Extract domain from email
SELECT REGEXP_SUBSTR(Email, '@(.+)$') AS Domain
FROM Customers;

-- Extract first number from a mixed string
SELECT REGEXP_SUBSTR('Order #12345 - Priority', '\d+') AS OrderNumber;
-- Returns: 12345

REGEXP_INSTR: find the position

-- Find position of first digit in a string
SELECT REGEXP_INSTR('ABC-123-DEF', '\d') AS FirstDigitPosition;
-- Returns: 5

REGEXP_COUNT: how many matches?

-- Count how many words are in a description
SELECT Description, REGEXP_COUNT(Description, '\b\w+\b') AS WordCount
FROM Products;

-- Count email addresses in a text block
SELECT REGEXP_COUNT(Notes, '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}') AS EmailCount
FROM SupportTickets;

REGEXP_MATCHES: extract all captured groups

Returns a table of all captured subgroups — one row per match.

-- Extract all key=value pairs from a config string
SELECT match_value
FROM REGEXP_MATCHES('host=db.local;port=1433;user=admin', '(\w+)=([^;]+)');

REGEXP_SPLIT_TO_TABLE: split by pattern

Like STRING_SPLIT but the delimiter is a regex pattern.

-- Split a CSV with inconsistent delimiters (commas, semicolons, pipes)
SELECT value
FROM REGEXP_SPLIT_TO_TABLE('apple,banana;cherry|date', '[,;|]');
-- Returns 4 rows: apple, banana, cherry, date
💡 Scenario: Priya's data cleaning at Vault Bank

Vault Bank receives transaction descriptions in messy formats from various payment processors. Priya uses REGEXP functions to standardise them:

-- Extract merchant name (everything before the first number or special char)
UPDATE Transactions
SET MerchantName = TRIM(REGEXP_SUBSTR(RawDescription, '^[A-Za-z\s]+'));

-- Flag suspicious descriptions containing known fraud patterns
SELECT * FROM Transactions
WHERE REGEXP_LIKE(RawDescription, '(TEST|FRAUD|STOLEN|XXX)', 'i');

-- Count how many transactions have non-ASCII characters
SELECT COUNT(*) FROM Transactions
WHERE REGEXP_LIKE(RawDescription, '[^\x00-\x7F]');
💡 Exam tip: REGEXP_LIKE requires compatibility level 170

REGEXP_LIKE specifically requires database compatibility level 170 or above. The other six REGEXP functions work at all compatibility levels. If the exam mentions a database at compatibility level 160, REGEXP_LIKE will not be available — but REGEXP_REPLACE, REGEXP_SUBSTR, etc. will still work.

Check and set compatibility level:

-- Check
SELECT compatibility_level FROM sys.databases WHERE name = DB_NAME();

-- Set to 170
ALTER DATABASE [MyDB] SET COMPATIBILITY_LEVEL = 170;
Question

What is the difference between LIKE and REGEXP_LIKE?

Click or press Enter to reveal answer

Answer

LIKE supports only basic wildcards (% for any characters, _ for one character). REGEXP_LIKE supports full regular expressions — character classes, quantifiers, alternation, anchors, and capture groups. REGEXP_LIKE requires compatibility level 170+.

Click to flip back

Question

Which REGEXP functions return a table instead of a scalar value?

Click or press Enter to reveal answer

Answer

REGEXP_MATCHES returns a table of captured groups (one row per match). REGEXP_SPLIT_TO_TABLE returns a table of substrings split by the pattern. Both are used with FROM or CROSS APPLY.

Click to flip back

Question

How do you use backreferences in REGEXP_REPLACE?

Click or press Enter to reveal answer

Answer

Capture groups with parentheses in the pattern, then reference them in the replacement with \\1, \\2, etc. Example: REGEXP_REPLACE('2026-04-21', '(\\d{4})-(\\d{2})-(\\d{2})', '\\3/\\2/\\1') returns '21/04/2026'.

Click to flip back

Knowledge Check

Ingrid at Nordic Shield Insurance needs to find all policy records where the PolicyNumber field does not match the expected format: two uppercase letters followed by a dash and exactly six digits (e.g., NZ-123456). Which query identifies non-conforming records?

Knowledge Check

Dev at PixelForge Studios needs to extract all hashtags (words starting with #) from user-generated content stored in a Description column. Each row may contain zero or more hashtags. Which approach returns one row per hashtag?

🎬 Video coming soon

Next up: Fuzzy String Matching: Finding Similar Text — use EDIT_DISTANCE and JARO_WINKLER to find strings that are similar but not identical.

← Previous

Advanced T-SQL: CTEs, Windows, and Correlated Queries

Next →

Fuzzy String Matching: Finding Similar Text

Guided

I learn, I simplify, I share.

A Guide to Cloud YouTube Feedback

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.