Table Design: Columns, Types, and Indexes
Design tables that perform well and store data correctly β choose the right data types, build effective indexes, and understand columnstore for analytics.
Every database starts with a table
Think of a spreadsheet.
A table is like a spreadsheet with strict rules. Each column has a name and a type (text, number, date), and every row follows those rules. You cannot put someoneβs name into a column that expects a number β the database will reject it.
Indexes are like the index at the back of a textbook. Instead of reading every page to find βvector search,β you flip to the index, find the page number, and jump straight there. Without an index, the database reads every single row β which works for 100 rows but is painfully slow for 10 million.
This module teaches you how to pick the right column types, build indexes that make queries fast, and use columnstore indexes for analytics workloads.
Choosing the right data type
Every column needs a data type. Picking the right one affects storage size, query performance, and data integrity.
The golden rule: use the smallest type that fits your data.
| Data | Bad Choice | Good Choice | Why? |
|---|---|---|---|
| Customer age | BIGINT (8 bytes) | TINYINT (1 byte) | Ages fit in 0β255. You save 7 bytes per row. |
| Product price | FLOAT | DECIMAL(10,2) | FLOAT has rounding errors. Money needs exact precision. |
| Country code | VARCHAR(100) | CHAR(2) | ISO codes are always 2 characters. Fixed-length is faster. |
| Email address | VARCHAR(MAX) | VARCHAR(320) | MAX goes off-row. 320 is the RFC maximum for emails. |
| Yes/No flag | VARCHAR(3) | BIT (1 bit) | BIT uses almost no space. βYesβ/βNoβ wastes bytes. |
Exam tip: VARCHAR vs NVARCHAR
VARCHAR stores ASCII (1 byte per character). NVARCHAR stores Unicode (2 bytes per character). If your data includes non-Latin characters (Chinese, Arabic, emoji), you need NVARCHAR. The exam may present a scenario where international data is stored in VARCHAR β that is the wrong choice.
Also watch for VARCHAR(MAX) and NVARCHAR(MAX). These store data off-row when the value exceeds 8,000 bytes, which hurts performance. Only use MAX when you genuinely need unlimited-length text (like document content).
Indexes: the speed layer
Without an index, SQL Server performs a table scan β it reads every row to find matches. With the right index, it jumps directly to the matching rows.
Clustered vs nonclustered
Every table can have one clustered index (which physically sorts the data on disk) and up to 999 nonclustered indexes (which are separate lookup structures).
| Feature | Clustered Index | Nonclustered Index |
|---|---|---|
| How many per table? | One only | Up to 999 |
| Data storage | Data rows stored in index order | Separate structure with pointers to data rows |
| Best for | The column you query by most (usually the PK) | Columns used in WHERE, JOIN, ORDER BY |
| Write overhead | Moderate (data must stay sorted) | Higher per additional index (each insert updates every NCI) |
| Default | Created automatically with PRIMARY KEY | Must be created manually |
Included columns and covering indexes
A covering index contains all the columns a query needs, so SQL Server never has to look up the actual data row. You create one by adding included columns:
-- Covering index for: SELECT Email, Name FROM Customers WHERE City = 'Auckland'
CREATE NONCLUSTERED INDEX IX_Customers_City
ON Customers (City)
INCLUDE (Email, Name);
The INCLUDE columns are stored in the leaf level of the index but not in the key. This makes the index wider but avoids expensive key lookups.
Scenario: Priya's index decision at Vault Bank
Priya Sharma is optimising the transaction lookup at Vault Bank. The most common query is:
SELECT TransactionDate, Amount, Description
FROM Transactions
WHERE AccountId = @AccountId
ORDER BY TransactionDate DESC;She creates a covering index:
CREATE NONCLUSTERED INDEX IX_Transactions_Account
ON Transactions (AccountId, TransactionDate DESC)
INCLUDE (Amount, Description);This index covers the entire query β no key lookups needed. The AccountId in the key handles the WHERE clause, TransactionDate DESC handles the ORDER BY, and Amount and Description in INCLUDE handle the SELECT.
Columnstore indexes: analytics at speed
Traditional (rowstore) indexes are great for looking up individual rows. But when you need to scan millions of rows for analytics β totals, averages, aggregations β columnstore indexes are dramatically faster.
| Aspect | Rowstore Index | Columnstore Index |
|---|---|---|
| Storage | Row by row | Column by column (compressed) |
| Best for | Point lookups, small ranges | Aggregations over millions of rows |
| Compression | Moderate | Excellent (often 10x smaller) |
| Update pattern | Fast for single-row inserts | Optimised for bulk loads |
| Typical use | OLTP (transactions) | OLAP (analytics, reporting) |
-- Create a clustered columnstore index for analytics
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesHistory
ON SalesHistory;
You can also create a nonclustered columnstore index on an OLTP table to enable real-time analytics without changing the tableβs rowstore structure:
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Analytics
ON Orders (OrderDate, ProductId, Quantity, TotalAmount);
Exam tip: When NOT to use columnstore
Columnstore indexes shine for analytical queries (SUM, AVG, COUNT over millions of rows). They are NOT ideal for:
- Frequent single-row lookups (use rowstore B-tree indexes)
- Narrow result sets with many columns (columnstore reads entire column segments)
- Tables with heavy singleton INSERT/UPDATE/DELETE patterns (delta store overhead)
The exam may present a scenario where an OLTP table needs point lookups β columnstore would be the wrong answer.
SEQUENCES: generating values across tables
IDENTITY generates auto-incrementing values for a single table. SEQUENCE is a database-level object that generates values you can use across multiple tables, in any order, and even peek at the next value before inserting.
-- Create a sequence
CREATE SEQUENCE dbo.InvoiceNumber
AS INT
START WITH 1000
INCREMENT BY 1
MINVALUE 1000
MAXVALUE 9999999
CYCLE; -- Restarts at MINVALUE when MAXVALUE is reached
-- Use it in an INSERT
INSERT INTO Invoices (InvoiceNo, CustomerId, Amount)
VALUES (NEXT VALUE FOR dbo.InvoiceNumber, @CustomerId, @Amount);
SEQUENCE vs IDENTITY
| Feature | IDENTITY | SEQUENCE |
|---|---|---|
| Scope | Single table | Database-wide (any table) |
| Get next value without inserting | No | Yes (NEXT VALUE FOR) |
| Reset/cycle | Requires RESEED | Built-in CYCLE option |
| Use in DEFAULT constraint | Automatic | Yes (with NEXT VALUE FOR) |
| Multiple tables sharing one counter | No | Yes |
| Performance | Slightly faster for single-table | Slightly more overhead |
Scenario: Leo's shared sequence at SearchWave
Leo Torres at SearchWave needs a single incrementing reference number shared across three tables: Orders, Returns, and Exchanges. With IDENTITY, each table would have its own counter and numbers would overlap. With a SEQUENCE, all three tables pull from the same counter:
CREATE SEQUENCE dbo.ReferenceNumber AS BIGINT START WITH 100000 INCREMENT BY 1;
-- All three tables use the same sequence
ALTER TABLE Orders ADD CONSTRAINT DF_Orders_Ref DEFAULT NEXT VALUE FOR dbo.ReferenceNumber FOR ReferenceNo;
ALTER TABLE Returns ADD CONSTRAINT DF_Returns_Ref DEFAULT NEXT VALUE FOR dbo.ReferenceNumber FOR ReferenceNo;
ALTER TABLE Exchanges ADD CONSTRAINT DF_Exchanges_Ref DEFAULT NEXT VALUE FOR dbo.ReferenceNumber FOR ReferenceNo;Now every reference number is unique across all three tables β no duplicates, no gaps between tables.
Priya at Vault Bank needs to create a table for daily transaction summaries. The table will store 50 million rows and be queried primarily with aggregate functions (SUM, AVG) for monthly reports. Singleton lookups are rare. Which index strategy should she use?
Leo at SearchWave needs a reference number that is unique across the Orders, Returns, and Exchanges tables. The number must increment by 1 with no duplicates between tables. What should he use?
π¬ Video coming soon
Next up: Constraints: Protecting Your Data β enforce data integrity with PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and DEFAULT constraints.