Fact Tables, Dimension Tables & Keys
Learn how to structure your data into fact tables and dimension tables — the foundation of every well-designed Power BI data model — and how to create appropriate keys for relationships.
Facts and dimensions: the building blocks
Think of a receipt from a grocery store.
The receipt shows what happened: you bought 3 avocados for $6.00 at 2:30 PM. That’s a fact — a measurable event.
But to understand the receipt, you need context: which product (avocado), which store (Queen Street), when (Tuesday 15 April), who (Customer #4521). Those are dimensions — the “who, what, where, when” that describe the facts.
In Power BI, you organise your data into fact tables (the events with numbers) and dimension tables (the context that describes them). This pattern is called a star schema — and it’s the foundation of every well-designed Power BI model.
Fact tables vs dimension tables
| Feature | Fact Table | Dimension Table |
|---|---|---|
| Contains | Measurable events — numbers you aggregate | Descriptive attributes — text, categories, hierarchies |
| Typical columns | Foreign keys + numeric measures (Revenue, Quantity, Cost) | Primary key + descriptive columns (Name, Category, Address) |
| Row count | Very high (millions of transactions) | Low to moderate (hundreds to thousands) |
| Changes | New rows added frequently (new sales, new orders) | Changes slowly (products updated, customers moved) |
| Examples | Sales, Orders, Appointments, Production, Web Visits | Products, Customers, Dates, Stores, Employees |
Building tables in Power Query
Riley at Coastal Fresh (🛒) receives a single flat table from her point-of-sale system with 50 columns and 2 million rows. She needs to split it into a star schema.
Step 1: Identify the fact table
The fact table is the event — each row is a sale. She keeps:
OrderID(unique identifier)ProductID(foreign key → Products)StoreID(foreign key → Stores)OrderDate(foreign key → Date)Quantity(measure)Revenue(measure)Cost(measure)
Step 2: Extract dimension tables
For each descriptive group, she creates a new query:
Products dimension: Reference the original query → select ProductID, ProductName, Category, Supplier → Remove Duplicates on ProductID
Stores dimension: Reference the original query → select StoreID, StoreName, Region, Address → Remove Duplicates on StoreID
Date dimension: She’ll create this separately (covered in the Date Tables module) — a dedicated calendar table covering all dates in her data range.
Step 3: Remove descriptive columns from the fact table
Now that products and stores have their own tables, Riley removes ProductName, Category, Supplier, StoreName, Region, and Address from the fact table. She keeps only the foreign keys (ProductID, StoreID) and the measures (Quantity, Revenue, Cost).
Why not just use the flat table?
A flat table with 50 columns and 2 million rows works — Power BI can handle it. But it has problems:
- Larger file size: ProductName “Organic Avocado” stored 50,000 times instead of once
- Slower performance: VertiPaq compresses dimension columns better when they’re in separate tables
- Harder DAX: Filter context doesn’t flow as intuitively in a flat table
- Maintenance headaches: Updating a product category means updating every row that references it
Star schema solves all of these. The exam strongly expects you to recognise and build star schemas.
Creating appropriate keys
Every relationship in Power BI needs a key — a column that uniquely identifies rows on one side of the relationship.
Key types
| Key Type | What It Is | Example |
|---|---|---|
| Primary key | A column with unique values in a dimension table | ProductID in the Products table |
| Foreign key | A column in the fact table that references a primary key | ProductID in the Sales table |
| Surrogate key | An artificial integer key (1, 2, 3…) created when no natural key exists | CustomerKey added via Index Column |
| Composite key | A key made from combining two or more columns | Region + StoreNumber → unique StoreID |
Creating a surrogate key in Power Query
Dr. Ethan Moss at Bayview Medical Centre (🏥) has a patient referrals table where no single column uniquely identifies a record. He creates a surrogate key:
- Add Column → Index Column → From 1
- Rename the column to
ReferralKey - This gives each row a unique identifier — useful for deduplication and operational tracking. For relationships, the key columns will typically be the foreign keys linking to dimension tables (DoctorID, PatientID, etc.), not the surrogate key itself.
Creating a composite key
Kenji at Apex Manufacturing (🏭) has factory data where FactoryCode + LineNumber together form a unique identifier, but neither alone is unique.
- Select both columns → Add Column → Merge Columns
- Choose a separator (e.g., hyphen)
- Result: “TKY-01”, “TKY-02”, “SHA-01” — now each row has a unique key
Exam tip: key validation
Before creating a relationship, verify your key is truly unique in the dimension table. Use Column Distribution (from the profiling module) — the distinct count should equal the row count.
If it doesn’t, you have duplicates. Common causes:
- Multiple entries for the same entity (fix: Remove Duplicates)
- The wrong column was chosen as the key
- You need a composite key instead of a single column
Knowledge check
Riley has a flat sales table with 2 million rows and 50 columns including ProductName, Category, StoreName, and Region. What's the best approach for building a Power BI model?
Dr. Ethan has a referrals table where no single column uniquely identifies a row. What should he create?
🎬 Video coming soon
Next up: Configure Data Loading — control how Power Query loads data into your model.