Analytical Workloads: Finding the Insights
Transactional systems record what happens. Analytical systems answer why it happened and what to do next. Let's explore the world of OLAP, data warehouses, and reporting.
What is an analytical workload?
Analytical workloads answer big-picture questions by summarising and comparing large volumes of data.
Imagine Priya at FreshMart. The point-of-sale system records every transaction (that’s transactional). But Priya’s job is different — she asks questions like “Which stores had the highest sales last quarter?” or “Are organic products growing faster than conventional?”
To answer those questions, she doesn’t need individual receipts. She needs summaries, trends, and comparisons across millions of records. That’s an analytical workload.
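A quick way to feel the difference: an analytical query ignores individual receipts and returns only aggregates. Here's a minimal sketch using Python's built-in sqlite3 module, with made-up FreshMart sales figures standing in for millions of real records:

```python
import sqlite3

# In-memory database standing in for FreshMart's warehouse (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Downtown", "organic", 12.50), ("Downtown", "conventional", 8.00),
     ("Riverside", "organic", 20.00), ("Riverside", "organic", 5.25)],
)

# An analytical query: no individual receipt matters, only the summary.
rows = conn.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY 2 DESC"
).fetchall()
print(rows)  # [('Riverside', 25.25), ('Downtown', 20.5)]
```

The same shape of query (a SELECT with GROUP BY and an aggregate like SUM) answers Priya's "highest sales last quarter" question, just over far more rows.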
How analytical workloads differ from transactional
| Feature | Transactional (OLTP) | Analytical (OLAP) |
|---|---|---|
| Question type | “What just happened?” | “What happened over time?” |
| Data volume per query | Single records or small batches | Millions to billions of records |
| Operations | INSERT, UPDATE, DELETE | SELECT with aggregations (SUM, AVG, COUNT) |
| Data freshness | Real-time (current state) | Near-real-time to periodic (often loaded in batches) |
| Users | Operational staff (cashiers, dispatchers) | Analysts, managers, executives |
| Schema design | Normalised (reduce duplication) | Denormalised (optimise for reading) |
| Azure services | Azure SQL Database | Microsoft Fabric, Azure Databricks, Power BI |
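The contrast in the table shows up directly in code. In this toy sketch (again sqlite3, with an invented orders table), the OLTP operation touches one row by its key, while the OLAP query scans the whole table to produce an aggregate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "shipped", 40.0), (2, "pending", 15.0), (3, "pending", 25.0)])

# OLTP: update one record by key, keeping the current state correct.
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 2")

# OLAP: scan every row to summarise across the whole table.
avg_total = conn.execute("SELECT AVG(total) FROM orders").fetchone()[0]
print(avg_total)
```

On three rows the difference is invisible; on billions of rows it drives the schema and service choices in the table above.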
The analytical pipeline
Data doesn’t start in an analytical system — it flows there from transactional sources through a pipeline:
- Source systems — Transactional databases, CRM, ERP, IoT devices, logs
- Ingestion — Extract data from sources (ETL or ELT)
- Storage — Land data in a data warehouse or data lake
- Modelling — Organise data into dimensions and measures (star schema)
- Visualisation — Build dashboards and reports (Power BI)
Priya’s pipeline at FreshMart:
1. Sales data flows from 50 store POS systems
2. Azure Data Factory extracts and transforms it nightly
3. Data lands in a Fabric data warehouse
4. Priya builds a star schema (stores, products, dates, sales amounts)
5. Power BI dashboards show trends to regional managers
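The star schema in Priya's modelling step can be sketched as one fact table surrounded by dimension tables. A minimal version (sqlite3, invented sample data): the dimensions describe the who/what/where, and the fact table holds the measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimension tables carry descriptive attributes; the fact table carries numbers.
conn.executescript("""
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales  (store_id INTEGER, product_id INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO dim_store VALUES (?, ?)", [(1, "North"), (2, "South")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(10, "organic"), (11, "conventional")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 10, 100.0), (1, 11, 50.0), (2, 10, 75.0)])

# The report shape Priya's dashboard needs: a measure sliced by dimensions.
rows = conn.execute("""
    SELECT s.region, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_store s ON s.store_id = f.store_id
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY s.region, p.category
    ORDER BY s.region, p.category
""").fetchall()
print(rows)
```

This denormalised, read-optimised layout is why the earlier table lists star schemas on the analytical side.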
ETL vs ELT — what's the difference?
Both are ways to move data from source to analytical storage:
- ETL (Extract, Transform, Load): Extract data, transform it (clean, reshape, aggregate) in a separate engine, then load the result into the warehouse. Traditional approach.
- ELT (Extract, Load, Transform): Extract data, load it raw into the data lake or warehouse, then transform it inside the storage system using its compute power. Modern approach — especially with Fabric and Databricks.
Key difference: In ELT, the raw data is preserved in the lake, so you can re-transform it later if business requirements change. In ETL, only the transformed result is stored.
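The ordering difference can be sketched in a few lines of plain Python (hypothetical records, with a toy transform that cleans names and parses amounts):

```python
# Toy records extracted from a source system (hypothetical).
raw = [{"store": "Downtown", "amount": "12.50"},
       {"store": "downtown", "amount": "8.00"}]

def transform(records):
    # Clean: normalise store names, parse amounts into numbers.
    return [{"store": r["store"].title(), "amount": float(r["amount"])}
            for r in records]

# ETL: transform first, so the warehouse only ever stores the cleaned result.
warehouse_etl = transform(raw)

# ELT: land the raw data first, then transform inside the platform.
lake = list(raw)                  # raw copy is preserved as-is
warehouse_elt = transform(lake)   # can be re-run later with new rules

print(warehouse_etl == warehouse_elt)
```

Both paths produce the same cleaned result here; the practical difference is that in ELT the untouched raw copy stays in the lake, ready to be re-transformed when requirements change.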
Key concepts in analytical workloads
Data warehouses
A data warehouse is a centralised repository of structured, historical data optimised for analytical queries. Data is cleaned, transformed, and organised before loading.
Data lakes
A data lake stores raw data in its original format — structured, semi-structured, and unstructured. It’s a landing zone for everything. Analytics tools query the lake directly.
Lakehouses
A lakehouse combines the best of both — the raw storage of a data lake with the query performance and structure of a data warehouse. Microsoft Fabric uses this model.
Exam tip: OLAP scenario recognition
Look for these signals in exam questions:
- “Management wants a report on last quarter’s performance” → Analytical
- “The dashboard shows sales trends by region” → Analytical
- “Data is aggregated from multiple source systems” → Analytical
- “A star schema organises data for reporting” → Analytical
If the scenario mentions “recording a sale,” “placing an order,” or “updating a record” — that’s transactional.
Knowledge check
Priya needs to create a monthly dashboard showing total sales by store, product category, and region for the past 12 months. What type of workload is this?
A company loads raw sales data into a data lake first, then uses Fabric to clean, transform, and model it for reporting. What is this approach called?
Next up: Data Roles: DBA, Engineer & Analyst — meet the people who build and manage these data systems.