
DP-700 Study Guide

Domain 1: Implement and Manage an Analytics Solution

  • Workspace Settings: Your Fabric Foundation
  • Version Control: Git in Fabric
  • Deployment Pipelines: Dev to Production
  • Access Controls: Who Gets In
  • Data Security: Control Who Sees What
  • Governance: Labels, Endorsement & Audit
  • Orchestration: Pick the Right Tool
  • Pipeline Patterns: Parameters & Expressions

Domain 2: Ingest and Transform Data

  • Delta Lake: The Heart of Fabric Free
  • Loading Patterns: Full, Incremental & Streaming Free
  • Dimensional Modeling: Prep for Analytics Free
  • Data Stores & Tools: Make the Right Choice Free
  • OneLake Shortcuts: Data Without Duplication
  • Mirroring: Real-Time Database Replication
  • PySpark Transformations: Code Your Pipeline
  • Transform Data with SQL & KQL
  • Eventstreams & Spark Streaming: Real-Time Ingestion
  • Real-Time Intelligence: KQL & Windowing

Domain 3: Monitor and Optimize an Analytics Solution

  • Monitoring & Alerts: Catch Problems Early
  • Troubleshoot Pipelines & Dataflows
  • Troubleshoot Notebooks & SQL
  • Troubleshoot Streaming & Shortcuts
  • Optimize Lakehouse Tables: Delta Tuning
  • Optimize Spark: Speed Up Your Code
  • Optimize Pipelines & Warehouses
  • Optimize Streaming: Real-Time Performance

Domain 1: Implement and Manage an Analytics Solution (~14 min read)

Orchestration: Pick the Right Tool

Choose between Dataflows Gen2, pipelines, and notebooks for data orchestration. Design schedules and event-based triggers to automate your workflows.

Three tools, three use cases

β˜• Simple explanation

Think of three ways to get to work.

Walking (Dataflows Gen2) β€” simple, visual, no special skills needed. Perfect for short trips. You see every step clearly.

Driving (Pipelines) β€” more powerful, handles complex routes, can carry passengers (other activities). But you need to know the roads.

Flying (Notebooks) β€” maximum power and flexibility. Go anywhere, do anything. But you need a pilot’s licence (coding skills).

The exam tests whether you know which one to pick for a given scenario. The answer is usually the simplest tool that gets the job done.

Fabric provides three primary orchestration tools: Dataflows Gen2 (visual ETL using Power Query), Data Pipelines (workflow orchestration with activities), and Notebooks (code-first processing with PySpark, Spark SQL, Scala, and R). Each has different strengths, and the exam expects you to choose the right tool for a given scenario.

In practice, most solutions combine all three β€” a pipeline orchestrates a dataflow that cleans data, then triggers a notebook that transforms it at scale.

The decision framework

Choose the simplest tool that meets the requirement
| Factor | Dataflows Gen2 | Pipelines | Notebooks |
|---|---|---|---|
| Interface | Visual (Power Query drag-and-drop) | Visual (canvas with activities) + JSON | Code (PySpark, SQL, Scala, R) |
| Best for | Simple data cleaning and shaping from 150+ connectors | Orchestrating multiple activities (copy, dataflow, notebook, stored proc) | Complex transformations, ML, custom logic, large-scale processing |
| Coding required? | No β€” M language generated automatically | Minimal β€” expressions and parameters | Yes β€” PySpark, SQL, or Scala |
| Scale | Small to medium datasets (Power Query engine) | Orchestrates at any scale (delegates to other engines) | Large datasets (distributed Spark processing) |
| Error handling | Basic retry on refresh failure | Rich β€” retry, conditional paths, failure activities, alerts | Custom β€” try/except in code, notifications |
| Scheduling | Built-in refresh schedule | Triggers: schedule, tumbling window, event-based | Built-in schedule, via pipeline (notebook activity), or manual run |
| Output destinations | Lakehouse, warehouse, KQL database | No output itself β€” orchestrates other tools that produce output | Lakehouse (Delta tables), warehouse, files |
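The table's guiding rule β€” pick the simplest tool that meets the requirement β€” can be sketched as a tiny decision helper. This is an illustrative heuristic only, not a Fabric API; the function name and flags are made up for this example.

```python
def pick_tool(needs_code: bool, large_scale: bool, multi_step: bool) -> str:
    """Illustrative heuristic mirroring the decision table (not a real Fabric API)."""
    if multi_step:
        return "Pipeline"        # orchestrates copy, dataflow, notebook activities
    if needs_code or large_scale:
        return "Notebook"        # distributed Spark, custom logic, ML libraries
    return "Dataflows Gen2"      # visual Power Query ETL, no code required

# Simple cleaning job, small data, single step β†’ the simplest tool wins
print(pick_tool(needs_code=False, large_scale=False, multi_step=False))
```

Note the ordering: orchestration needs are checked first, because a pipeline can contain the other two as activities.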

When to use what β€” exam decision patterns

| Scenario | Best Tool | Why |
|---|---|---|
| Load CSV from blob storage, clean column names, filter rows, write to lakehouse | Dataflows Gen2 | Simple ETL, no code needed, Power Query handles it |
| Run a dataflow, then a notebook, then refresh a semantic model β€” with retry on failure | Pipeline | Multi-step orchestration with error handling |
| Join 500M rows across three Delta tables, calculate running averages, write to warehouse | Notebook | Scale + complex logic needs distributed Spark |
| Copy data from Azure SQL to lakehouse (no transformation) | Pipeline (Copy activity) | Pure data movement β€” no transformation needed |
| Transform data using stored procedures in a warehouse | Pipeline (Stored Procedure activity) | Calls existing SQL logic without a notebook |
| Apply machine learning model to incoming data | Notebook | ML libraries (scikit-learn, MLflow) only available in code |
πŸ’‘ Scenario: Carlos's orchestration design

Carlos at Precision Manufacturing needs to load daily production data:

  1. Copy raw CSV files from an SFTP server to the lakehouse β†’ Pipeline Copy activity
  2. Clean column names, filter invalid records, standardise date formats β†’ Dataflows Gen2 (visual, quick)
  3. Transform β€” join with dimension tables, calculate defect rates, build fact table β†’ Notebook (500M rows, complex joins)
  4. Refresh the Power BI semantic model β†’ Pipeline (semantic model refresh activity)

Carlos wraps steps 1-4 in a single Pipeline that orchestrates all the activities in sequence, with retry logic on the copy activity and an email alert if the notebook fails.
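Step 3 of Carlos's design β€” join with a dimension table and calculate defect rates β€” would run as PySpark in a Fabric notebook. The logic itself is simple enough to sketch in plain Python; the table names and fields below are hypothetical stand-ins for his production data.

```python
# Hypothetical miniature of Carlos's step 3: join production facts with a
# dimension lookup and derive a defect-rate measure (plain Python here; a
# real Fabric notebook would do this with PySpark over Delta tables).
production = [
    {"line_id": 1, "units": 1000, "defects": 12},
    {"line_id": 2, "units": 500,  "defects": 3},
]
lines = {1: "Stamping", 2: "Assembly"}   # dimension table as a lookup dict

fact_rows = [
    {
        "line_name": lines[r["line_id"]],          # the join step
        "defect_rate": r["defects"] / r["units"],  # the calculated measure
    }
    for r in production
]
print(fact_rows[0])  # {'line_name': 'Stamping', 'defect_rate': 0.012}
```

At 500M rows the same join-then-derive shape applies, but Spark distributes it across executors instead of looping in one process.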

Schedules and triggers

Once you’ve built your orchestration, you need to make it run automatically.

Trigger types

| Trigger Type | How It Works | Best For |
|---|---|---|
| Schedule | Runs at fixed intervals (every 6 hours, daily at 3 AM, every Monday) | Regular batch processing on a predictable cadence |
| Tumbling window | Like schedule, but windows don't overlap and catch up on missed runs | Time-partitioned data loads (process yesterday's data) |
| Event-based | Fires when something happens β€” new file in storage, message in Event Hub | Real-time or near-real-time ingestion |
| On-demand | Manual trigger or API call | Testing, ad-hoc runs, CI/CD-triggered deployments |
Schedule for regular jobs, tumbling window for catch-up, event-based for real-time
| Feature | Schedule Trigger | Tumbling Window | Event-Based |
|---|---|---|---|
| Runs on | Fixed clock times | Fixed intervals, catches up on missed | External event (file arrival, message) |
| Overlap possible? | Yes β€” if previous run hasn't finished | No β€” windows don't overlap | N/A β€” each event triggers one run |
| Backfill? | No β€” missed runs are skipped | Yes β€” runs for each missed window | No β€” only fires on new events |
| Typical use | Daily refresh at 3 AM | Process data for each hour, catching up after downtime | New file in ADLS triggers ingestion immediately |
πŸ’‘ Exam tip: Tumbling window vs schedule

The exam often presents a scenario where a pipeline missed runs during a capacity outage. The question: β€œHow do you ensure all missed time windows are processed?”

Answer: Tumbling window trigger. Unlike a schedule trigger (which skips missed runs), a tumbling window trigger keeps track of each window and catches up on any that were missed.

Pattern: β€œGuaranteed processing of every time window” β†’ tumbling window.
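The mechanics behind that catch-up behaviour can be sketched in a few lines: because each tumbling window has a fixed start and size, any missed windows can be enumerated and re-run. The function below is an illustrative model of that bookkeeping, not Fabric's actual implementation.

```python
from datetime import datetime, timedelta

def missed_windows(last_processed: datetime, now: datetime, size: timedelta):
    """Yield the start of each complete window between last_processed and now.

    Illustrates why tumbling windows can backfill: every window is tracked,
    so missed ones are enumerable (a schedule trigger keeps no such ledger).
    """
    start = last_processed + size
    while start + size <= now:   # only windows that have fully elapsed
        yield start
        start += size

# Daily windows; Saturday and Sunday runs were missed during a capacity pause
backlog = list(missed_windows(datetime(2026, 1, 2), datetime(2026, 1, 5),
                              timedelta(days=1)))
print(backlog)  # windows for 3 Jan and 4 Jan β€” both get reprocessed
```

A schedule trigger, by contrast, only knows "next fire time", which is why its missed runs simply vanish.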

ℹ️ Scenario: Anika's event-driven pipeline

ShopStream receives order data as JSON files dropped into Azure Blob Storage by the payment gateway. Anika configures an event-based trigger:

  • Event: New blob created in orders/incoming/ container
  • Action: Pipeline starts β†’ Copy activity moves the file to the lakehouse β†’ Notebook parses JSON, validates, and appends to the orders Delta table

Orders appear in the analytics dashboard within 5 minutes of payment. No scheduled polling β€” the pipeline runs only when there’s work to do.
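The validate-then-append step in Anika's notebook can be sketched in plain Python. The field names and the in-memory list standing in for the orders Delta table are hypothetical; a real notebook would parse the blob and append via Spark.

```python
import json

def handle_new_blob(raw: str, orders: list) -> bool:
    """Hypothetical validation step from Anika's notebook: parse a JSON order
    payload and append it only if the required fields are present."""
    order = json.loads(raw)
    if not {"order_id", "amount"} <= order.keys():
        return False          # reject invalid payloads (no append)
    orders.append(order)      # stands in for appending to the orders Delta table
    return True

orders = []
handle_new_blob('{"order_id": 1, "amount": 9.50}', orders)   # accepted
handle_new_blob('{"amount": 9.50}', orders)                  # rejected, no order_id
print(len(orders))  # 1
```

Each blob-created event drives exactly one invocation, which is the whole appeal of the pattern: no polling loop, no wasted scheduled runs.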


Question

When should you use a Dataflow Gen2 instead of a notebook?


Answer

When the transformation is simple (cleaning, filtering, renaming, basic joins), the dataset is small-to-medium, and you want a visual, no-code experience. Notebooks are for complex logic, large-scale data, or when you need code libraries (ML, custom functions).


Question

What is the difference between a schedule trigger and a tumbling window trigger?


Answer

Schedule: runs at fixed times, skips missed runs. Tumbling window: runs at fixed intervals, catches up on any missed windows (guaranteed processing of every time period). Use tumbling window when you need backfill capability.


Question

What is an event-based trigger in Fabric?


Answer

A trigger that fires when an external event occurs β€” typically a new file arriving in Azure Blob Storage or a message in an Event Hub. The pipeline runs automatically in response to the event, without polling or scheduling.


Question

Can a pipeline contain a dataflow AND a notebook?


Answer

Yes. Pipelines are orchestrators β€” they can contain Dataflow Gen2 activities, Notebook activities, Copy activities, Stored Procedure activities, and more. A typical pattern: Copy β†’ Dataflow (clean) β†’ Notebook (transform) β†’ Semantic model refresh.



Knowledge Check

A data engineer needs to load 800 million rows from three Delta tables, calculate rolling 7-day averages, and write results to a warehouse. Which tool should they use?

Knowledge Check

Carlos's pipeline runs on a daily schedule at 3 AM. Over the weekend, the Fabric capacity was paused for maintenance, and Saturday and Sunday runs were missed. On Monday, the pipeline runs once. How many days of data were processed?


Next up: Pipeline Patterns: Parameters & Expressions β€” make your orchestration reusable with dynamic expressions and parameterised pipelines.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.