
DP-700 Study Guide

Domain 1: Implement and Manage an Analytics Solution

  • Workspace Settings: Your Fabric Foundation
  • Version Control: Git in Fabric
  • Deployment Pipelines: Dev to Production
  • Access Controls: Who Gets In
  • Data Security: Control Who Sees What
  • Governance: Labels, Endorsement & Audit
  • Orchestration: Pick the Right Tool
  • Pipeline Patterns: Parameters & Expressions

Domain 2: Ingest and Transform Data

  • Delta Lake: The Heart of Fabric Free
  • Loading Patterns: Full, Incremental & Streaming Free
  • Dimensional Modeling: Prep for Analytics Free
  • Data Stores & Tools: Make the Right Choice Free
  • OneLake Shortcuts: Data Without Duplication
  • Mirroring: Real-Time Database Replication
  • PySpark Transformations: Code Your Pipeline
  • Transform Data with SQL & KQL
  • Eventstreams & Spark Streaming: Real-Time Ingestion
  • Real-Time Intelligence: KQL & Windowing

Domain 3: Monitor and Optimize an Analytics Solution

  • Monitoring & Alerts: Catch Problems Early
  • Troubleshoot Pipelines & Dataflows
  • Troubleshoot Notebooks & SQL
  • Troubleshoot Streaming & Shortcuts
  • Optimize Lakehouse Tables: Delta Tuning
  • Optimize Spark: Speed Up Your Code
  • Optimize Pipelines & Warehouses
  • Optimize Streaming: Real-Time Performance

Domain 3: Monitor and Optimize an Analytics Solution (~12 min read)

Troubleshoot Notebooks & SQL

Identify and resolve Spark notebook errors and T-SQL failures: OOM errors, data skew, schema mismatches, query timeouts, and debugging techniques.

Notebook errors

☕ Simple explanation

Think of a Spark notebook as a team of workers processing data.

Errors happen when: a worker runs out of desk space (OOM, out of memory), one worker gets all the heavy files while others sit idle (data skew), the input data doesn't match expectations (schema mismatch), or the instructions themselves are wrong (code error).

Spark notebook errors fall into categories: infrastructure (OOM, cluster startup, capacity), data (schema mismatch, corrupt files, null values), code (syntax errors, wrong column names, type mismatches), and performance (skew, shuffle, broadcast join thresholds). The Spark UI and notebook output cells provide the diagnostic information needed to identify root causes.

Common notebook errors

Read the error message carefully: Spark errors are verbose but informative.

  • java.lang.OutOfMemoryError
    Cause: dataset too large for driver/executor memory.
    Fix: increase pool size, reduce data with filters before collect(), avoid collect() on large DataFrames.
  • AnalysisException: cannot resolve column
    Cause: column name doesn't exist (typo or schema change).
    Fix: check column names with df.printSchema() and verify the source data.
  • Data skew (one task takes 10x longer than the rest)
    Cause: one partition key has far more data than the others.
    Fix: repartition the data, use the salting technique, or broadcast the smaller table.
  • Py4JJavaError wrapping NullPointerException
    Cause: null values in a column used for operations.
    Fix: filter nulls before processing, or use coalesce() / fillna().
  • SchemaConflictException on write
    Cause: DataFrame schema doesn't match the existing Delta table.
    Fix: use the mergeSchema option, or fix the DataFrame to match.
  • Cluster startup timeout
    Cause: no available capacity for Spark nodes.
    Fix: wait and retry, use the starter pool, or request a capacity increase.
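The salting fix for data skew can be sketched independently of Spark: append a random salt to the hot key so its rows spread across several buckets instead of all landing in one. A minimal plain-Python illustration of the idea (in a real notebook you would add the salt as a column with PySpark and repartition or join on the salted key):

```python
import random
from collections import Counter

def salted_key(key: str, num_salts: int, rng: random.Random) -> str:
    """Append a random salt (0..num_salts-1) so one hot key
    spreads over num_salts buckets instead of one."""
    return f"{key}_{rng.randrange(num_salts)}"

rng = random.Random(42)
# 10,000 rows all share the hot key "US": unsalted, every row
# would hash to the same partition and overload one task.
salted = [salted_key("US", 8, rng) for _ in range(10_000)]

buckets = Counter(salted)
print(len(buckets))            # 8 distinct salted keys
print(max(buckets.values()))   # largest bucket holds roughly 1/8 of the rows
```

In PySpark this corresponds to adding a salt column (for example with F.rand()), aggregating or joining on (key, salt), then aggregating once more on the bare key to merge the partial results.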
💡 Scenario: Carlos debugs an OOM error

Carlos's transformation notebook crashes with OutOfMemoryError on the driver. He investigates:

  1. The line that failed: result = df_500m_rows.collect(), which collects all 500M rows to the driver!
  2. Root cause: collect() pulls the entire distributed DataFrame into the single driver node's memory
  3. Fix: Replace collect() with .write.format("delta").save() to write directly to the lakehouse without pulling data to the driver

Rule: Never collect() large DataFrames. Write to Delta tables or use show(20) to preview.
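Carlos's fix generalizes beyond Spark: never materialize a full result set in one process's memory; stream it through a fixed-size buffer to a sink instead. A plain-Python sketch of the principle (row_source and the batch sink are stand-ins, not Fabric APIs):

```python
def row_source(n):
    """Stand-in for a large dataset: yields rows lazily, one at a time."""
    for i in range(n):
        yield {"id": i, "amount": i * 2}

# Anti-pattern (like collect()): all_rows = list(row_source(100_000))
# would hold every row in memory at once.

def write_in_batches(rows, batch_size=10_000):
    """Pattern (like writing to a Delta table): only batch_size rows
    are ever held in memory at the same time."""
    batch, written = [], 0
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            written += len(batch)   # a real sink would persist the batch here
            batch.clear()
    written += len(batch)           # flush the final partial batch
    return written

print(write_in_batches(row_source(100_000)))  # 100000
```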

Common T-SQL errors

  • Query timeout
    Cause: complex query exceeds the time limit.
    Fix: optimize the query (add WHERE filters, simplify joins) and check for missing statistics.
  • Insufficient permissions
    Cause: user lacks READ/WRITE on the table.
    Fix: grant appropriate permissions (ReadAll for queries, the Contributor role for writes).
  • Invalid object name
    Cause: table or view doesn't exist (typo or wrong schema).
    Fix: verify the object name and schema; use SELECT * FROM INFORMATION_SCHEMA.TABLES.
  • Data type conversion failed
    Cause: INSERT/UPDATE with incompatible types.
    Fix: cast data explicitly, e.g. CAST(column AS DECIMAL(10,2)).
  • Deadlock
    Cause: two queries blocking each other.
    Fix: review query execution plans, reduce transaction scope, retry with backoff.
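The deadlock remedy, retry with backoff, is a standard pattern worth sketching. A minimal Python version; run_query and DeadlockError are placeholders, since real code would catch the specific deadlock exception raised by your database driver:

```python
import random
import time

class DeadlockError(Exception):
    """Stand-in for the driver's deadlock exception."""

def run_with_backoff(run_query, max_attempts=5, base_delay=0.1):
    """Retry a query on deadlock, doubling the wait each attempt
    and adding jitter so concurrent retries don't collide again."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except DeadlockError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Demo: a query that deadlocks twice, then succeeds on the third try.
attempts = {"n": 0}
def flaky_query():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockError()
    return "ok"

print(run_with_backoff(flaky_query, base_delay=0.01))  # ok
```

The jitter matters: if both deadlocked queries retried after identical delays, they could keep colliding indefinitely.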

Debugging techniques

  • Spark UI (built into the notebook): investigate slow stages, data skew, and shuffle metrics.
  • df.printSchema() (PySpark): verify column names and types before operations.
  • df.show(5) (PySpark): preview data at each transformation step.
  • EXPLAIN (T-SQL / Spark SQL): view the query execution plan.
  • Cell-by-cell execution (notebook): isolate which transformation step fails.
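Reading an execution plan is the same skill on every engine, even though the syntax differs (Spark SQL uses EXPLAIN; T-SQL engines expose plans through their own commands and query editors). As a self-contained illustration, here is SQLite standing in for the warehouse: its EXPLAIN QUERY PLAN shows whether a filter forces a full table scan or can use an index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, f"c{i % 100}", i * 1.5) for i in range(1000)])

def plan(sql):
    """Return the query plan as one string."""
    return " ".join(str(row) for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Without an index, filtering on customer forces a full table scan.
before = plan("SELECT * FROM orders WHERE customer = 'c7'")
print(before)  # plan contains 'SCAN'

conn.execute("CREATE INDEX idx_customer ON orders (customer)")

# With the index, the plan switches to an index search.
after = plan("SELECT * FROM orders WHERE customer = 'c7'")
print(after)   # plan contains 'SEARCH ... USING INDEX idx_customer'
```

The diagnostic habit transfers directly: a scan over a large table on every query is the kind of finding that explains a warehouse query timeout.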

Question

What is the most common cause of OOM errors in Spark notebooks?


Answer

Using collect() on large DataFrames (pulling millions of rows to the single driver node), or processing very wide datasets without filtering first. Fix: write to Delta tables instead of collecting, filter early, increase pool memory.


Question

How do you diagnose data skew in a Spark notebook?


Answer

Check the Spark UI: look for tasks within a stage where one task takes much longer or processes much more data than the others. High shuffle read/write on one executor is a key indicator.



Knowledge Check

A PySpark notebook fails with 'AnalysisException: cannot resolve column order_total.' The DataFrame was loaded from a Delta table. What should the engineer check first?

Knowledge Check

A T-SQL query in a Fabric warehouse times out after 10 minutes. The query joins two large tables without WHERE filters. What is the best first step?


Next up: Troubleshoot Streaming & Shortcuts, resolving Eventhouse, Eventstream, and OneLake shortcut errors.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.