Combining Data: Merge, Append & Queries

Bringing data together

Simple explanation

Think of two puzzle pieces.

Merge is like connecting two pieces side by side — matching them on a shared edge (a common column). You get a wider table. This is like a SQL JOIN.

Append is like stacking one piece on top of another — both have the same shape (same columns), and you just add more rows. This is like a SQL UNION.

This module covers both operations, plus how to handle messy JSON/XML data and when to use reference vs duplicate queries.

Merge Queries: combining tables side by side

Riley at Coastal Fresh (🛒) has two tables:

Sales table: OrderID, ProductID, Quantity, Revenue Products table: ProductID, ProductName, Category, Supplier

She needs product names on her sales data. She merges Sales with Products on the ProductID column — this adds product details to each sales row.

The six join types

Left Outer is the default and most common join type in Power Query
Join Type	What It Returns	SQL Equivalent
Left Outer	All rows from the first table, matched rows from second (nulls where no match)	LEFT JOIN
Right Outer	All rows from second table, matched rows from first	RIGHT JOIN
Inner	Only rows that match in both tables	INNER JOIN
Full Outer	All rows from both tables (nulls where no match on either side)	FULL OUTER JOIN
Left Anti	Only rows from first table that have NO match in second	LEFT JOIN WHERE right IS NULL
Right Anti	Only rows from second table that have NO match in first	RIGHT JOIN WHERE left IS NULL

How to Merge:

Home → Merge Queries (or Merge Queries as New)
Select the first table and key column
Select the second table and matching column
Choose join type (Left Outer is default)
Expand the resulting nested column to select which fields to include

Exam tip: anti joins

Left Anti and Right Anti joins are exam favourites. They find rows that DON’T have a match.

Example: “Find all products that have never been sold.” Merge the Products table (first) with the Sales table (second) using a Left Anti join on ProductID. The result contains only products with zero sales.

Append Queries: stacking tables on top

Nadia at Prism Agency (📊) receives campaign data from Google Ads, Meta Ads, and LinkedIn — three separate tables with identical column structures. She appends them into a single unified campaign table.

Append = same columns, more rows.

How to Append:

Home → Append Queries (or Append Queries as New)
Choose “Two tables” or “Three or more tables”
Select the tables to stack
Columns are matched by name — extra columns become null

Merge vs Append at a glance:

Merge: Different columns, same key → wider table
Append: Same columns, different data → taller table

Semi-structured data: JSON and XML

Dr. Ethan Moss at Bayview Medical Centre (🏥) connects to a REST API that returns patient appointment data in JSON format. The raw import shows a single column of nested records.

Converting JSON to a table:

Power Query imports JSON as a single column of Record or List values
Click the expand icon (↔) on the column header
Select which fields to include
Repeat for nested objects

Key M functions for semi-structured data:

Json.Document() — parse JSON text into records/lists
Table.FromRecords() — convert a list of records into a table
Record.FieldValues() — extract values from a record

Real-world: expanding nested JSON

API responses often have nested structures. For example, a patient record might contain a name field and an appointments array, where each appointment has a date and doctor name.

You expand the top-level record first to get the name, then expand the appointments list, which creates new rows — one per appointment per patient. This row multiplication is normal when expanding nested lists.

Reference vs Duplicate queries

When you need to create a variation of an existing query, you have two options:

Reference reuses work; Duplicate starts fresh
Feature	Reference Query	Duplicate Query
How it works	Points back to the original query's output as its starting point	Creates an independent copy of all steps from the original
Dependency	Depends on the original — if original changes, reference updates too	Independent — changes to original don't affect the duplicate
Processing	Original runs once, reference builds on its result	Both queries run their full step sequences independently
Use when	You want to branch from a common base (e.g., filter the same cleaned data differently)	You want a completely independent copy to modify freely
Performance	More efficient — avoids re-processing shared steps	Less efficient — duplicates all processing work

Example: Kenji at Apex Manufacturing (🏭) has a clean FactoryData query. He creates:

A Reference query called TokyoFactory that filters to Tokyo
A Reference query called ShanghaiFactory that filters to Shanghai

Both reference queries share the same cleaning steps from FactoryData. If he adds a new cleaning step to the original, both references benefit automatically.

Knowledge check

Question

What's the difference between Merge and Append?

Click or press Enter to reveal answer

Answer

Merge combines columns from two tables based on a matching key (like SQL JOIN — makes tables wider). Append stacks rows from tables with the same columns (like SQL UNION — makes tables taller).

Click to flip back

Question

What join type finds rows that DON'T have a match?

Click or press Enter to reveal answer

Answer

Anti joins. Left Anti returns rows from the first table with no match in the second. Right Anti returns rows from the second table with no match in the first.

Click to flip back

Question

What's the difference between a Reference query and a Duplicate query?

Click or press Enter to reveal answer

Answer

Reference points back to the original query's output (dependent, efficient). Duplicate creates an independent copy of all steps (independent, less efficient). Use Reference when branching from a shared base.

Click to flip back

Knowledge Check

Riley has a Products table and a Sales table. She wants to find all products that have never been sold. Which operation should she use?

Knowledge Check

Nadia needs to create two filtered versions of a cleaned campaign dataset — one for Google Ads and one for Meta Ads. She wants changes to the shared cleaning steps to apply to both. What should she use?

Next up: Fact Tables, Dimension Tables and Keys — design the right table structure for Power BI.