Row Filters, Column Masks & Retention
Dynamically filter rows and mask columns based on who's querying, plus manage data retention policies — fine-grained governance that the exam tests heavily.
Dynamic data protection
Row filters and column masks are like smart glass walls in an office.
A row filter is a wall that only lets certain people into certain rooms. Analysts in APAC can only see APAC data — the EMEA rooms are invisible to them.
A column mask is like frosted glass on sensitive windows. Everyone can see the room (table), but certain details (columns) are blurred. An analyst sees ”****@email.com” instead of the real email address.
The clever part: both work dynamically. The same table shows different data to different people — no need to create separate copies.
Row filters
Row filters restrict which rows a user can see:
-- Create a row filter function
CREATE OR REPLACE FUNCTION prod_sales.functions.region_row_filter(region_value STRING)
RETURN
IS_ACCOUNT_GROUP_MEMBER('global-admins') -- admins see all
OR region_value = CURRENT_USER_ATTRIBUTE('region'); -- others see their region only
-- Apply the filter to a table
ALTER TABLE prod_sales.curated.daily_revenue
SET ROW FILTER prod_sales.functions.region_row_filter ON (region);
What happens at query time:
| User | Group | Query Result |
|---|---|---|
| Ravi (APAC) | data-engineers | Only APAC rows |
| Analyst (EMEA) | bi-analysts | Only EMEA rows |
| Sarah (admin) | global-admins | All rows |
The filter is invisible to users — they run SELECT * FROM daily_revenue and get only their rows. No WHERE clause needed.
Removing a row filter
ALTER TABLE prod_sales.curated.daily_revenue DROP ROW FILTER;
Column masks
Column masks transform what value a user sees in a specific column:
-- Create a column mask function
CREATE OR REPLACE FUNCTION prod_sales.functions.email_mask(email_value STRING)
RETURN
CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('pii-readers') THEN email_value -- full email
ELSE CONCAT(LEFT(email_value, 2), '****@', SPLIT(email_value, '@')[1]) -- masked
END;
-- Apply the mask to a column
ALTER TABLE prod_sales.curated.customers
ALTER COLUMN email SET MASK prod_sales.functions.email_mask;
What users see:
| User | Group | email Column Value |
|---|---|---|
| Mei Lin | pii-readers | john.smith@freshmart.com |
| BI analyst | bi-analysts | jo****@freshmart.com |
-- More masking patterns
-- Hash the value (useful for joins without revealing data)
CREATE FUNCTION hash_mask(val STRING) RETURN SHA2(val, 256);
-- Show NULL instead of the value
CREATE FUNCTION null_mask(val STRING) RETURN NULL;
-- Show last 4 digits only (credit cards)
CREATE FUNCTION card_mask(val STRING) RETURN CONCAT('****-****-****-', RIGHT(val, 4));
| Feature | Row Filter | Column Mask |
|---|---|---|
| Controls | Which rows are visible | What value a column shows |
| Applied to | Table (one filter per table) | Column (one mask per column) |
| Function returns | BOOLEAN (TRUE = visible) | Transformed column value |
| User awareness | Invisible — filtered rows don't appear | Visible — masked values appear different |
| Use case | Regional data isolation, tenant separation | PII protection, partial redaction |
| Remove | ALTER TABLE ... DROP ROW FILTER | ALTER TABLE ... ALTER COLUMN ... DROP MASK |
Exam tip: Row filters vs column-level access
Don’t confuse row filters with column-level GRANT:
- Column-level GRANT (Module 6): restricts which columns a user can SELECT. If they try to access a restricted column, they get an error.
- Column mask: lets the user SELECT the column but transforms the value (redaction, hashing, partial masking). No error — just a different value.
If the exam says “hide the column entirely” → column-level GRANT. If it says “show a masked/redacted version” → column mask.
Data retention policies
Data retention controls how long data is kept before it must be deleted or archived:
Delta Lake retention settings
-- Set table-level retention for VACUUM (default: 7 days)
ALTER TABLE prod_sales.curated.daily_revenue
SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 30 days');
-- Set log retention (how long the transaction log keeps history)
ALTER TABLE prod_sales.curated.daily_revenue
SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 90 days');
| Property | Default | Purpose |
|---|---|---|
delta.deletedFileRetentionDuration | 7 days | How long deleted data files are kept before VACUUM removes them |
delta.logRetentionDuration | 30 days | How long the transaction log (time travel history) is preserved |
Retention and compliance
At NovaPay, Tomás must comply with financial regulations requiring 7-year data retention:
- Log retention: Set to 7 years (allows time travel queries for audit)
- File retention: Set high enough that VACUUM doesn’t delete files needed for compliance
- Governance tags: Tag tables with
retention_years=7for audit tracking
VACUUM and retention interaction
VACUUM removes old data files that are no longer referenced by the current table version. The deletedFileRetentionDuration controls the safety window:
- Files newer than the retention period are never deleted by VACUUM
- This protects time travel and concurrent reads
- Setting retention too low risks deleting files that active queries still need
Exam tip: The default 7-day retention is safe for most workloads. Lowering it below 7 days requires explicitly disabling the safety check — and the exam considers this a red flag.
🎬 Video coming soon
Knowledge check
Tomás needs to protect customer email addresses in NovaPay's fraud investigation table. Fraud analysts should see partial emails (first 2 characters + masked domain), while compliance officers should see full emails. Which approach should he use?
Mei Lin's compliance team at Freshmart requires that deleted transaction data files must be recoverable for at least 90 days after deletion (for audit purposes). Which Delta Lake property should she configure?
Next up: Lineage, Audit Logs & Delta Sharing — track data flow, log access, and share data securely across organisations.