Medallion Architecture in Microsoft Fabric: Build Bronze to Gold

Medallion Architecture in Microsoft Fabric: Build Bronze to Gold
By Neetu Singla6 min read

Medallion architecture in Microsoft Fabric organizes data into three Delta Lake layers - bronze (raw), silver (cleansed), and gold (aggregated) - inside a Fabric Lakehouse stored on OneLake. Data Factory pipelines orchestrate each tier hop while Spark notebooks handle the transformations. The result is an auditable, restartable data foundation with ACID compliance and time-travel at every layer - exactly what healthcare and finance teams need before connecting Power BI to production data.

Key Takeaways

  • Bronze stores raw, immutable source copies; silver applies validation and deduplication; gold delivers aggregated, report-ready Delta tables.
  • OneLake unifies all three layers under one storage endpoint, eliminating cross-account data movement that complicated legacy Azure Data Lake Gen2 setups.
  • Data Factory pipeline templates with sequential notebook activities make each bronze-to-gold transition auditable and restartable without custom orchestration code.
  • Delta tables provide ACID transactions and time-travel at every tier - critical for HIPAA audit trails in healthcare and SOX traceability in finance.
  • A single parameterized pipeline template handles claims data, EHR feeds, general ledger transactions, and revenue cycle files with minimal code changes.

What Is Medallion Architecture in Microsoft Fabric?

Medallion architecture is a layered data design pattern that separates raw ingestion from curated analytics. In Microsoft Fabric, each layer lands as a Delta table inside a Fabric Lakehouse - all stored on OneLake, the single logical storage namespace that spans your entire Fabric tenant without requiring data copies between storage accounts or subscriptions.

The three tiers carry distinct contracts. Bronze preserves source data exactly as it arrived - immutable, append-only, zero transformation. Silver enforces business rules: schema validation, null handling, deduplication, and referential integrity checks. Gold reshapes data into the dimensional or flat structures that Power BI semantic models and business users consume directly. Because every layer is a Delta table, every write is transactional, every read is consistent, and any prior snapshot is queryable through Delta time-travel syntax - turning audit logging from a manual export task into a native platform feature.

According to Future Market Insights (2025), the AI consulting services market is projected to grow from USD 11.07 billion in 2025 to USD 90.99 billion by 2035 at a 26.2% CAGR. Organizations competing for that growth consistently report that ungoverned, untrusted data - not model quality - is the primary AI bottleneck. Medallion architecture resolves that blocker before it surfaces in a board presentation.

Healthcare and finance leaders exploring how governed data foundations enable clinical and financial AI use cases will find our AI analytics guide for healthcare finance teams a useful companion to this technical walkthrough.

How Do You Build Bronze, Silver, and Gold Delta Tables in a Fabric Lakehouse?

Illustration: Bronze Layer — Raw Ingestion
Illustration: Silver to Gold — Transformation and Aggregation

Building the medallion pattern inside a Fabric Lakehouse requires four components: a configured Fabric workspace, a Lakehouse with three table paths, a set of Spark notebooks, and a Data Factory pipeline to sequence them. The walkthrough below uses healthcare claims data as the example domain, though the pattern applies equally to finance GL, AR, and revenue forecasting workloads.

Step 1 - Create the Lakehouse and Define Layer Paths

In your Fabric workspace, create a single Lakehouse - for example, `enterprise_lh`. Under the Tables section, define three managed Delta table categories: `bronze_claims` for raw source records, `silver_claims` for validated records, and `gold_claims_summary` for payer-level aggregations. Using managed tables rather than unmanaged file paths lets the Fabric SQL analytics endpoint expose them as queryable views automatically, without additional catalog registration steps. This single-Lakehouse approach also means OneLake stores all three layers under one logical path, which simplifies governance policy application and reduces cross-workspace permission complexity.

Step 2 - Ingest to Bronze with a Copy Data Activity

Configure a Copy Data activity in Data Factory. Set the source connector to your origin system - SQL Server on-premises, SFTP file delivery, REST API, or HL7 FHIR endpoint for healthcare. Set the sink to Delta format at the bronze table path, with write mode set to append only. Bronze must never overwrite historical data - immutability at this layer is what makes downstream reprocessing safe and auditable. Inject two system columns into every record: `_ingestion_ts` (current UTC timestamp) and `_pipeline_run_id` (the Data Factory pipeline run ID), so every row is traceable to a specific pipeline execution for lineage reporting and debugging.

Step 3 - Cleanse to Silver via Spark Notebook

The silver notebook reads the bronze Delta table and applies your domain's business rules. The critical operation is a MERGE INTO on a natural key - the Delta Lake upsert pattern that handles late-arriving source records without creating duplicate rows. In a claims context, typical silver transformations include: filtering null `claim_id` values, casting `paid_amount` to `decimal(18,2)`, deduplicating on `(claim_id, service_date)`, and standardizing payer name strings to a controlled vocabulary. Document each business rule in the notebook cell that applies it, so transformation logic stays co-located with the code rather than drifting into a separate wiki that teams stop maintaining.

Step 4 - Aggregate to Gold for Reporting

Gold tables are purpose-built for a specific reporting need. Group by the dimensions your business users filter on - payer, facility, service month, cost center - and pre-aggregate the measures: total paid, claim count, average days-to-payment, denial rate. Name every column in plain business language rather than source system codes. Power BI connects to the gold layer through the Fabric SQL analytics endpoint or via Direct Lake mode - no scheduled data export or refresh is required.

For finance teams building revenue cycle or P&L dashboards on top of this layer, the reporting architecture in our Power BI financial dashboard guide for healthcare maps directly to gold table schemas and measure design.

How Do Data Factory Pipelines Orchestrate the Medallion Pattern?

A single Data Factory pipeline with three sequential Notebook activities is the recommended orchestration pattern. Each notebook activity receives `pipeline_run_id` as a parameter, ensuring all three layers share the same lineage identifier for audit queries and operational debugging.

The pipeline proceeds in four stages. A Copy Data activity lands source records to bronze in append mode; on failure, the pipeline halts before silver is touched, preventing corrupt data from propagating downstream. A Notebook activity labeled `bronze_to_silver` fires only on Copy success and applies the deduplication and MERGE logic. A second Notebook activity labeled `silver_to_gold` fires only on silver success, producing the final aggregated tables. A Set Variable activity closes the run by writing completion metadata - domain name, layer, run ID, row counts, and duration - to a `pipeline_control` Delta table that gives operations teams a queryable execution log without external monitoring tooling.

Configure the depends on condition for each activity to require the previous step's explicit success status. This ensures a data quality failure in silver automatically prevents a stale or partial gold table from reaching Power BI users - a safeguard that manually scripted ETL rarely implements consistently, and one that auditors in regulated industries will expect to see documented and tested.

Teams running azure synapse analytics to microsoft fabric migration projects will find this pipeline canvas familiar: Data Factory activities map one-to-one between platforms, and existing linked services for SQL Server or ADLS Gen2 carry over with connection string updates only. Our AI consulting guide for healthcare data analytics covers the broader migration strategy, including workspace governance and Fabric capacity planning for mid-market organizations.

Fabric Lakehouse vs Fabric Warehouse: Which Fits the Medallion Pattern?

Both Fabric compute surfaces can query Delta tables, but each suits a different medallion layer. The choice is not either-or: most mid-market teams run bronze and silver in the Fabric Lakehouse and expose gold tables as Warehouse shortcuts for SQL-first BI developers, combining the strengths of both surfaces without duplicating storage.

DimensionFabric LakehouseFabric Warehouse
Primary computeApache Spark + SQL analytics endpointT-SQL engine only
Schema enforcementFlexible (schema-on-read or -write)Enforced at write
Best medallion layerBronze and silver (ETL, streaming)Gold (BI queries, semantic models)
Delta table supportNative writer and readerRead via shortcuts; no native write
Streaming ingestionSpark Structured Streaming (native)Not supported
ACID complianceDelta Lake (full ACID)SQL engine (full ACID)
Power BI connectionVia SQL analytics endpointNative SQL endpoint
Ideal team profileData engineers, Python/SparkSQL analysts, BI developers

According to Market Research Future (2025), the Healthcare Financial Analytics Market is projected to grow at an 8.58% CAGR from 2025 to 2035. The organizations positioned to capture that growth are building flexible, multi-persona data platforms today - not single-stack solutions that force SQL analysts and Spark engineers into the same compute surface. The Lakehouse-Warehouse hybrid approach scales cleanly as both team size and data volume increase.

For teams weighing whether to build and maintain this architecture in-house or work with a specialist, our managed Power BI vs in-house BI team guide covers the cost and capability tradeoffs that apply equally to Fabric implementations.

How Should Healthcare and Finance Teams Structure Gold Tables for Compliance?

Gold tables are where medallion architecture meets regulatory requirements directly. Healthcare organizations subject to HIPAA need to demonstrate that every PHI data element is traceable to its source, access-controlled, and retained within policy limits. Finance teams under SOX need an immutable audit trail confirming that figures in board reports match source transactions exactly.

Delta Lake time-travel provides this trail natively, without a separate audit log system. Using `VERSION AS OF` or `TIMESTAMP AS OF` syntax, any analyst or auditor can query the exact state of a gold table at any prior timestamp. For HIPAA, this satisfies data lineage documentation requirements. For SOX, it supplies the version history that external auditors request during financial statement reviews - available on demand, not reconstructed from backup tapes.

Recommended gold table controls for regulated environments include:

  • Column-level security via Fabric workspace data access policies, restricting PHI columns by Entra ID role
  • Row-level security enforced at the SQL analytics endpoint, limiting facility-level data to authorized facility users
  • Delta log retention set to a minimum of 90 days via the `delta.logRetentionDuration` table property
  • Automated data classification tags applied during silver-to-gold transformation, before data reaches any reporting consumer

For healthcare analytics teams building HIPAA-compliant Power BI dashboards on top of these gold tables, the HIPAA-compliant analytics dashboard best practices checklist covers the reporting-layer controls that complement Lakehouse-level data security.

MedInsight (2025) identified AI-driven analytics as one of three dominant themes reshaping health system data strategy - a trend being built on governed, layered data architectures, not ad hoc SQL exports or point-in-time data pulls. Finance teams applying the same medallion pattern to GL, accounts receivable, and budgeting data will find reporting design guidance in our AI-powered Power BI consulting guide for finance teams.

Reusable Medallion Architecture Pattern Diagram for Microsoft Fabric

The diagram below represents the parameterized three-layer medallion pattern as deployed in a Fabric Lakehouse. Swapping the source connector and domain-specific business rules is all that is needed to reuse the same pipeline template across claims, EHR feeds, GL transactions, and revenue cycle data. Finance teams connecting the gold layer to monthly reporting workflows can pair this pattern with the approach in our monthly financial reporting automation guide for Power BI.

```

Source Systems (SQL Server / REST API / SFTP / HL7 FHIR / Event Hub)

|

v

[Data Factory - Copy Data Activity]

| append mode | injects _ingestion_ts and _pipeline_run_id

v

+---------------------------------------------+

BRONZE LAYER
bronze_[domain] (Delta table)
Raw, immutable, append-only
Partitioned by source_date

+---------------------------------------------+

|

v

[Notebook Activity: bronze_to_silver]

| MERGE INTO on natural key | schema enforcement | dedup

v

+---------------------------------------------+

SILVER LAYER
silver_[domain] (Delta table)
Cleansed, validated, deduplicated
Business rules applied

+---------------------------------------------+

|

v

[Notebook Activity: silver_to_gold]

| aggregation | business naming | security tags

v

+---------------------------------------------+

GOLD LAYER
gold_[domain]_summary (Delta table)
Report-ready, pre-aggregated
Time-travel enabled (90-day log retention)

+---------------------------------------------+

|

v

[Fabric SQL Analytics Endpoint / Direct Lake]

|

v

[Power BI Semantic Model / Reports]

```

Every pipeline run appends one row to a `pipeline_control` Delta table capturing domain, layer, run ID, row count, and duration. This control table gives operations teams - and compliance auditors - a queryable history of every medallion run without deploying external monitoring infrastructure.

If your team is ready to implement a governed medallion architecture on Microsoft Fabric - with Power BI semantic models connected directly to your gold layer - our Power BI and Fabric consulting practice designs and delivers this pattern for healthcare and finance clients across the US and Canada.

---

About Lets Viz: Lets Viz has delivered data analytics and BI consulting engagements for healthcare systems, revenue cycle management organizations, and finance teams across the US and Canada since 2020. Our practitioners hold Microsoft Fabric and Power BI certifications and have designed medallion architecture pipelines, HIPAA-compliant data models, and financial reporting solutions for mid-market clients navigating both regulatory compliance and AI readiness.

Frequently Asked Questions

The three layers are bronze (raw, immutable source data), silver (cleansed and deduplicated records), and gold (aggregated, report-ready tables). In Microsoft Fabric, each layer is stored as a Delta table inside a Fabric Lakehouse on OneLake. Bronze uses append-only writes, silver uses MERGE upserts on natural keys, and gold pre-aggregates data for direct Power BI consumption via the Fabric SQL analytics endpoint.

Related blogs

From Lets Viz

Ready to build your own finance dashboard?

We deliver Managed Power BI retainers for SaaS finance and ops teams — named analyst, change requests with a 2-business-day SLA, and automated refresh monitoring from $5K/mo.

Named analyst · 2-day SLA · From $5K/mo