-Anas Lahmamsi from LinkedIn
Medallion:
When built on a transactional storage layer, this architecture preserves atomicity, consistency, isolation, and durability (ACID) as data moves through multiple layers of validation and transformation before being stored in a format optimized for efficient analytics.
The terms Bronze (raw), Silver (validated), and Gold (enriched) describe the level of data quality and refinement at each stage.
- Bronze contains raw ingested data
- Silver contains cleaned and validated data
- Gold contains enriched, business-ready datasets
This layered approach improves data reliability, traceability, and analytical performance.
Bronze Layer: Raw Ingested Data
The Bronze layer ingests raw data directly from source systems, such as:
- Cloud storage (files, logs, exports)
- Streaming platforms (Kafka, Event Hubs)
- Databases
- APIs
Data in the Bronze layer is stored as-is, usually with little or no cleaning or validation.
The main objective is to preserve the original data and maintain a reliable historical record for downstream processing.
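As a rough illustration of "stored as-is", the sketch below appends each incoming payload to a Bronze store unchanged and only adds lineage metadata around it. The in-memory list, the function name `ingest_to_bronze`, and the `source` and `ingested_at` fields are all hypothetical choices for this example; a real pipeline would write to files or tables instead.

```python
from datetime import datetime, timezone


def ingest_to_bronze(raw_payloads, source, bronze_store):
    """Append raw payloads to the Bronze store without altering them."""
    for payload in raw_payloads:
        bronze_store.append({
            "raw": payload,  # original payload, exactly as received
            "source": source,  # hypothetical lineage field
            "ingested_at": datetime.now(timezone.utc).isoformat(),
        })
    return bronze_store


# Hypothetical payloads: valid, incomplete, and malformed records are all kept.
bronze = []
ingest_to_bronze(
    [
        '{"order_id": 1, "amount": "19.90"}',
        '{"order_id": 2, "amount": null}',
        "not valid json",
    ],
    source="orders_api",
    bronze_store=bronze,
)
print(len(bronze))
```

Note that the malformed payload is kept too: rejecting it is a Silver-layer concern, and keeping it means the historical record stays complete.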
Silver Layer: Cleaned and Validated Data
The Silver layer is where data cleaning and validation take place.
In this layer, raw datasets from the Bronze layer are processed, standardized, and validated to improve data quality and reliability.
Typical transformations include:
- Removing null or invalid values
- Standardizing formats
- Deduplicating records
- Applying business validation rules
- Joining related datasets
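The transformations listed above can be sketched in plain Python. This is a minimal example under stated assumptions, not a reference implementation: the order and customer fields, the `DD/MM/YYYY` input date format, and the "amount must be positive" rule are all invented for illustration.

```python
from datetime import datetime


def to_silver(bronze_orders, customers_by_id):
    """Clean, validate, deduplicate, and enrich raw order records."""
    silver = []
    seen_ids = set()
    for row in bronze_orders:
        order_id = row.get("order_id")
        amount = row.get("amount")
        # Remove null or invalid values.
        if order_id is None or amount is None:
            continue
        try:
            # Standardize formats: numeric amount, ISO 8601 date.
            amount = float(amount)
            order_date = (
                datetime.strptime(row["order_date"].strip(), "%d/%m/%Y")
                .date()
                .isoformat()
            )
        except (KeyError, ValueError, TypeError, AttributeError):
            continue
        # Apply a business validation rule (assumed: amounts are positive).
        if amount <= 0:
            continue
        # Deduplicate records on order_id, keeping the first occurrence.
        if order_id in seen_ids:
            continue
        seen_ids.add(order_id)
        # Join with a related dataset (customers).
        customer = customers_by_id.get(row.get("customer_id"), {})
        silver.append({
            "order_id": order_id,
            "amount": amount,
            "order_date": order_date,
            "customer_id": row.get("customer_id"),
            "country": customer.get("country"),
        })
    return silver


bronze_orders = [
    {"order_id": 1, "amount": "19.90", "order_date": "03/01/2024", "customer_id": "c1"},
    {"order_id": 1, "amount": "19.90", "order_date": "03/01/2024", "customer_id": "c1"},
    {"order_id": 2, "amount": None, "order_date": "04/01/2024", "customer_id": "c2"},
    {"order_id": 3, "amount": "-5", "order_date": "04/01/2024", "customer_id": "c2"},
    {"order_id": 4, "amount": "42", "order_date": " 05/01/2024 ", "customer_id": "c2"},
]
customers = {"c1": {"country": "MA"}, "c2": {"country": "FR"}}

silver_orders = to_silver(bronze_orders, customers)
print([row["order_id"] for row in silver_orders])
```

Of the five sample rows, only orders 1 and 4 survive: the duplicate, the null amount, and the negative amount are filtered out.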
Gold Layer: Enriched, Business-Ready Data
The Gold layer is designed for business and professional users.
It contains highly curated and aggregated datasets, typically fewer in number than the datasets in the Bronze and Silver layers.
These datasets are optimized for reporting, dashboards, and business analytics.
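To make "curated and aggregated" concrete, here is a small sketch that rolls validated order rows up into one reporting row per country. The metric (revenue by country), the function name, and the sample rows are assumptions made for the example.

```python
from collections import defaultdict


def build_gold_revenue_by_country(silver_orders):
    """Aggregate validated orders into one summary row per country."""
    totals = defaultdict(lambda: {"order_count": 0, "revenue": 0.0})
    for order in silver_orders:
        entry = totals[order["country"]]
        entry["order_count"] += 1
        entry["revenue"] += order["amount"]
    # Sort by country so the output is stable for dashboards and reports.
    return [
        {
            "country": country,
            "order_count": values["order_count"],
            "revenue": round(values["revenue"], 2),
        }
        for country, values in sorted(totals.items())
    ]


# Hypothetical Silver rows, already cleaned and validated.
silver_orders = [
    {"order_id": 1, "country": "MA", "amount": 20.0},
    {"order_id": 2, "country": "FR", "amount": 42.0},
    {"order_id": 3, "country": "MA", "amount": 10.5},
]

gold = build_gold_revenue_by_country(silver_orders)
for row in gold:
    print(row)
```

Three Silver rows become two Gold rows, which reflects the point that Gold datasets tend to be smaller and shaped around a specific reporting question.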
PS: Medallion Architecture can be implemented for both batch and real-time processing. The concept is independent of the processing mode: it defines how data is organized by quality (Bronze -> Silver -> Gold), not how fast it is processed.
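One way to picture this independence: the same Bronze -> Silver cleaning step can be driven by a batch runner or by a record-at-a-time runner. In the sketch below, `clean`, `run_batch`, and `run_streaming` are hypothetical names, and a plain Python iterator stands in for a real streaming source.

```python
def clean(record):
    """Bronze -> Silver step: return a cleaned record, or None if invalid."""
    if record.get("value") is None:
        return None
    return {"id": record["id"], "value": float(record["value"])}


def run_batch(records):
    """Process a complete dataset in one pass."""
    return [cleaned for cleaned in map(clean, records) if cleaned is not None]


def run_streaming(record_iter):
    """Process records one at a time as they arrive."""
    for record in record_iter:
        cleaned = clean(record)
        if cleaned is not None:
            yield cleaned


events = [
    {"id": 1, "value": "3.5"},
    {"id": 2, "value": None},
    {"id": 3, "value": "7"},
]

batch_result = run_batch(events)
stream_result = list(run_streaming(iter(events)))
print(batch_result == stream_result)
```

Both runners produce the same Silver records; only the timing of when each record is processed differs.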
