? Objective
To design and implement a robust, scalable, and well-documented SQL Server-based data warehouse that ingests, transforms, and models data from heterogeneous sources (CRM & ERP), enabling downstream analytics and reporting for business stakeholders.
? Specifications
Data Sources:
Two source systems: CRM and ERP, each providing structured CSV files.
Files are loaded directly into the Bronze layer without transformation.
Data Quality:
Cleanse and resolve data quality issues (e.g., nulls, duplicates, inconsistent formats) during Silver layer processing.
Implement data validation rules and logging for auditability.
Integration:
Combine CRM and ERP datasets into a unified, user-friendly dimensional model in the Gold layer.
Apply star schema design with fact and dimension tables for optimized querying.
Scope:
Focus on the latest dataset only—historical versioning is not required.
All ETL processes are batch-driven, with full-load and truncate-insert patterns.
Documentation:
Clear documentation of the data model, including entity relationships, column definitions, and transformation logic.
Schema diagrams and process flowcharts are provided for stakeholder clarity.