Project Showcase: Building a Modern Data Warehouse for Brazilian E-Commerce with the Olist Dataset
I'm excited to share a detailed breakdown of my recent data warehouse project centered on the Brazilian E-commerce dataset from Olist. The primary goal was to construct a robust, end-to-end data pipeline to consolidate fragmented data, enabling powerful analytics and informed business strategies.
This project addresses the critical need for a unified view of sales, customer, and product data to improve everything from demand forecasting to marketing effectiveness.
Key Technologies & Architecture
I leveraged a modern data stack to build a scalable and efficient solution:
Data Warehouse: Snowflake served as the cloud data warehouse, structured with a multi-layered architecture:
Bronze Layer: For ingesting and storing raw, untouched data.
Silver Layer: Where data was cleaned, standardized, and enriched.
Gold Layer: To hold business-ready, aggregated data, modeled for analytics.
Data Transformation: dbt was used for all transformation tasks. It allowed me to build modular, testable, and well-documented data models, ensuring high data quality and integrity as data moved from the Silver to the Gold layer.
Workflow Orchestration: Apache Airflow orchestrated the entire workflow. I designed a DAG to automate the pipeline, from loading the initial datasets to running dbt transformations and tests, ensuring a reliable and repeatable process.
Data Visualization: Power BI connected directly to the Gold layer in Snowflake and served as the front end for interactive dashboards.
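To make the flow above concrete, here is a minimal, standard-library Python sketch of the same idea: tasks that move data through Bronze, Silver, and Gold, run in dependency order, and finish with checks in the spirit of dbt's unique and not_null tests. This is not the project's actual Airflow DAG or dbt code. The task names, the sample rows, and the per-state revenue aggregate are all hypothetical, and graphlib stands in for the scheduler only to show the ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical raw rows standing in for the Bronze layer (values are made up).
RAW_ORDERS = [
    {"order_id": "o1", "customer_state": " sp ", "price": "120.50"},
    {"order_id": "o2", "customer_state": "RJ", "price": "80.00"},
    {"order_id": "o2", "customer_state": "RJ", "price": "80.00"},  # duplicate row
    {"order_id": "o3", "customer_state": "sp", "price": "19.50"},
]


def load_bronze(ctx):
    # Bronze: keep the raw rows exactly as they arrived.
    ctx["bronze"] = list(RAW_ORDERS)


def build_silver(ctx):
    # Silver: drop duplicate order_ids, standardize text, cast types.
    seen, silver = set(), []
    for row in ctx["bronze"]:
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        silver.append({
            "order_id": row["order_id"],
            "customer_state": row["customer_state"].strip().upper(),
            "price": float(row["price"]),
        })
    ctx["silver"] = silver


def build_gold(ctx):
    # Gold: a business-ready aggregate (assumed here: revenue per state).
    gold = {}
    for row in ctx["silver"]:
        state = row["customer_state"]
        gold[state] = gold.get(state, 0.0) + row["price"]
    ctx["gold"] = gold


def run_tests(ctx):
    # Checks modeled loosely on dbt's unique and not_null generic tests.
    ids = [row["order_id"] for row in ctx["silver"]]
    assert len(ids) == len(set(ids)), "order_id is not unique"
    assert all(row["price"] is not None for row in ctx["silver"]), "price has nulls"


TASKS = {
    "load_bronze": load_bronze,
    "build_silver": build_silver,
    "build_gold": build_gold,
    "run_tests": run_tests,
}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {
    "load_bronze": set(),
    "build_silver": {"load_bronze"},
    "build_gold": {"build_silver"},
    "run_tests": {"build_gold"},
}


def run_pipeline():
    # Resolve a valid execution order, then run each task against a shared context.
    ctx = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        TASKS[name](ctx)
    return order, ctx


if __name__ == "__main__":
    order, ctx = run_pipeline()
    print(order)
    print(ctx["gold"])
```

In a real deployment the ordering would live in the Airflow DAG definition and the checks in dbt test configuration; the sketch only shows why the tests belong at the end of the chain, after the Gold models are built.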
Dashboards & Key Insights
The Power BI dashboards offer deep dives into several key areas of the business:
Executive Summary: A high-level overview of critical metrics like total revenue, number of unique customers, and order volumes.
Customer Analysis: Insights into the geographic distribution of customers and the average revenue generated per customer.
Product & Seller Performance: Identification of the top-performing product categories and the highest-earning sellers on the platform.
Operations & Logistics: A look into operational efficiency by analyzing on-time delivery percentages and average delivery times.
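For anyone curious how the logistics metrics can be defined, here is a small Python sketch of on-time delivery percentage and average delivery time. It rests on assumptions of mine rather than the dashboard's actual measures: an order counts as on time when it is delivered on or before its estimated date, delivery time runs from purchase to customer delivery, and undelivered orders are excluded. The column names follow the public Olist orders table, and the sample rows are made up.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Made-up sample rows; column names assumed to match the Olist orders table.
ORDERS = [
    {"order_purchase_timestamp": "2018-01-01 10:00:00",
     "order_delivered_customer_date": "2018-01-06 10:00:00",
     "order_estimated_delivery_date": "2018-01-10 00:00:00"},  # 5 days, on time
    {"order_purchase_timestamp": "2018-02-01 08:00:00",
     "order_delivered_customer_date": "2018-02-11 08:00:00",
     "order_estimated_delivery_date": "2018-02-08 00:00:00"},  # 10 days, late
    {"order_purchase_timestamp": "2018-03-01 12:00:00",
     "order_delivered_customer_date": "2018-03-04 12:00:00",
     "order_estimated_delivery_date": "2018-03-15 00:00:00"},  # 3 days, on time
    {"order_purchase_timestamp": "2018-03-05 09:00:00",
     "order_delivered_customer_date": None,
     "order_estimated_delivery_date": "2018-03-20 00:00:00"},  # not delivered yet
]


def delivery_metrics(orders):
    # Only delivered orders contribute to either metric.
    delivered = [o for o in orders if o["order_delivered_customer_date"]]
    if not delivered:
        return {"on_time_pct": None, "avg_delivery_days": None}

    on_time = 0
    total_days = 0.0
    for o in delivered:
        purchased = datetime.strptime(o["order_purchase_timestamp"], FMT)
        delivered_at = datetime.strptime(o["order_delivered_customer_date"], FMT)
        estimated = datetime.strptime(o["order_estimated_delivery_date"], FMT)
        if delivered_at <= estimated:
            on_time += 1
        total_days += (delivered_at - purchased).total_seconds() / 86400

    return {
        "on_time_pct": 100.0 * on_time / len(delivered),
        "avg_delivery_days": total_days / len(delivered),
    }


if __name__ == "__main__":
    print(delivery_metrics(ORDERS))
```

Choosing the denominator matters: including undelivered or cancelled orders would lower the on-time percentage, so whichever rule is used should be stated next to the metric.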
This project was an incredible, hands-on experience in designing and implementing a complete data warehousing solution. I am passionate about how a well-structured data foundation can unlock powerful business insights.
I'm always open to connecting with fellow data professionals and discussing the latest in data engineering and analytics!
#DataEngineering #DataWarehouse #Snowflake #dbt #Airflow #PowerBI #ETL #DataAnalytics #BusinessIntelligence #DataPipeline #SQL #Olist #PortfolioProject