● Designed and deployed an end-to-end Dockerized pipeline on an Azure VM using Airflow, Kafka, and Spark to process batch and
real-time data from APIs.
● Implemented a medallion architecture with Azure Data Lake (Bronze), PostgreSQL (Silver, SCD Type 2), and ClickHouse (Gold)
for anomaly detection and low-latency analytics.
● Developed anomaly detection logic that integrates live vehicle GPS data with static stop times to identify delays and disruptions.
● Built Metabase dashboards and Power BI reports to monitor delays, disruptions, and operational KPIs in near real time.
● Optimized pipeline performance through resource tuning and fault-tolerant design.