Processed 995K+ user events with PySpark, improving scalability by 30% through parallelized ETL. Achieved 96.7% model accuracy with XGBoost when predicting customer behavior. Reduced data pipeline runtime by 30% and improved insight extraction efficiency by 40%.