Project Name: Automated Profit Prediction Pipeline with MLOps
Description: Built a machine learning regression pipeline that predicts retail profit from the Superstore dataset. The project applies MLOps practices, using MLflow for experiment tracking and model management.
Key Features:
Advanced Feature Engineering: Applied cyclical encoding (sine/cosine transformation) to temporal features so that the two ends of each cycle stay adjacent and seasonality is captured.
MLOps Integration: Used MLflow to log experiments, compare five algorithms (including XGBoost and Random Forest), and track performance metrics.
Automated Pipeline: Built a Scikit-learn Pipeline that automates preprocessing (scaling, encoding) and outlier handling; keeping these steps inside the pipeline means they are fitted on training data only, which prevents data leakage.
Model Optimization: Automated selection of the best-performing model by R² score and serialized it with Joblib for deployment.
Reporting: Scripted automated generation of performance reports (Markdown) and comparative visualizations.
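Illustrative sketch of the cyclical encoding feature: the idea is to map a periodic value onto the unit circle so that the last and first values of a cycle end up close together. The choice of month as the temporal feature, and the helper names cyclical_encode and circular_distance, are assumptions for illustration; the project's actual features and code are not shown here.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. month 1-12) to a (sin, cos) pair on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def circular_distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical example: months of the year, period 12.
dec = cyclical_encode(12, 12)
jan = cyclical_encode(1, 12)
jun = cyclical_encode(6, 12)

# December sits next to January in the encoded space, while June is
# on the opposite side of the circle.
print(circular_distance(dec, jan) < circular_distance(dec, jun))
```

With a plain integer encoding, December (12) and January (1) would be 11 units apart; the sine/cosine pair removes that artificial jump.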
Tools: Python, MLflow, Scikit-learn, XGBoost, Pandas, Matplotlib, Joblib
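Illustrative sketch of the pipeline and model-selection steps: a minimal version, under stated assumptions, of how preprocessing can live inside a Scikit-learn Pipeline and how the best candidate can be picked by R² and saved with Joblib. The data is synthetic, the column names (sales, discount, category) are hypothetical rather than the Superstore schema, only two of the five algorithms are included, and outlier handling and MLflow logging are left out to keep it short.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; columns are hypothetical, not the real dataset.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "sales": rng.uniform(10, 1000, n),
    "discount": rng.uniform(0, 0.5, n),
    "category": rng.choice(["Furniture", "Technology", "Office Supplies"], n),
})
y = 0.2 * X["sales"] - 300 * X["discount"] + rng.normal(0, 5, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def build_pipeline(model):
    """Bundle scaling and encoding with the model so they are fit on training data only."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), ["sales", "discount"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
    ])
    return Pipeline([("preprocess", preprocess), ("model", model)])


candidates = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit each candidate and record its R2 on the held-out split.
scores, fitted = {}, {}
for name, model in candidates.items():
    pipe = build_pipeline(model)
    pipe.fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipe.predict(X_test))
    fitted[name] = pipe

# Keep the highest-scoring pipeline and persist it with Joblib.
best_name = max(scores, key=scores.get)
model_path = os.path.join(tempfile.mkdtemp(), "best_model.joblib")
joblib.dump(fitted[best_name], model_path)
reloaded = joblib.load(model_path)
```

Because the scaler and encoder are saved together with the model, the reloaded object accepts raw rows directly. In a tracked setup, each score could also be recorded inside an mlflow.start_run() block with mlflow.log_metric.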