Developed a machine learning model to predict retail sales for Walmart stores based on historical data and various influencing factors. The project involved building a complete data science pipeline, including data preprocessing, exploratory analysis, model development, and performance optimization.
Cleaned and preprocessed the data by handling missing values, encoding categorical features, and scaling numerical variables. Conducted Exploratory Data Analysis (EDA) using Pandas, NumPy, Matplotlib, and Seaborn to uncover trends, seasonality, and relationships between weekly sales and variables such as store type, holidays, temperature, fuel prices, and promotional markdowns.
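The preprocessing steps described above (imputing missing values, encoding categoricals, scaling numerics) could be sketched roughly as follows with scikit-learn. The column names (`Type`, `IsHoliday`, `Temperature`, `Fuel_Price`, `MarkDown1`), the sample values, and the choice of median imputation are illustrative assumptions, not the project's actual schema or settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up sample; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "Type": ["A", "B", "A", "C"],
    "IsHoliday": [False, True, False, False],
    "Temperature": [42.3, 38.5, np.nan, 55.1],
    "Fuel_Price": [2.57, 2.55, 2.51, np.nan],
    "MarkDown1": [np.nan, 1200.0, np.nan, 350.0],
})

numeric_cols = ["Temperature", "Fuel_Price", "MarkDown1"]
categorical_cols = ["Type", "IsHoliday"]

# Numeric features: fill gaps with the column median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical features: one-hot encode, ignoring categories unseen at fit time.
preprocess = ColumnTransformer(
    transformers=[
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps imputation and scaling statistics tied to the training data, which helps avoid leakage when the same object is later used inside cross-validation.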
Implemented multiple regression models including Linear Regression, Decision Trees, and Random Forest to forecast weekly sales. Evaluated model performance using metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and R² score. Applied cross-validation and hyperparameter tuning to improve model accuracy and generalization.
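As a minimal sketch of the comparison described above, the three model families can be scored with cross-validated MAE, MSE, and R², followed by a small grid search on the Random Forest. The synthetic data from `make_regression`, the 5-fold shuffled `KFold`, and the parameter grid are assumptions chosen to keep the example self-contained; they are not the project's data or settings.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_validate
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the weekly sales features and target (assumption).
X, y = make_regression(n_samples=200, n_features=6, noise=10.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# scikit-learn reports error metrics negated so that higher is always better.
scoring = {
    "mae": "neg_mean_absolute_error",
    "mse": "neg_mean_squared_error",
    "r2": "r2",
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {
        "mae": -scores["test_mae"].mean(),  # flip sign back to a positive error
        "mse": -scores["test_mse"].mean(),
        "r2": scores["test_r2"].mean(),
    }
    print(name, results[name])

# Hyperparameter tuning: small illustrative grid over the Random Forest.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="neg_mean_absolute_error",
    cv=cv,
)
search.fit(X, y)
print(search.best_params_)
```

For data ordered by week, `sklearn.model_selection.TimeSeriesSplit` may be a better fit than shuffled folds, since it always validates on observations that come after the training window.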
The final model produced reliable weekly sales predictions and highlighted the factors that most affect retail performance, supporting data-driven decision-making. This project strengthened my skills in time-series analysis, feature engineering, and building scalable predictive models for real-world business applications.