The Challenge:
Wholesale distributors often struggle to target their marketing and logistics because they treat all customers the same. The goal of this project was to analyze a dataset of 440 clients to identify distinct purchasing patterns.
The Solution:
I developed a Machine Learning pipeline that:
Cleaned and Transformed Data: Reduced skewness in the spending features with a log transformation, then applied feature scaling so that no single category dominates the distance calculations behind the clustering.
Clustering Models: Applied both K-Means and DBSCAN algorithms to segment customers based on their spending across categories (Fresh, Milk, Grocery, etc.).
Advanced Visualization: Used PCA (Principal Component Analysis) to project the clusters into 2D, and Radar Charts to summarize each cluster as a "Customer Persona."
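The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic (random log-normal values standing in for the 440 clients), the column names are shortened from the categories mentioned above, and the DBSCAN parameters (eps, min_samples) are placeholder guesses that would need tuning on real data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 440-client dataset (assumption: not the real data).
rng = np.random.default_rng(0)
categories = ["Fresh", "Milk", "Grocery", "Detergents"]
spend = pd.DataFrame(
    rng.lognormal(mean=8.0, sigma=1.0, size=(440, len(categories))),
    columns=categories,
)

# log1p compresses the long right tail typical of spending data.
logged = np.log1p(spend)

# Standardize so each category contributes comparably to distances.
scaled = StandardScaler().fit_transform(logged)

# K-Means with three clusters, matching the number of profiles reported.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

# DBSCAN with illustrative parameters; the label -1 marks noise points.
dbscan_labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(scaled)

# Project the scaled features onto two principal components for plotting.
coords = PCA(n_components=2).fit_transform(scaled)
```

The coords array can then be passed to a Seaborn or Matplotlib scatter plot, colored by kmeans_labels, to inspect how well the segments separate.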
The Result (Business Impact):
I successfully identified three distinct customer profiles:
HoReCa (Hotels/Restaurants/Cafés): High "Fresh" goods demand.
Retailers: High "Grocery" and "Detergents" demand.
High-Value VIPs: Consistently high spending across all categories.
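One way profiles like these could be read off the clusters is to average spending per cluster and compare the categories. The sketch below uses a tiny hand-made table: the numbers and cluster labels are hypothetical, chosen only to mirror the three profiles, and are not results from the project.

```python
import pandas as pd

# Hypothetical toy data: spend per category plus a cluster label per client.
clients = pd.DataFrame({
    "Fresh": [9000, 11000, 2000, 3000, 30000, 34000],
    "Grocery": [2000, 3000, 12000, 14000, 28000, 32000],
    "Detergents": [300, 500, 5000, 6000, 9000, 11000],
    "cluster": [0, 0, 1, 1, 2, 2],
})

# Average spend per category within each cluster.
profile = clients.groupby("cluster").mean()

# Divide by the largest cluster mean in each category so that all
# categories share a 0-1 scale, as radar chart axes require.
relative = profile / profile.max()
print(relative.round(2))
```

Each row of relative is one persona: a row with a single dominant category suggests a specialized buyer, while a row near 1.0 everywhere suggests a high-value client.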
Businesses can use these insights to optimize supply chains and create personalized loyalty programs.
Tools Used: Python (Pandas, Scikit-Learn), Seaborn, Matplotlib, Google Colab.