Customer Segmentation Project — Final Report
Objective
The goal of this project was to segment mall customers into meaningful groups based on their demographic and spending behavior. This enables the business to better understand its customer base, design targeted marketing strategies, and optimize resource allocation.
________________________________________
Data Overview
•Dataset size: 200 customers
•Features used:
oAge
oAnnual Income (k$)
oSpending Score (1-100)
oGender (encoded)
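A minimal sketch of how these four features might be prepared for clustering, assuming pandas and scikit-learn. The column names follow the feature list above. The three sample rows, the 0/1 gender mapping, and the standardization step are illustrative assumptions, not details taken from the project.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample rows; the real dataset has 200 customers.
customers = pd.DataFrame({
    "Age": [19, 35, 52],
    "Annual Income (k$)": [15, 60, 88],
    "Spending Score (1-100)": [39, 81, 13],
    "Gender": ["Male", "Female", "Female"],
})

# Encode gender as 0/1 (assumed mapping; the report only says "encoded").
customers["Gender"] = customers["Gender"].map({"Male": 0, "Female": 1})

# Standardize so that no single feature dominates the distance computations
# (assumed preprocessing step, not stated in the report).
features = ["Age", "Annual Income (k$)", "Spending Score (1-100)", "Gender"]
X = StandardScaler().fit_transform(customers[features])
```

Scaling matters for distance-based methods such as KMeans and DBSCAN, because income in k$ and a 0/1 gender flag otherwise sit on very different ranges.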
________________________________________
Exploratory Data Analysis (EDA)
•Age: Mostly between 20–50 years old.
•Annual Income: Broad range, concentrated around 40–70k$.
•Spending Score: Well distributed from low to high spenders.
•Gender: Balanced distribution between Male and Female.
•Correlation analysis showed only a weak correlation between Annual Income and Spending Score. With no clear linear relationship between the two, clustering is a reasonable way to look for group structure.
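The correlation check above could be run along the following lines, assuming pandas. The data here is randomly generated as a stand-in for the real columns, so the printed value says nothing about the actual dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the two columns (values are random, not project data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Annual Income (k$)": rng.integers(20, 120, size=200),
    "Spending Score (1-100)": rng.integers(1, 101, size=200),
})

# Pearson correlation matrix; the off-diagonal entry is the value of interest.
corr = df.corr(method="pearson")
r = corr.loc["Annual Income (k$)", "Spending Score (1-100)"]
print(f"Pearson r between income and spending score: {r:.3f}")
```

A value of r near zero only rules out a linear relationship, so a scatter plot of the two columns is still worth inspecting for grouped structure.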
________________________________________
Models Applied
1.KMeans Clustering
oTested values of k from 2 to 15.
oSilhouette Score was highest for k=10 (0.42).
oPartitions were clear but sometimes forced, since KMeans assigns every customer to a cluster.
2.Hierarchical Clustering
oTried k=6 and k=10.
oClusters were interpretable but less compact compared to KMeans.
3.DBSCAN (Density-Based Clustering)
oDetected 9 clusters + noise (-1) automatically.
oSilhouette Score = 0.54, higher than the best KMeans score (0.42).
oAdvantage: No need to predefine k, and it identifies outliers naturally.
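The KMeans sweep and the hierarchical runs described above can be sketched as follows, assuming scikit-learn. The data is a synthetic stand-in produced by make_blobs. The n_init and random_state values and the use of Ward linkage (scikit-learn's default) are assumptions, since the report does not state them.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the prepared customer features (200 rows, 4 columns).
X, _ = make_blobs(n_samples=200, n_features=4, centers=5, random_state=0)

# KMeans: sweep k from 2 to 15 and record the silhouette score for each k.
kmeans_scores = {}
for k in range(2, 16):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    kmeans_scores[k] = silhouette_score(X, labels)
best_k = max(kmeans_scores, key=kmeans_scores.get)

# Hierarchical clustering at the two k values tried (Ward linkage assumed).
hier_scores = {}
for k in (6, 10):
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)
    hier_scores[k] = silhouette_score(X, labels)

print("best k for KMeans:", best_k)
print("hierarchical scores:", hier_scores)
```

Comparing kmeans_scores and hier_scores at the same k gives a like-for-like view of how compact each method's clusters are.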
________________________________________
Final Decision
•DBSCAN was chosen as the final model.
•Reasoning:
oHigher Silhouette Score (0.54).
oNatural detection of outliers (customers that don’t belong to any cluster).
oFlexibility in capturing irregular-shaped clusters.
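A minimal sketch of a DBSCAN run with noise handling, assuming scikit-learn. The eps and min_samples values are placeholders, and the data is synthetic. The report does not state whether noise points were included when the 0.54 score was computed; this sketch excludes them, which is one common choice.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer features, standardized before clustering.
X, _ = make_blobs(n_samples=200, n_features=4, centers=4,
                  cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

# eps and min_samples are placeholder values, not the project's settings.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# DBSCAN marks points that belong to no cluster with the label -1.
noise_mask = labels == -1
n_clusters = len(set(labels[~noise_mask]))
n_noise = int(noise_mask.sum())

# The silhouette score needs at least two clusters; noise is left out here.
score = None
if n_clusters >= 2:
    score = silhouette_score(X[~noise_mask], labels[~noise_mask])

print(f"clusters: {n_clusters}, noise points: {n_noise}, silhouette: {score}")
```

Because excluding noise usually raises the silhouette score, a DBSCAN score computed this way is not directly comparable to a KMeans score computed over all points.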
________________________________________
Business Insights
•High-income, high-spending cluster → Potential VIP customers → Target with loyalty programs.
•Low-income, high-spending cluster → May represent price-sensitive but valuable customers → Target with discounts/promotions.
•High-income, low-spending cluster → Possibly savers or cautious spenders → Need customized offers to increase spending.
•Noise points (-1) → Customers with unusual behavior that don’t fit into patterns → Can be studied separately.
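Segment descriptions like those above can be derived by profiling each cluster's average income and spending score. The sketch below assumes pandas. The rows, the cluster column name, and the cut-off of 50 on both axes are hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical labelled customers; cluster -1 marks DBSCAN noise.
segments = pd.DataFrame({
    "Annual Income (k$)": [85, 90, 20, 25, 88, 92, 60],
    "Spending Score (1-100)": [90, 85, 80, 75, 15, 10, 50],
    "cluster": [0, 0, 1, 1, 2, 2, -1],
})

# Average income and spending per cluster, leaving noise points out.
profile = (
    segments[segments["cluster"] != -1]
    .groupby("cluster")[["Annual Income (k$)", "Spending Score (1-100)"]]
    .mean()
)

def describe(row, income_cut=50, spend_cut=50):
    """Turn cluster averages into a readable label (cut-offs are assumed)."""
    income = "high-income" if row["Annual Income (k$)"] >= income_cut else "low-income"
    spend = "high-spending" if row["Spending Score (1-100)"] >= spend_cut else "low-spending"
    return f"{income}, {spend}"

profile["segment"] = profile.apply(describe, axis=1)

# Noise points are kept aside so they can be studied separately.
noise_customers = segments[segments["cluster"] == -1]
print(profile)
```

In practice the cut-offs would be chosen from the data, for example the median income and median spending score.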
________________________________________
Conclusion
The project successfully segmented mall customers into meaningful groups.
•DBSCAN achieved the highest Silhouette Score (0.54) of the models tested and was selected as the final model.
•Segmentation provides actionable insights for marketing and customer relationship strategies.
•Next steps could include:
oEnriching the dataset with additional features (purchase history, purchase frequency).
oTesting other clustering methods, such as Gaussian Mixture Models.
oDeploying the segmentation model into a real-time recommendation system.
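As a starting point for the Gaussian Mixture Model step, the sketch below selects the number of components by BIC, assuming scikit-learn. The data is synthetic and the range of 2 to 10 components is an arbitrary choice. None of this was run as part of the project.

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the prepared customer features.
X, _ = make_blobs(n_samples=200, n_features=4, centers=4, random_state=0)

# Fit one mixture per candidate component count and record its BIC.
bic = {}
for k in range(2, 11):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bic[k] = gmm.bic(X)
best_k = min(bic, key=bic.get)  # lower BIC is better

# Refit at the selected size and get soft cluster memberships per customer.
best = GaussianMixture(n_components=best_k, random_state=0).fit(X)
memberships = best.predict_proba(X)  # one row per customer, rows sum to 1
print("components selected by BIC:", best_k)
```

Unlike KMeans or DBSCAN, a mixture model gives each customer a probability of belonging to every segment, which can help with customers who sit between groups.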