Project Overview:
This project builds an end-to-end data science pipeline that analyzes and segments the laptop market. Using a dataset of more than 1,300 laptops, I developed a system that groups hardware into market tiers and predicts the tier of new entries.
Key Features & Implementation:
Data Cleaning & Engineering: Cleaned raw data by handling missing values and converting technical strings (like "8GB" and "1.5kg") into numeric formats for analysis.
Outlier Detection: Used Z-scores to detect and remove anomalous entries so that they would not distort the models.
Unsupervised Clustering: Applied K-Means clustering with three clusters to segment the market into tiers: Budget, Mid-Range, and Flagship.
Supervised Classification: Trained a K-Nearest Neighbors (KNN) model on the cluster labels to classify new hardware specs into these segments with high precision.
Data Visualization: Created scatter plots, pie charts, and box plots to visualize market trends and price drivers.
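Example Sketch:
The cleaning, outlier, clustering, and classification steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's actual code: the function names, the sample rows, the choice of RAM and price as features, and the Z-score threshold of 3.0 are all assumptions made for the example.

```python
import math
import re
import statistics
from collections import Counter


def parse_number(text):
    """Extract the first numeric value from strings such as '8GB' or '1.5kg'."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


def drop_outliers(rows, key, threshold=3.0):
    """Keep rows whose rows[key] value has |z-score| <= threshold (assumed cutoff)."""
    values = [row[key] for row in rows]
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(rows)
    return [row for row in rows if abs(row[key] - mean) / std <= threshold]


def nearest_index(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm for k >= 2; starts from evenly spaced sorted points."""
    ordered = sorted(points)
    step = (len(ordered) - 1) / (k - 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[nearest_index(point, centroids)].append(point)
        updated = [
            tuple(statistics.mean(dim) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # converged: assignments no longer change
            break
        centroids = updated
    return centroids


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(zip(train_points, train_labels), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented sample rows for illustration only (not taken from the real dataset).
raw = [
    {"ram": "4GB", "weight": "2.2kg", "price": 300.0},
    {"ram": "4GB", "weight": "2.1kg", "price": 350.0},
    {"ram": "8GB", "weight": "1.8kg", "price": 800.0},
    {"ram": "8GB", "weight": "1.7kg", "price": 900.0},
    {"ram": "16GB", "weight": "1.3kg", "price": 2000.0},
    {"ram": "32GB", "weight": "1.2kg", "price": 2400.0},
]

# 1. Clean: convert technical strings to numbers.
rows = [
    {"ram": parse_number(r["ram"]), "weight": parse_number(r["weight"]), "price": r["price"]}
    for r in raw
]
# 2. Remove price outliers by Z-score.
rows = drop_outliers(rows, "price")
# 3. Cluster on (RAM, price) and name the clusters by ascending mean price.
points = [(r["ram"], r["price"]) for r in rows]
centroids = sorted(kmeans(points, k=3), key=lambda c: c[1])
tiers = ["Budget", "Mid-Range", "Flagship"]
labels = [tiers[nearest_index(p, centroids)] for p in points]
# 4. Classify a new (hypothetical) laptop with KNN trained on the cluster labels.
prediction = knn_predict(points, labels, (8.0, 950.0), k=3)
print(prediction)
```

On real data, the features would normally be standardized before clustering and classification, since otherwise the price column dominates every distance calculation.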