I'm excited to share my latest project from my journey at NTI,
where I scraped real-world data from Amazon.eg, cleaned it, and analyzed it.
Here's a breakdown of what I did:
1. Web Scraping (Selenium + BeautifulSoup)
Scraped data for gaming laptops (Name, Price, Rating, Availability)
Collected results across multiple pages, handling dynamically loaded content
Exported clean results to CSV
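
A minimal sketch of the CSV export step, assuming the Selenium + BeautifulSoup stage (not shown here) has already produced one dict per product. The function name `rows_to_csv` and the sample rows are made up for illustration, and it uses Python's built-in `csv` module so it runs without any installs:

```python
import csv
import io

# Column order taken from the fields listed in the post.
FIELDS = ["Name", "Price", "Rating", "Availability"]


def rows_to_csv(rows):
    """Serialize scraped rows (a list of dicts) to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({field: row.get(field, "") for field in FIELDS})
    return buffer.getvalue()


# Made-up rows standing in for real scraped results.
sample_rows = [
    {"Name": "Example Laptop A 16GB RAM", "Price": "45,999.00",
     "Rating": "4.5", "Availability": "In Stock"},
    {"Name": "Example Laptop B 8GB RAM", "Price": "32,500.00",
     "Rating": "", "Availability": "In Stock"},
]
csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

Prices that contain a thousands separator are quoted automatically by the writer, so the comma inside a price does not break the columns.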
2. Data Cleaning & Feature Extraction
Extracted specs from product names (RAM, Processor, GPU) using regex
Unified formats (e.g., converting the text “16GB” to the number 16)
Filled missing values using the most common entries
Removed duplicates and handled outliers
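
The cleaning steps above can be sketched roughly as follows. This is an illustration rather than the project's actual code: the regex patterns, the helper names (`extract_specs`, `fill_with_mode`, `drop_duplicates`), and the product names are all assumptions, and it uses plain Python instead of Pandas so it stays self-contained. Outlier handling is left out because the method used is not stated.

```python
import re
from collections import Counter

# Hypothetical patterns; real product names would likely need more cases.
RAM_RE = re.compile(r"(\d+)\s*GB\s*(?:RAM|DDR\d?)", re.IGNORECASE)
GPU_RE = re.compile(r"\b(RTX|GTX)\s*(\d{3,4})", re.IGNORECASE)
CPU_RE = re.compile(r"\b(i[3579]|Ryzen\s*[3579])\b", re.IGNORECASE)


def _normalize_cpu(text):
    """Map variants like 'I7' or 'ryzen7' to 'i7' / 'Ryzen 7'."""
    digit = text[-1]
    return f"Ryzen {digit}" if text.lower().startswith("ryzen") else f"i{digit}"


def extract_specs(name):
    """Pull RAM (as an int, in GB), GPU, and CPU out of a product name."""
    ram = RAM_RE.search(name)
    gpu = GPU_RE.search(name)
    cpu = CPU_RE.search(name)
    return {
        "ram_gb": int(ram.group(1)) if ram else None,
        "gpu": f"{gpu.group(1).upper()} {gpu.group(2)}" if gpu else None,
        "cpu": _normalize_cpu(cpu.group(1)) if cpu else None,
    }


def drop_duplicates(records, key="name"):
    """Keep the first record for each distinct value of `key`."""
    seen, unique = set(), []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def fill_with_mode(records, key):
    """Replace missing (None) values of `key` with the most common value."""
    present = [r[key] for r in records if r[key] is not None]
    if not present:
        return records
    mode = Counter(present).most_common(1)[0][0]
    return [{**r, key: mode if r[key] is None else r[key]} for r in records]


# Made-up product names for illustration only.
names = [
    "Example Gaming Laptop Intel Core i7 16GB RAM 512GB SSD RTX 4060",
    "Example Gaming Laptop Intel Core i7 16GB RAM 512GB SSD RTX 4060",
    "Sample Laptop Ryzen 7 16 GB DDR5 1TB SSD RTX 4060",
    "Demo Laptop Core i5 512GB SSD GTX 1650",  # RAM missing from the name
]
records = [{"name": n, **extract_specs(n)} for n in names]
records = drop_duplicates(records)
records = fill_with_mode(records, "ram_gb")
```

The RAM pattern requires "RAM" or "DDR" after the size so that storage sizes such as "512GB SSD" are not mistaken for memory.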
3. Data Analysis
Visualized distributions using boxplots and histograms
Analyzed top GPU trends and their frequencies
Explored how RAM, Rating, and Brand relate to Price
Used bar plots for brand insights
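
As a rough idea of the numbers behind plots like these, here is a small sketch that counts GPU frequencies and computes the median price per brand. The rows, brand names, and prices are invented placeholders, not results from the project, and the sketch uses the standard library rather than Pandas or Seaborn so it runs as-is:

```python
from collections import Counter, defaultdict
from statistics import median

# Invented placeholder rows; not real scraped data.
laptops = [
    {"brand": "BrandA", "gpu": "RTX 4060", "price": 50000.0},
    {"brand": "BrandA", "gpu": "RTX 4050", "price": 40000.0},
    {"brand": "BrandB", "gpu": "RTX 4060", "price": 55000.0},
    {"brand": "BrandB", "gpu": "RTX 4060", "price": 65000.0},
    {"brand": "BrandB", "gpu": "GTX 1650", "price": 30000.0},
]

# GPU frequencies, most common first (the input for a GPU bar plot).
gpu_counts = Counter(row["gpu"] for row in laptops).most_common()

# Group prices by brand, then take the median of each group.
prices_by_brand = defaultdict(list)
for row in laptops:
    prices_by_brand[row["brand"]].append(row["price"])
median_price = {brand: median(prices) for brand, prices in prices_by_brand.items()}

print(gpu_counts)
print(median_price)
```

The median is used here instead of the mean because a few very expensive models would otherwise pull a brand's typical price upward.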
Tools Used:
Python, Pandas, Matplotlib, Seaborn, Selenium, BeautifulSoup
This project sharpened my skills in:
End-to-end data projects
Text feature engineering from messy real-world data