For a college project, I analyzed customer reviews from an e-commerce shopping dataset to extract insights about user sentiment and product recommendations. Using Python, I carried out exploratory data analysis (EDA) and applied natural language processing (NLP) techniques to preprocess, visualize, and interpret customer feedback.
Data Preprocessing:
Loaded and cleaned the dataset by removing irrelevant columns and handling missing values.
Performed feature selection to focus on important attributes such as review text, product category, and recommendation indicators.
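The cleaning and feature-selection steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the column names (`review_text`, `department`, `rating`, `recommended`) and the sample rows are hypothetical, and the original work may well have used a data-frame library instead.

```python
# Hypothetical column names; the real dataset's headers may differ.
KEEP = ("review_text", "department", "rating", "recommended")


def clean_rows(rows):
    """Keep only the selected columns and drop rows with no review text."""
    cleaned = []
    for row in rows:
        text = (row.get("review_text") or "").strip()
        if not text:
            continue  # a review without text cannot be analyzed
        record = {col: row.get(col) for col in KEEP}
        record["review_text"] = text
        cleaned.append(record)
    return cleaned


# Made-up sample rows, including one missing and one blank review.
raw = [
    {"id": 1, "review_text": "Love this dress!", "department": "Dresses",
     "rating": 5, "recommended": 1},
    {"id": 2, "review_text": None, "department": "Tops",
     "rating": 2, "recommended": 0},
    {"id": 3, "review_text": "  ", "department": "Tops",
     "rating": 3, "recommended": 1},
    {"id": 4, "review_text": "Runs small.", "department": "Bottoms",
     "rating": 2, "recommended": 0},
]

rows = clean_rows(raw)
print(len(rows))  # number of rows that survive cleaning
```

Dropping rows with empty text is only one way to handle missing values; imputing or flagging them would be equally reasonable depending on the analysis.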
Exploratory Data Analysis (EDA):
Analyzed the frequency distribution of ratings.
Created visualizations including bar charts and word clouds to identify common words in positive and negative reviews.
Developed a butterfly chart to compare the percentage of recommended vs. non-recommended products across different departments.
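The data behind the butterfly chart can be sketched as a per-department percentage table, with one side of the chart for recommended and the other for non-recommended reviews. The example below is an assumption-laden illustration: the field names and sample rows are hypothetical, and the text rendering stands in for whatever plotting library the project used.

```python
from collections import defaultdict


def recommendation_share(rows):
    """Return {department: (pct_recommended, pct_not_recommended)}."""
    counts = defaultdict(lambda: [0, 0])  # [recommended, not recommended]
    for row in rows:
        index = 0 if row["recommended"] == 1 else 1
        counts[row["department"]][index] += 1
    shares = {}
    for dept, (yes, no) in counts.items():
        total = yes + no
        shares[dept] = (100 * yes / total, 100 * no / total)
    return shares


def text_butterfly(shares, width=20):
    """Render mirrored bars: not recommended on the left, recommended on the right."""
    lines = []
    for dept in sorted(shares):
        yes, no = shares[dept]
        left = "#" * round(no / 100 * width)
        right = "#" * round(yes / 100 * width)
        lines.append(f"{left:>{width}} | {dept:^10} | {right:<{width}}")
    return "\n".join(lines)


# Made-up sample data for illustration only.
sample = [
    {"department": "Dresses", "recommended": 1},
    {"department": "Dresses", "recommended": 1},
    {"department": "Dresses", "recommended": 1},
    {"department": "Dresses", "recommended": 0},
    {"department": "Tops", "recommended": 1},
    {"department": "Tops", "recommended": 0},
]

shares = recommendation_share(sample)
chart = text_butterfly(shares)
print(chart)
```

Using percentages rather than raw counts keeps departments with very different review volumes comparable on the same axis.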
Text Preprocessing:
Applied tokenization, lowercasing, punctuation removal, and stopword filtering.
Applied lemmatization to retain meaningful root words.
Extracted features such as word count, hashtags, mentions, and sentiment-related expressions.
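A rough sketch of the text-preprocessing steps above, using only the standard library. The stopword list is a tiny hypothetical stand-in, the sample review is invented, and lemmatization is left out because it needs a dictionary-backed tool (for example NLTK's `WordNetLemmatizer`, which may or may not be what the project used).

```python
import re

# Tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this",
             "i", "to", "of", "so", "but", "my"}


def extract_features(text):
    """Count words and collect hashtags and mentions from the raw text."""
    return {
        "word_count": len(text.split()),
        "hashtags": re.findall(r"#\w+", text),
        "mentions": re.findall(r"@\w+", text),
    }


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stopwords."""
    lowered = text.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    tokens = no_punct.split()
    return [tok for tok in tokens if tok not in STOPWORDS]


# Invented sample review.
review = "I love this dress, it is so comfy! #summer @shop"

features = extract_features(review)
tokens = preprocess(review)
print(features)
print(tokens)
```

Features are extracted from the raw text before cleaning, since punctuation removal would otherwise erase the `#` and `@` markers.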
Sentiment Analysis:
Classified reviews into positive and negative categories based on user recommendations.
Generated word clouds to visualize frequently used terms in positive and negative reviews and titles.
Analyzed sentiment trends across product departments and rating levels.
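The labeling approach above, where the recommendation flag serves as the sentiment label, can be sketched as follows. Field names and sample rows are hypothetical; the per-class term counts are the kind of frequency table a word cloud would be drawn from, and the share of positive reviews per rating is one simple way to look at trends across rating levels.

```python
from collections import Counter


def label(row):
    """Use the recommendation flag as a proxy for sentiment."""
    return "positive" if row["recommended"] == 1 else "negative"


def top_terms(rows, n=3):
    """Most frequent tokens per sentiment class (word-cloud input)."""
    counters = {"positive": Counter(), "negative": Counter()}
    for row in rows:
        counters[label(row)].update(row["tokens"])
    return {name: c.most_common(n) for name, c in counters.items()}


def positive_share_by_rating(rows):
    """Percentage of positive reviews at each rating level."""
    totals, positives = Counter(), Counter()
    for row in rows:
        totals[row["rating"]] += 1
        if label(row) == "positive":
            positives[row["rating"]] += 1
    return {r: 100 * positives[r] / totals[r] for r in sorted(totals)}


# Made-up, already-tokenized sample reviews.
reviews = [
    {"tokens": ["love", "fit", "soft"], "rating": 5, "recommended": 1},
    {"tokens": ["love", "color"], "rating": 4, "recommended": 1},
    {"tokens": ["small", "itchy"], "rating": 2, "recommended": 0},
    {"tokens": ["small", "returned"], "rating": 2, "recommended": 0},
    {"tokens": ["fit", "okay"], "rating": 4, "recommended": 0},
]

terms = top_terms(reviews, n=1)
by_rating = positive_share_by_rating(reviews)
print(terms)
print(by_rating)
```

Because the label comes from the recommendation flag rather than the text itself, a mid-rating review can land in either class, which is visible in the sample at rating 4.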
Visualization and Insights:
Plotted sentiment distributions to identify key patterns in customer opinions.
Highlighted the impact of product categories on customer satisfaction.
Provided actionable insights for businesses to improve products and customer engagement.
This project demonstrated proficiency in data preprocessing, NLP techniques, sentiment analysis, and data visualization, making it a useful reference for businesses aiming to enhance customer experience based on user feedback.