Built a sentiment analysis model for IMDB movie reviews that reaches 89% accuracy:
Data Exploration - Loaded and analyzed 50,000 reviews (balanced 25k positive/25k negative)
Text Preprocessing - Implemented cleaning pipeline (HTML removal, lowercase, punctuation removal)
Model Development - Built a scikit-learn Pipeline combining a TF-IDF vectorizer with logistic regression
Evaluation - Achieved 90% precision on both classes, with per-class recall between 87% and 91%
Live Predictions - Demonstrated the model on sample reviews with confidence scores
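The cleaning and modeling steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name clean_review, the regular expressions, the max_iter setting, and the six toy reviews are all assumptions standing in for the real 50,000-review dataset.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def clean_review(text: str) -> str:
    """Hypothetical cleaner: strip HTML tags, lowercase, drop punctuation."""
    text = re.sub(r"<[^>]+>", " ", text)       # HTML removal (e.g. <br />)
    text = text.lower()                        # lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # punctuation removal
    return re.sub(r"\s+", " ", text).strip()   # collapse leftover whitespace


# Toy stand-in data (assumption); 1 = positive, 0 = negative
train_texts = [
    "A great film, I loved it",
    "Wonderful acting and a great story",
    "Loved every minute, wonderful",
    "A terrible film, I hated it",
    "Awful acting and a terrible story",
    "Hated every minute, awful",
]
train_labels = [1, 1, 1, 0, 0, 0]

# The cleaner is plugged in as the vectorizer's preprocessor, so raw text
# goes in one end of the pipeline and class predictions come out the other.
model = Pipeline([
    ("tfidf", TfidfVectorizer(preprocessor=clean_review)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(train_texts, train_labels)

print(clean_review("A <br />GREAT movie!!"))
print(model.predict(["<b>Great</b> story, wonderful!"]))
```

Passing the cleaner to the vectorizer keeps preprocessing inside the fitted pipeline, so the same cleaning is applied at training and prediction time without a separate step to remember.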
Skills Demonstrated
NLP/ML: Text preprocessing, TF-IDF vectorization, logistic regression classification
Python: pandas, scikit-learn, regex for text cleaning
Model Evaluation: Classification reports with precision/recall/F1 metrics
Pipeline Design: End-to-end ML pipeline from raw text to predictions
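The evaluation and confidence-score steps could look something like the sketch below. It assumes a TF-IDF plus logistic regression pipeline like the one described, trained here on a handful of made-up reviews, so the printed metrics illustrate the mechanics only and say nothing about the reported IMDB results. Taking the confidence as the highest class probability from predict_proba is one common convention, assumed here rather than taken from the source.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews; 1 = positive, 0 = negative
train_texts = [
    "A great film, I loved it",
    "Wonderful acting and a great story",
    "Loved every minute, wonderful",
    "A terrible film, I hated it",
    "Awful acting and a terrible story",
    "Hated every minute, awful",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["great story, loved it", "awful film, hated it"]
test_labels = [1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

# Per-class precision / recall / F1, returned as a dict for easy access
predictions = model.predict(test_texts)
report = classification_report(
    test_labels,
    predictions,
    target_names=["negative", "positive"],
    output_dict=True,
    zero_division=0,
)
for name in ("negative", "positive"):
    row = report[name]
    print(f"{name}: precision={row['precision']:.2f} "
          f"recall={row['recall']:.2f} f1={row['f1-score']:.2f}")

# Confidence score: probability assigned to the predicted class
probabilities = model.predict_proba(test_texts)
confidences = probabilities.max(axis=1)
for text, label, confidence in zip(test_texts, predictions, confidences):
    sentiment = "positive" if label == 1 else "negative"
    print(f"{text!r} -> {sentiment} ({confidence:.0%} confident)")
```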