This project applies machine learning to the Titanic dataset to predict passenger survival from socio-economic and demographic factors. The pipeline covers a complete end-to-end workflow: data preprocessing, feature engineering, model training, hyperparameter tuning, and evaluation.
Key Features:
1- Data Preprocessing:
Handling missing values, encoding categorical variables, and feature scaling.
Feature engineering: creating new variables such as Title, FamilySize, and IsAlone to capture relationships that the raw columns do not expose directly.
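
The feature engineering step could look roughly like the sketch below. It assumes the standard Kaggle Titanic column names (Name, SibSp, Parch), and the helper name add_engineered_features is hypothetical. The exact rules used in this project (for example, how rare titles are grouped) may differ.

```python
import pandas as pd


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with Title, FamilySize, and IsAlone columns added.

    Assumes Kaggle-style columns: Name, SibSp, Parch.
    """
    out = df.copy()
    # Names look like "Surname, Title. Given names"; capture the text
    # between the comma and the first period.
    out["Title"] = (
        out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    )
    # Siblings/spouses + parents/children + the passenger themselves.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    # 1 if the passenger travels with no family members, else 0.
    out["IsAlone"] = (out["FamilySize"] == 1).astype(int)
    return out


# Three rows in the Kaggle format, used only to exercise the helper.
sample = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
            "Heikkinen, Miss. Laina",
        ],
        "SibSp": [1, 1, 0],
        "Parch": [0, 0, 0],
    }
)
features = add_engineered_features(sample)
print(features[["Title", "FamilySize", "IsAlone"]])
```

Title is still a string column after this step, so it would need to be encoded along with the other categorical variables before being passed to SVM or KNN.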
2- Modeling:
Implemented multiple machine learning models including Support Vector Machine (SVM) and K-Nearest Neighbors (KNN).
Optimized model performance using GridSearchCV to tune hyperparameters.
Achieved an accuracy of ~83% on the test set, indicating reasonable generalization.
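
As an illustration of the tuning step, the sketch below runs GridSearchCV over an SVM and a KNN classifier, each wrapped in a pipeline with feature scaling. It uses synthetic data from make_classification as a stand-in for the preprocessed Titanic features, and the parameter grids are illustrative assumptions, not the grids used in this project. The accuracy it reports is therefore unrelated to the ~83% figure above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed Titanic feature matrix.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Illustrative grids (assumed values); keys are prefixed with the
# pipeline step name "model".
candidates = {
    "svm": (SVC(), {"model__C": [0.1, 1, 10], "model__kernel": ["linear", "rbf"]}),
    "knn": (KNeighborsClassifier(), {"model__n_neighbors": [3, 5, 7, 9]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling lives inside the pipeline so it is refit on each CV fold,
    # which avoids leaking validation-fold statistics into training.
    pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

for name, info in results.items():
    print(name, info)
```

Both SVM and KNN are sensitive to feature scale, which is why the scaler is part of the tuned pipeline rather than applied once to the full dataset.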