The goal of this project was to build a machine learning model that predicts house prices from features such as location, number of rooms, population, and proximity to the ocean. Accurate house price prediction helps real estate businesses, buyers, and sellers make data-driven decisions.
Key Steps:
Data Preprocessing
Cleaned missing values and handled outliers.
Performed feature engineering (e.g., created new features like Rooms per Household).
Standardized numerical features (needed for SVM, which is sensitive to feature scale) and encoded categorical features.
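The preprocessing steps above can be sketched roughly as follows, assuming pandas and scikit-learn. The column names (total_rooms, households, ocean_proximity, and so on) and the sample values are hypothetical stand-ins, not taken from the project's dataset, and missing-value and outlier handling are left out because the summary does not say how they were done.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up sample; column names and values are hypothetical.
df = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, 1467.0, 1274.0],
    "households": [126.0, 1138.0, 177.0, 219.0],
    "population": [322.0, 2401.0, 496.0, 558.0],
    "median_income": [8.3, 8.3, 7.3, 5.6],
    "ocean_proximity": ["NEAR BAY", "NEAR BAY", "INLAND", "INLAND"],
})

# Feature engineering: average number of rooms per household.
df["rooms_per_household"] = df["total_rooms"] / df["households"]

numeric_cols = [
    "total_rooms", "households", "population",
    "median_income", "rooms_per_household",
]
categorical_cols = ["ocean_proximity"]

# Standardize numeric columns (zero mean, unit variance) and one-hot
# encode the categorical column. sparse_threshold=0 forces a dense array.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocess.fit_transform(df)
print(X.shape)  # 5 scaled numeric columns + 2 one-hot columns
```

In a real workflow the transformer would be fitted on the training split only and then applied to the test split, so that test-set statistics do not leak into the scaling.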
Exploratory Data Analysis (EDA)
Visualized distributions and correlations using Matplotlib & Seaborn.
Identified key factors influencing house prices (e.g., median income, location).
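One common way to surface influential factors is to rank features by their correlation with the target. The sketch below assumes pandas; the column names and numbers are invented for illustration and are not the project's data or findings.

```python
import pandas as pd

# Made-up values; median_house_value is a hypothetical target column.
df = pd.DataFrame({
    "median_income": [2.0, 3.5, 5.0, 6.5, 8.0],
    "population": [1200.0, 800.0, 1500.0, 600.0, 1000.0],
    "median_house_value": [120000.0, 180000.0, 260000.0, 310000.0, 400000.0],
})

# Pearson correlation of each feature with the target, ordered by
# absolute strength while keeping the sign.
corr_with_price = (
    df.corr()["median_house_value"]
    .drop("median_house_value")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(corr_with_price)
```

Correlation only captures linear association, so a feature that ranks low here can still matter to a non-linear model.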
Modeling
Support Vector Machine (SVM): Applied support vector regression with an RBF kernel to capture non-linear relationships.
XGBoost: Used gradient-boosted decision trees to model non-linearities and feature interactions efficiently.
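As an illustration of the SVM setup, here is a minimal sketch of RBF-kernel support vector regression in scikit-learn. The synthetic data and the hyperparameters (C=10.0, epsilon=0.1) are assumptions chosen for the example, not the values used in the project.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear regression problem standing in for the housing data.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaling lives inside the pipeline so it is fitted on the training split only.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svr.fit(X_train, y_train)

y_pred = svr.predict(X_test)
r2 = svr.score(X_test, y_test)  # R² on the held-out split
print(round(r2, 3))
```

An XGBoost model can be fitted on the same split through xgboost.XGBRegressor, which follows the same fit/predict interface; tree-based models do not need the scaling step.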
Model Evaluation
Compared models using RMSE (Root Mean Squared Error) and R² Score.
Found that XGBoost outperformed SVM, with lower prediction error (RMSE) and better generalization.
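For reference, both metrics can be computed directly with NumPy. The two prediction arrays below are invented to show how the metrics separate a closer fit from a looser one; they are not the project's results.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical prices (in thousands) and two hypothetical sets of predictions.
y_true = [200.0, 300.0, 400.0, 500.0]
pred_a = [210.0, 290.0, 410.0, 490.0]  # closer fit
pred_b = [250.0, 250.0, 450.0, 450.0]  # looser fit

print(rmse(y_true, pred_a), r_squared(y_true, pred_a))
print(rmse(y_true, pred_b), r_squared(y_true, pred_b))
```

RMSE is expressed in the same units as the target, so it reads as a typical error in price, while R² is unitless and reports the share of variance explained.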
Results
XGBoost achieved the stronger predictive accuracy of the two models, making it the better candidate for real-world deployment.
Delivered visualizations showing actual vs. predicted house prices.
Tech Stack:
Python, Pandas, NumPy
Scikit-learn
XGBoost
Matplotlib, Seaborn
Impact:
This project demonstrates the ability to apply advanced machine learning techniques to real-world problems, providing insights that can assist in real estate valuation and decision-making.