This project predicts the forest cover type based on cartographic and environmental features.
It applies machine learning models such as:
- Random Forest (baseline)
- XGBoost (with hyperparameter tuning)
-----------------------------------------------
Project Workflow:
- Data Exploration & Cleaning – inspected data types and missing values, and separated continuous features from one-hot features
- Feature Preparation – encoded categorical variables (Wilderness Area, Soil Type)
- Model Training – applied Random Forest and XGBoost
- Evaluation – compared Precision, Recall, F1-Score, Accuracy, and cross-validation results
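
The workflow steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the DataFrame is synthetic, the column names (`Elevation`, `Slope`, `Wilderness_Area1`..`4`, `Cover_Type`) and all parameter values are assumptions, and only the Random Forest baseline is shown.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(42)
n = 600

# Synthetic stand-in for the real data (hypothetical columns and values).
df = pd.DataFrame({
    "Elevation": rng.normal(2900, 300, n),
    "Slope": rng.uniform(0, 40, n),
})
# One-hot wilderness-area indicators: exactly one column is 1 per row.
area = rng.integers(0, 4, n)
for i in range(4):
    df[f"Wilderness_Area{i + 1}"] = (area == i).astype(int)
# Toy target derived from elevation so the model has a signal to learn.
df["Cover_Type"] = np.digitize(df["Elevation"], [2750, 3050]) + 1

# Data exploration: inspect dtypes and count missing values.
print(df.dtypes)
missing_total = int(df.isna().sum().sum())

# Separate one-hot (binary 0/1) columns from continuous ones.
feature_cols = [c for c in df.columns if c != "Cover_Type"]
one_hot_cols = [c for c in feature_cols if df[c].isin([0, 1]).all()]
continuous_cols = [c for c in feature_cols if c not in one_hot_cols]

# Model training: Random Forest baseline on a stratified split.
X, y = df[feature_cols], df["Cover_Type"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
rf = RandomForestClassifier(n_estimators=50, random_state=42)
rf.fit(X_train, y_train)

# Evaluation: per-class precision/recall/F1, accuracy, and 5-fold CV.
y_pred = rf.predict(X_test)
rf_accuracy = accuracy_score(y_test, y_pred)
print(classification_report(y_test, y_pred))
cv_scores = cross_val_score(rf, X, y, cv=5, scoring="accuracy")
print(f"Accuracy: {rf_accuracy:.3f} | CV mean: {cv_scores.mean():.3f}")
```

Detecting one-hot columns by checking for purely 0/1 values is one simple heuristic; selecting them by name prefix would work equally well if the naming is consistent.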
-----------------------------------------------
Results:
- Random Forest → Accuracy: ~0.954 | Strong baseline with stable performance & clear feature importance.
- XGBoost (tuned) → Accuracy: ~0.973 | Outperformed Random Forest with higher F1, Precision, and Recall.
XGBoost was the final choice, delivering the most accurate predictions.
-----------------------------------------------
Business Impact:
With highly accurate cover type predictions, the tuned XGBoost model reduces classification errors in ecological and forestry applications compared with the Random Forest baseline. This allows decision-makers to rely less on manual surveys and make faster, data-driven environmental assessments.
-----------------------------------------------
Tech Stack:
Python | Pandas, NumPy | Matplotlib, Seaborn | scikit-learn, XGBoost