Obesity AI - Machine Learning Classification Project
Project Overview
This project implements a machine learning pipeline for obesity classification. It trains and compares four algorithms on lifestyle and health data to predict one of seven obesity levels, giving health professionals and individuals a way to assess obesity risk.
Problem Statement
Obesity is a significant global health concern. This project aims to:
Classify individuals into different obesity categories based on lifestyle factors
Provide accurate predictions using machine learning algorithms
Help healthcare professionals assess obesity risk factors
Enable individuals to understand their obesity classification
Features
Core Functionality
Multi-class Classification: Predicts 7 different obesity categories
Multiple ML Algorithms: Implements 4 different machine learning models
Interactive Prediction: Command-line interface for real-time predictions
Model Persistence: Saves trained models for future use
Comprehensive Evaluation: Detailed performance metrics and analysis
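The evaluation item above does not say which metrics the project reports. As a minimal sketch, assuming accuracy and per-class recall are among them, the following standard-library code shows how such metrics can be computed for a multi-class problem. The function names `accuracy`, `confusion_counts`, and `per_class_recall` are illustrative and are not taken from the project code.

```python
from collections import Counter


def accuracy(y_true: list, y_pred: list) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_counts(y_true: list, y_pred: list) -> Counter:
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(y_true: list, y_pred: list) -> dict:
    """Recall for every class that appears in y_true."""
    pairs = confusion_counts(y_true, y_pred)
    totals = Counter(y_true)
    # Correct predictions for a class sit on the diagonal: (label, label).
    return {label: pairs[(label, label)] / n for label, n in totals.items()}
```

Per-class recall is useful here because overall accuracy can hide poor performance on a single obesity category.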
Obesity Categories
Insufficient_Weight - Underweight individuals
Normal_Weight - Healthy weight range
Overweight_Level_I - Slightly overweight
Overweight_Level_II - Moderately overweight
Obesity_Type_I - Class I obesity
Obesity_Type_II - Class II obesity
Obesity_Type_III - Class III obesity (severe)
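Most classifiers need these seven labels as integer codes. The following is a minimal sketch, assuming the categories are ordered from lowest to highest weight class. The ordering and the names `OBESITY_LEVELS`, `encode_label`, and `decode_label` are assumptions for illustration, not the project's confirmed encoding.

```python
# Assumed ordering, from lowest to highest weight class.
OBESITY_LEVELS = [
    "Insufficient_Weight",
    "Normal_Weight",
    "Overweight_Level_I",
    "Overweight_Level_II",
    "Obesity_Type_I",
    "Obesity_Type_II",
    "Obesity_Type_III",
]

# Map each label to its position in the list.
LABEL_TO_CODE = {label: code for code, label in enumerate(OBESITY_LEVELS)}


def encode_label(label: str) -> int:
    """Return the integer code for a category name (KeyError if unknown)."""
    return LABEL_TO_CODE[label]


def decode_label(code: int) -> str:
    """Return the category name for an integer code (ValueError if out of range)."""
    if not 0 <= code < len(OBESITY_LEVELS):
        raise ValueError(f"unknown class code: {code}")
    return OBESITY_LEVELS[code]
```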
Dataset Features
The model uses 16 input features to predict obesity levels:
Physical Attributes
Age - Individual's age in years
Height - Height in meters
Weight - Weight in kilograms
Lifestyle Factors
FAVC - Frequent consumption of high-calorie food (yes/no)
FCVC - Frequency of consumption of vegetables (1-3 scale)
NCP - Number of main meals (1-4)
CAEC - Consumption of food between meals (Never/Sometimes/Frequently/Always)
CH2O - Daily consumption of water (liters)
FAF - Physical activity frequency (0-3 scale)
TUE - Time using technology devices (hours)
Health & Family History
family_history_with_overweight - Family history of overweight (yes/no)
SMOKE - Smoking habit (yes/no)
SCC - Calorie consumption monitoring (yes/no)
CALC - Consumption of alcohol (Never/Sometimes/Frequently)
Transportation
MTRANS - Transportation used (Automobile/Bike/Public_Transportation/Walking/Motorbike)
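The features listed above mix numeric, yes/no, ordered, and unordered categorical values, so they must be converted to numbers before training. The following is a minimal sketch of one way to do that for a single record. The specific mappings (0/1 for yes/no, 0-3 for frequency levels, one-hot for `MTRANS`) and the name `encode_record` are assumptions for illustration. The project's actual encoding may differ.

```python
# Assumed encodings; not confirmed by the project source.
YES_NO = {"no": 0, "yes": 1}
FREQUENCY = {"Never": 0, "Sometimes": 1, "Frequently": 2, "Always": 3}
TRANSPORT_MODES = ["Automobile", "Bike", "Public_Transportation", "Walking", "Motorbike"]

NUMERIC = ["Age", "Height", "Weight", "FCVC", "NCP", "CH2O", "FAF", "TUE"]
BINARY = ["FAVC", "family_history_with_overweight", "SMOKE", "SCC"]
ORDINAL = ["CAEC", "CALC"]


def encode_record(record: dict) -> list:
    """Turn one raw record (feature name -> value) into a flat numeric vector."""
    vector = [float(record[name]) for name in NUMERIC]
    vector += [float(YES_NO[record[name]]) for name in BINARY]
    vector += [float(FREQUENCY[record[name]]) for name in ORDINAL]
    # One-hot encode the transportation mode, which has no natural order.
    vector += [1.0 if record["MTRANS"] == mode else 0.0 for mode in TRANSPORT_MODES]
    return vector
```

Ordinal codes keep the order of the frequency levels, while one-hot encoding avoids implying an order among transportation modes.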
Architecture
Data Pipeline
Data Loading - Training and test datasets
Data Preprocessing - Handling missing values, encoding categorical variables
Feature Engineering - BMI calculation, outlier handling
Data Scaling - Standardization of numerical features
Model Training - Multiple algorithm implementation
Evaluation - Performance metrics and comparison
Model Persistence - Saving trained models
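The Feature Engineering and Data Scaling steps above can be sketched with the standard library alone. BMI is weight in kilograms divided by height in meters squared. The outlier rule shown here (clipping to 1.5 times the interquartile range) is an assumption, since the source does not say how outliers are handled. The function names are illustrative.

```python
import statistics


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / (height_m ** 2)


def clip_outliers(values: list, k: float = 1.5) -> list:
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (an assumed outlier rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def standardize(values: list) -> list:
    """Rescale values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]
```

For example, `bmi(70, 1.75)` is about 22.86. When scaling, the mean and standard deviation should be computed on the training data only and then reused for the test data.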