Objective: Develop a predictive model to identify individuals at high risk of developing Type 2 diabetes based on clinical and lifestyle factors.
Tools & Tech:
Python (Scikit-learn, Pandas, Matplotlib)
Jupyter Notebook
Streamlit (for building a simple web app)
Dataset Includes:
Age, BMI, blood pressure
Glucose levels, insulin levels
Physical activity, family history
Outcome label (diabetic or not)
Workflow:
Data cleaning and normalization
Exploratory data analysis to find correlations
Train/test split and model training (e.g., logistic regression, decision tree)
Evaluate model performance (ROC curve, confusion matrix)
Deploy a basic app where users can input values and get a risk score
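The modeling steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (age, bmi, glucose, outcome) and the label-generating formula are assumptions made up for the example, and logistic regression is just one of the model types mentioned.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data. Column names and distributions are assumptions,
# not the project's real dataset.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(21, 80, n),
    "bmi": rng.normal(28, 5, n),
    "glucose": rng.normal(110, 25, n),
})
# Synthetic label: risk rises with BMI and glucose (illustrative only).
logit = 0.15 * (df["bmi"] - 28) + 0.05 * (df["glucose"] - 110)
df["outcome"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Data cleaning: drop rows with missing values (there are none in this toy data).
df = df.dropna()

# Exploratory step: correlation of each feature with the outcome label.
print(df.corr()["outcome"].sort_values(ascending=False))

# Train/test split, stratified so both classes appear in each part.
X = df[["age", "bmi", "glucose"]]
y = df["outcome"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Normalization and model training in one pipeline, so the scaler is
# fitted on the training data only.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Evaluation: predicted probability of the positive class serves as the risk score.
risk_scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk_scores)
cm = confusion_matrix(y_test, model.predict(X_test))
print(f"ROC AUC: {auc:.2f}")
print(cm)
```

Putting the scaler inside the pipeline keeps test-set statistics out of the normalization step, and the same fitted pipeline object could later be loaded by the web app to score user inputs.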
Outcome: The model achieved 88% accuracy. High BMI and glucose levels were the strongest predictors. The app could be used in clinics for early screening.