Customer Churn Analysis — Telecom Industry | End-to-End Data Science Project
Role: Data Scientist / Data Analyst
Type: Explainable Data Science & Statistical Analytics
Built an end-to-end churn analysis project using telecom subscription data to identify and validate the main drivers of customer churn through a structured, statistically grounded workflow. The project focuses on interpretability, hypothesis testing, and business decision support rather than black-box prediction.
Objectives: Identify churn drivers, validate relationships statistically, engineer behavior-based features, reduce dimensionality, and convert findings into retention strategies.
Workflow: Data cleaning & quality management, EDA, feature engineering, multi-method feature selection, hypothesis testing, PCA, and business insight consolidation.
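A minimal sketch of the standardization and PCA steps in this workflow, run on a small synthetic table rather than the project data. The column names (tenure, monthly_charges, total_charges), the generated values, and the 95% variance cutoff are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data (hypothetical columns, not the project dataset).
tenure = rng.integers(1, 73, size=n).astype(float)
monthly_charges = rng.uniform(20, 120, size=n)
# total_charges is built to be largely redundant with the other two columns.
total_charges = tenure * monthly_charges + rng.normal(0, 50, size=n)

df = pd.DataFrame(
    {
        "tenure": tenure,
        "monthly_charges": monthly_charges,
        "total_charges": total_charges,
    }
)

# PCA is scale-sensitive, so standardize each column first.
X_scaled = StandardScaler().fit_transform(df)

pca = PCA().fit(X_scaled)
explained = pca.explained_variance_ratio_
cumulative = np.cumsum(explained)

# Smallest number of components whose cumulative variance reaches 95%
# (the 0.95 cutoff is an assumed threshold, not one stated in the project).
n_components_95 = int(np.searchsorted(cumulative, 0.95) + 1)

print("Explained variance ratio:", np.round(explained, 3))
print("Components needed for 95% variance:", n_components_95)
```

When a few components carry most of the variance, as expected here because one column is derived from the other two, that points to redundancy among the original features.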
Key Findings: Churn is highest in the first 6–12 months; month-to-month contracts and high monthly charges are strongly associated with higher churn risk; low service engagement is a major factor. Hypothesis tests on these relationships were significant at α = 0.05, and PCA confirmed redundancy among correlated features.
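As an illustration of how a contract-type finding of this kind could be tested at α = 0.05, the sketch below runs a chi-square test of independence with SciPy. The counts in the contingency table are invented for the example and are not results from this project.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical contingency table: contract type vs. churn outcome.
# These counts are made up for illustration, not taken from the project.
observed = pd.DataFrame(
    {"churned": [420, 90, 40], "retained": [580, 410, 460]},
    index=["month-to-month", "one-year", "two-year"],
)

ALPHA = 0.05

# Null hypothesis: churn outcome is independent of contract type.
chi2, p_value, dof, expected = chi2_contingency(observed.to_numpy())
reject_null = bool(p_value < ALPHA)

print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3g}")
print("Reject independence at alpha = 0.05:", reject_null)
```

A significant result only indicates an association between contract type and churn; it does not establish by itself that the contract type causes the churn.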
Business Impact: Supports targeted retention actions including early onboarding programs, long-term contract incentives, pricing personalization for high-risk customers, and service bundling strategies.
Methods: EDA, feature engineering, t-test, Chi-Square, ANOVA, correlation analysis, RFE, L1-regularized logistic regression, PCA, outlier handling (IQR), encoding & standardization.
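A minimal sketch of IQR-based outlier handling on one numeric column. The summary does not say whether outliers were removed or capped, so capping at the 1.5 × IQR fences is an assumption here; the helper name cap_outliers_iqr and the sample values are hypothetical.

```python
import pandas as pd


def cap_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values to the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Values beyond the fences are pulled back to the fence; others are unchanged.
    return series.clip(lower=lower, upper=upper)


# Hypothetical monthly charges with one extreme value.
monthly_charges = pd.Series([20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 400.0])
capped = cap_outliers_iqr(monthly_charges)

print(capped.tolist())
```

Capping keeps every row in the dataset, which can matter when the extreme values belong to customers who are themselves of interest for churn analysis; dropping the rows instead would be a one-line change using a boolean mask on the same fences.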
Tools: Python, Pandas, NumPy, Scikit-learn, SciPy, Matplotlib, Seaborn, Jupyter, Power BI.
Deliverables: Clean dataset, reproducible notebooks, feature pipelines, statistical reports, PCA results, business insights.