Alexandria University
Faculty of Computers and Data Science
Data Science Methodology Fall 2025
Final Project
Project Overview:
In this project, teams of 3-5 members will collaborate to collect, inspect, clean, and analyze a dataset.
Tasks:
1. Data Collection:
Collect data using web scraping or APIs. Choose a realistic data source whose data is likely to
contain inconsistencies or errors that require cleaning.
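As a rough illustration of API-based collection, the sketch below pulls paginated JSON records using only the Python standard library. The endpoint URL, the `page` query parameter, the `User-Agent` string, and the helper names `fetch_page` and `collect` are all assumptions made for this example; a real API will have its own URL scheme, authentication, and rate limits.

```python
import json
import time
import urllib.request


def fetch_page(url, timeout=10.0):
    """Download one page of JSON records (assumes the API returns a JSON list)."""
    request = urllib.request.Request(url, headers={"User-Agent": "dsm-project/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def collect(base_url, pages, delay_s=1.0, fetch=fetch_page):
    """Collect records page by page, pausing between requests to be polite."""
    records = []
    for page in range(1, pages + 1):
        # The "?page=N" scheme is hypothetical; adapt it to the real API.
        records.extend(fetch(f"{base_url}?page={page}"))
        if page < pages:
            time.sleep(delay_s)
    return records


def save_raw(records, path):
    """Store the untouched raw data so the cleaning steps can be reproduced."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)
```

Passing the `fetch` function as a parameter makes it possible to try the collection loop on canned data before contacting a real server. Saving the raw records before any cleaning keeps a "before" snapshot for the later comparison.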
2. Data Inspection:
Examine the collected data to identify issues such as missing values, outliers, duplicates, or
inaccurate entries.
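The checks named above can be sketched in a few lines. This minimal example assumes the dataset is a list of dictionaries with scalar values, and it uses only the standard library; in a notebook a dataframe library would usually do the same job. The set of missing-value markers and the 1.5 x IQR outlier rule are common conventions chosen for illustration, not requirements of the project.

```python
import statistics
from collections import Counter

# Markers treated as "missing" in this sketch; extend to match your data.
MISSING = {None, "", "NA", "N/A", "nan"}


def missing_counts(rows):
    """Count missing values per column."""
    counts = Counter()
    for row in rows:
        for key, value in row.items():
            if value in MISSING:
                counts[key] += 1
    return dict(counts)


def duplicate_count(rows):
    """Count rows that exactly repeat an earlier row."""
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(n - 1 for n in seen.values())


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

For example, `iqr_outliers([10, 12, 11, 13, 12, 11, 300])` flags only `300`. A flagged value is not necessarily an error, so each one still needs a judgment call during cleaning.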
3. Data Cleaning:
Apply appropriate cleaning techniques to correct or otherwise handle the issues found during data
inspection, and document each step of the cleaning process.
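One possible way to clean and document at the same time is to have the cleaning function return a log of what it changed. The sketch below is only an illustration under stated assumptions: rows are dictionaries, the column name `price` is a hypothetical numeric field, and values that cannot be parsed are set to missing rather than guessed.

```python
def clean_rows(rows, numeric_fields=("price",)):
    """Strip whitespace, parse numeric fields, drop exact duplicates, and log changes."""
    log = {"stripped": 0, "unparseable_numbers": 0, "duplicates_dropped": 0}
    cleaned, seen = [], set()
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if isinstance(value, str):
                stripped = value.strip()
                if stripped != value:
                    log["stripped"] += 1
                value = stripped or None  # empty strings become missing
            if key in numeric_fields and value is not None:
                try:
                    value = float(value)
                except (TypeError, ValueError):
                    log["unparseable_numbers"] += 1
                    value = None
            new_row[key] = value
        # Compare rows after normalisation so " Widget " and "Widget" match.
        fingerprint = tuple(sorted(new_row.items()))
        if fingerprint in seen:
            log["duplicates_dropped"] += 1
            continue
        seen.add(fingerprint)
        cleaned.append(new_row)
    return cleaned, log
```

The returned log can be pasted into the notebook as evidence of what the cleaning step did. Whether to drop, impute, or keep a problematic value is a decision the team should justify for each column.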
4. Exploratory Data Analysis (EDA):
Perform an in-depth analysis to explore the structure and characteristics of the data. Use
descriptive statistics and visualizations to identify patterns, trends, anomalies, and relationships
between features.
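The descriptive-statistics part of EDA can be sketched as follows. The helper names `describe` and `pearson` are chosen for this example, and the code assumes numeric columns with no missing values and at least two observations. Visualizations are not shown here because they depend on the plotting library the team chooses.

```python
import math
import statistics


def describe(values):
    """Summarise one numeric column with common descriptive statistics."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)
```

A correlation close to 1 or -1 suggests a strong linear relationship between two features, but it says nothing about causation and can miss non-linear patterns, which is one reason to pair these numbers with plots.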
Deliverables:
Your submitted notebook must include:
i. A description of the data collection process.
ii. An overview of the dataset, explaining the types and nature of features.
iii. A summary of the data inspection process and identified issues.
iv. A comparison of the dataset before and after cleaning.
v. Insights and visualizations from the EDA.
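For the before-and-after comparison in item iv, one simple approach is to compute the same quality profile on both versions of the dataset and show the two side by side. The sketch below assumes both datasets are lists of dictionaries with scalar values; the metrics chosen and the helper names `profile` and `compare` are illustrative only.

```python
def profile(rows):
    """Compute basic quality metrics for a dataset."""
    missing = sum(1 for row in rows for v in row.values() if v is None or v == "")
    unique = {tuple(sorted(row.items())) for row in rows}
    return {
        "rows": len(rows),
        "missing_cells": missing,
        "duplicate_rows": len(rows) - len(unique),
    }


def compare(before, after):
    """Pair each metric as (before, after) for easy side-by-side reporting."""
    b, a = profile(before), profile(after)
    return {metric: (b[metric], a[metric]) for metric in b}
```

Each entry of the result maps a metric name to a `(before, after)` pair, which can be printed as a small table in the notebook.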
Deadline and Discussion:
Project Deadline: Friday, December 5th at 11:59 PM.
The discussions will be held between December 6th and December 11th. Exact time slots will be
announced later.