This project aims to predict students' final grades using a linear regression model. The steps involved are:
Data Loading: The task5.csv dataset is loaded into a pandas DataFrame.
Missing Value Handling: Missing values in the attendance_rate, assignment_score, study_hours, and final_grade columns are filled using the mean, median, or mode, depending on the column. Each of these columns is also converted to a numeric data type.
Feature and Target Preparation: The features (X) are selected as 'attendance_rate', 'assignment_score', and 'study_hours', and the target (y) is 'final_grade'. The data is then split into training and testing sets.
Model Training: A Linear Regression model is trained using the training data.
Predictions: The trained model is used to make predictions on both the training and testing sets.
Model Evaluation: The Mean Squared Error (MSE) is calculated for both the training and testing sets to evaluate the model's performance. A test MSE much higher than the training MSE would suggest that the model is overfitting.
Results: The actual and predicted final grades from the test set are displayed for comparison.
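
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names prepare and fit_and_evaluate are hypothetical, every column is filled with its mean (the description does not say which of mean, median, or mode applies to which column), and the 80/20 split and random_state=42 are assumed values.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

FEATURES = ["attendance_rate", "assignment_score", "study_hours"]
TARGET = "final_grade"


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce the four columns to numeric and fill missing values.

    Assumption: every column is filled with its mean; the original
    project may use the median or mode for some columns.
    """
    out = df.copy()
    for col in FEATURES + [TARGET]:
        # Non-numeric entries become NaN and are then imputed.
        out[col] = pd.to_numeric(out[col], errors="coerce")
        out[col] = out[col].fillna(out[col].mean())
    return out


def fit_and_evaluate(df: pd.DataFrame, test_size: float = 0.2, random_state: int = 42):
    """Train a linear regression and return it with both MSEs and a comparison table."""
    clean = prepare(df)
    X, y = clean[FEATURES], clean[TARGET]

    # Assumed split ratio and seed; the source does not state them.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )

    model = LinearRegression().fit(X_train, y_train)

    # Predict on both sets so the two errors can be compared.
    train_pred = model.predict(X_train)
    test_pred = model.predict(X_test)
    train_mse = mean_squared_error(y_train, train_pred)
    test_mse = mean_squared_error(y_test, test_pred)

    # Actual versus predicted final grades on the test set.
    comparison = pd.DataFrame(
        {"actual": y_test.to_numpy(), "predicted": test_pred}
    )
    return model, train_mse, test_mse, comparison
```

With the dataset in place, a call such as fit_and_evaluate(pd.read_csv("task5.csv")) would return the fitted model, the training and testing MSE, and the actual-versus-predicted table for the test set.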