Abstract
This project builds a recommendation system that predicts the most suitable university and major for students who have obtained their baccalaureate in Algeria. Using survey data collected from 23,195 students, we trained two Random Forest models: one that predicts the major and one that predicts the university. The data was cleaned, normalized, and explored visually before modeling. Random Forest was selected as the final model because it outperformed the other algorithms we tried, such as SVM.
Introduction
Selecting the right university and major is crucial for students' future careers and personal satisfaction. This project addresses that challenge with a recommendation system tailored to Algerian students. We collected data through a survey, analyzed it, and trained machine learning models to produce the recommendations.
Dataset Description
The dataset consists of responses from 23,195 Algerian students who are currently pursuing their studies at various universities. The survey includes questions about personal demographics, academic performance, and preferences related to university and major choices.
Data Cleaning: Removed missing values and handled inconsistencies.
Normalization: Scaled continuous features like 'Bac Mark' for uniformity.
Data Visualization: Generated plots to understand distributions and relationships in the data.
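The cleaning and normalization steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual pipeline: the field names (bac_mark, stream), the sample rows, and the choice of min-max scaling are all assumptions, since the write-up does not say which scaling method was used.

```python
def drop_incomplete(rows: list[dict]) -> list[dict]:
    """Keep only rows in which every field has a value (no None, no empty string)."""
    return [r for r in rows if all(v is not None and v != "" for v in r.values())]


def min_max_scale(values: list[float]) -> list[float]:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical survey rows; names and values are invented for illustration.
rows = [
    {"bac_mark": 12.0, "stream": "Sciences"},
    {"bac_mark": None, "stream": "Mathematics"},  # missing mark, dropped
    {"bac_mark": 16.0, "stream": "Sciences"},
    {"bac_mark": 10.0, "stream": ""},             # missing stream, dropped
    {"bac_mark": 14.0, "stream": "Letters"},
]

clean = drop_incomplete(rows)
scaled = min_max_scale([r["bac_mark"] for r in clean])
for row, value in zip(clean, scaled):
    row["bac_mark_scaled"] = value
```

Min-max scaling is only one option. Tree-based models such as Random Forest are largely insensitive to feature scale, whereas SVM is not, so scaling matters mainly for a fair comparison between the two.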
Model Selection
After experimenting with several machine learning algorithms, including SVM, we chose Random Forest because it performed best. Two Random Forest models were trained: one that predicts the major and one that predicts the university.
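A comparison of this kind could look like the sketch below, which assumes scikit-learn is available. The data is synthetic (generated with make_classification) and stands in for the survey features and the major label. The hyperparameters and the 5-fold cross-validation are illustrative assumptions, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the survey features (X) and the "major" target (y).
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=5,
    n_classes=3,
    random_state=0,
)

# Candidate models; hyperparameters here are assumptions for illustration.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": SVC(kernel="rbf"),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
mean_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

best_model = max(mean_accuracy, key=mean_accuracy.get)
print(mean_accuracy, best_model)
```

The same procedure would be repeated with the university label as the target to select and train the second model.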
Model Evaluation
Metrics Used: Accuracy, Precision, Recall, F1-Score
Model Performance:
Major Prediction: Achieved an accuracy of 96%
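For reference, the four metrics listed above can be computed from true and predicted labels as in the sketch below. Because major prediction is a multi-class problem, precision, recall, and F1 need an averaging scheme. Macro averaging (an unweighted mean over classes) is used here as an assumption, since the write-up does not state which one was applied. The labels are invented for illustration.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, treating it as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = [per_class_metrics(y_true, y_pred, label) for label in labels]
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(p for p, _, _ in per_class) / n,
        "recall": sum(r for _, r, _ in per_class) / n,
        "f1": sum(f for _, _, f in per_class) / n,
    }


# Hypothetical true and predicted majors for six students.
y_true = ["CS", "CS", "Medicine", "Medicine", "Law", "Law"]
y_pred = ["CS", "Medicine", "Medicine", "Medicine", "Law", "CS"]

scores = evaluate(y_true, y_pred)
print(scores)
```

Reporting per-class or macro-averaged scores alongside accuracy is useful when some majors are much more common than others, because accuracy alone can hide poor performance on rare classes.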
Conclusion
The Random Forest models provide reliable recommendations for both majors and universities for Algerian students. This system can assist students in making informed decisions based on their personal preferences and academic performance.
Future Work
Expansion of Dataset: Collect more data to improve model accuracy.
Incorporation of Additional Features: Include more features like extracurricular activities and family background.
Model Improvement: Explore advanced models and ensemble techniques to enhance prediction accuracy.
Freelancer: Kaouther B.