student
Russian Federation
This article examines modern approaches to predicting students' academic risk with machine learning methods. It presents a comparative analysis of ensemble methods (Random Forest, Gradient Boosting, AdaBoost) and other classification algorithms. A methodology was developed for assessing the key factors that influence academic performance: weekly study time (StudyTimeWeekly), attendance (Absences), parental involvement (ParentalSupport), and extracurricular activities. The study used a dataset of 2,392 students that underwent preprocessing, including correlation analysis to gauge the influence of GPA and absences, and a stratified split into training and test sets. The models were compared using classification metrics including Accuracy, Precision, Recall, and F1-score. Ensemble algorithms proved highly effective, and AdaBoost performed best, with an accuracy of 92.48%, an F1-score of 92.21%, and a ROC-AUC of 93.81%. Confusion matrix analysis confirmed that the model is balanced, with a small number of false positives (38) and of false negatives, that is, missed high-risk students (32). Feature importance analysis showed the leading role of GPA (importance 0.689) and the significant influence of self-study time and the number of absences, which supports the model's interpretability. Directions for further development of the intelligent system are proposed: an interactive web application, expansion of the dataset, adaptive calibration mechanisms, and integration into learning management systems (LMS), so that at-risk students can be identified early and educational trajectories optimized.
MACHINE LEARNING, CLASSIFICATION ALGORITHMS, PREDICTIVE MODELS, LABELS, REINFORCEMENT LEARNING, FEATURE IMPACT ANALYSIS, INTELLIGENT SYSTEM, FORECASTING
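The classification metrics named in the abstract (Accuracy, Precision, Recall, F1-score) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch, assuming a binary at-risk / not-at-risk labelling. The function name `classification_metrics` and the counts in the usage example are hypothetical and are not the study's results.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Derive common metrics from binary confusion-matrix counts.

    tp: at-risk students correctly flagged
    fp: students flagged as at-risk who were not (false positives)
    fn: at-risk students the model missed (false negatives)
    tn: students correctly identified as not at risk
    """
    total = tp + fp + fn + tn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall,
    # written here directly in terms of the counts.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (hypothetical, not taken from the study).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In an early-warning setting, recall is usually the figure to watch, because each false negative is an at-risk student who receives no intervention.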



