Article
Predictive Analytics for Student Engagement in E-Learning Systems
DOI:
https://doi.org/10.47344/88c0mc52
Keywords:
student engagement, learning analytics, feature engineering, preprocessing pipelines, classical machine learning, Logistic Regression, Random Forest, OULAD
Abstract
Predicting how engaged students are in an online learning environment is important for improving their educational outcomes. This study uses the open Open University Learning Analytics Dataset (OULAD) to develop a systematic and reproducible approach to classifying student engagement, in contrast to many prior studies that depend on specific datasets or limited definitions of engagement. A full data preprocessing and feature extraction pipeline was implemented to derive informative behavioral indicators from click data and assessment results. We trained and tested two classical supervised machine learning models, Random Forest and Logistic Regression, and evaluated them using weighted- and macro-averaged metrics. The Random Forest model performed well across all engagement classes and achieved higher accuracy (0.926) than Logistic Regression (0.896). The results underline the importance of high-quality data preprocessing and careful feature design. They also confirm that such features provide valuable information for building early warning systems and for advancing learning analytics in higher education institutions.
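The difference between macro-averaged and weighted-averaged metrics mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which per-class metric is averaged, so F1 is assumed here; the class names (`low`, `medium`, `high`) and the toy labels are hypothetical and are not taken from OULAD or from the study's results.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true sample count."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical, imbalanced toy labels (assumption, for illustration only).
y_true = ["high"] * 6 + ["medium"] * 3 + ["low"] * 1
y_pred = ["high"] * 6 + ["medium", "medium", "high"] + ["medium"]

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

In this toy case the single `low` sample is misclassified, so the macro average falls below the weighted average. This is why reporting both is informative when engagement classes are imbalanced: the macro average exposes weak performance on rare classes that the weighted average can mask.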