Machine Learning-Based Classification of Student Adaptability in Online Learning with Feature Engineering

Authors

  • Yasin Efendi, Universitas Muhammadiyah Jakarta, Indonesia

DOI:

https://doi.org/10.38043/tiers.v6i1.6806

Keywords:

Feature Engineering, Machine Learning, Online Learning, SHAP, Student Adaptability

Abstract

Student adaptability in online learning environments has become increasingly important in contemporary education. This study introduces a feature engineering approach guided by SHAP (SHapley Additive exPlanations) to enhance the classification of student adaptability levels. Unlike prior studies that rely primarily on exploratory analysis or statistical importance scores, this method uses SHAP values to construct new features based on both statistical contribution and semantic meaning. Three additional features were created by combining original variables, capturing educational level together with session duration, digital access quality, and socioeconomic context. Four classic machine learning models, namely Support Vector Machine (SVM), k-Nearest Neighbors (KNN), Decision Tree, and Random Forest, were evaluated on the dataset both before and after adding the engineered features. Results show that SHAP-based feature engineering improved model performance in most cases. The most notable gains were observed in the tree-based models: the Decision Tree's F1-score increased from 84.87% to 89.34% and its accuracy from 88.38% to 90.08%, while the Random Forest's F1-score rose from 85.80% to 89.34% and its accuracy from 89.63% to 90.08%. The SVM model also recorded an increase in recall from 82.49% to 87.28%, whereas KNN showed a slight drop in accuracy but improved in ROC AUC from 91.55% to 93.83%. These findings demonstrate that explainable feature design not only enhances accuracy and F1-score, particularly in tree-based models, but also supports model interpretability, enabling more transparent, reliable, and effective educational decision-making systems.
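As a companion to the abstract, the following is a minimal, runnable sketch of the described pipeline: rank the original features by mean |SHAP| value, derive three composite features from semantically related columns, and compare the four classifiers before and after augmentation. Column names follow the Kaggle "Students Adaptability Level in Online Education" dataset cited in the references; the CSV filename, the specific column pairings, the ordinal encoding, and the default hyperparameters are illustrative assumptions rather than the paper's exact configuration.

# Sketch of SHAP-guided feature engineering with a before/after model comparison.
# Column names come from the Kaggle "Students Adaptability Level in Online
# Education" dataset; the feature pairings below are illustrative assumptions.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("students_adaptability.csv")  # hypothetical local filename

# Ordinal-encode every categorical column so the models and SHAP can consume them.
X = df.drop(columns=["Adaptivity Level"]).apply(lambda c: c.astype("category").cat.codes)
y = df["Adaptivity Level"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Step 1: rank the original features by mean |SHAP| value from a tree ensemble.
base = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)
sv = np.array(shap.TreeExplainer(base).shap_values(X_tr))
feat_axis = list(sv.shape).index(X_tr.shape[1])  # locate the feature axis (shape varies by shap version)
mean_abs = np.abs(sv).mean(axis=tuple(a for a in range(sv.ndim) if a != feat_axis))
print(pd.Series(mean_abs, index=X_tr.columns).sort_values(ascending=False).head())

# Step 2: combine high-SHAP, semantically related columns into three new features
# (education level x session duration, digital access quality, socioeconomic context).
def add_engineered(frame):
    out = frame.copy()
    out["edu_duration"] = out["Education Level"] * out["Class Duration"]
    out["digital_access"] = out["Internet Type"] + out["Network Type"] + out["Device"]
    out["socioeconomic"] = out["Financial Condition"] + out["Location"]
    return out

# Step 3: evaluate the four classifiers on the original and augmented feature sets.
models = {
    "SVM": SVC(random_state=42),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}
variants = {"original": (X_tr, X_te), "engineered": (add_engineered(X_tr), add_engineered(X_te))}
for name, model in models.items():
    for label, (tr, te) in variants.items():
        pred = model.fit(tr, y_tr).predict(te)
        print(f"{name:13s} {label:10s} acc={accuracy_score(y_te, pred):.4f} "
              f"f1={f1_score(y_te, pred, average='macro'):.4f}")

The macro-averaged F1 and the default hyperparameters are placeholders; the reported gains in the abstract would depend on the paper's actual tuning and averaging scheme.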


References

R. Wu and Z. Yu, “Relationship between university students’ personalities and e-learning engagement mediated by achievement emotions and adaptability,” Educ. Inf. Technol., vol. 29, no. 9, pp. 10821–10850, 2024, doi: 10.1007/s10639-023-12222-5.

S. P. Kar, A. K. Das, R. Chatterjee, and J. K. Mandal, “Assessment of learning parameters for students’ adaptability in online education using machine learning and explainable AI,” Educ. Inf. Technol., vol. 29, no. 6, pp. 7553–7568, 2024, doi: 10.1007/s10639-023-12111-x.

S. A. Salloum, A. Salloum, R. Alfaisal, A. Basiouni, and K. Shaalan, “Predicting Student Adaptability to Online Education Using Machine Learning,” in Breaking Barriers with Generative Intelligence (BBGI), 2024, pp. 187–196. doi: 10.1007/978-3-031-65996-6_16.

R. Arifudin, S. Subhan, and Y. N. Ifriza, “Student Adaptability Level Optimization using GridsearchCV with Gaussian Naive Bayes and K-Nearest Neighbor Methods as an Effort to Improve Online Education Predictions,” J. Nas. Pendidik. Tek. Inform. JANAPATI, vol. 14, no. 2, pp. 287–295, 2025, doi: 10.23887/janapati.v14i2.88972.

M. M. H. Suzan, N. A. Samrin, A. A. Biswas, and M. A. Pramanik, “Students’ Adaptability Level Prediction in Online Education using Machine Learning Approaches,” in International Conference on Computing Communication and Networking Technologies (ICCCNT), IEEE, 2021, pp. 1–7. doi: 10.1109/ICCCNT51525.2021.9579741.

O. Iparraguirre-Villanueva et al., “Comparison of Predictive Machine Learning Models to Predict the Level of Adaptability of Students in Online Education,” Int. J. Adv. Comput. Sci. Appl., vol. 14, no. 4, pp. 494–503, 2023, doi: 10.14569/IJACSA.2023.0140455.

R. Diallo, C. Edalo, and O. O. Awe, “Machine Learning Evaluation of Imbalanced Health Data: A Comparative Analysis of Balanced Accuracy, MCC, and F1 Score,” in Practical Statistical Learning and Data Science Methods, O. O. Awe and E. A. Vance, Eds., Cham: Springer Nature Switzerland, 2025, pp. 283–312. doi: 10.1007/978-3-031-72215-8_12.

S. Wang, Y. Dai, J. Shen, and J. Xuan, “Research on expansion and classification of imbalanced data based on SMOTE algorithm,” Sci. Rep., vol. 11, no. 1, pp. 1–11, 2021, doi: 10.1038/s41598-021-03430-5.

S. Yang and S. Zhu, “Identifying factors influencing online learning outcomes for middle-school students — a re-examination based on XGBoost and SHAP,” Educ. Inf. Technol., vol. 30, no. 11, pp. 15071–15094, 2025, doi: 10.1007/s10639-025-13405-y.

N. Bosch, “AutoML Feature Engineering for Student Modeling Yields High Accuracy, but Limited Interpretability,” J. Educ. Data Min., vol. 13, no. 2, pp. 55–79, 2021, doi: 10.5281/zenodo.5275314.

H. Wang, Q. Liang, J. T. Hancock, and T. M. Khoshgoftaar, “Feature selection strategies: a comparative analysis of SHAP-value and importance-based methods,” J. Big Data, vol. 11, no. 1, 2024, doi: 10.1186/s40537-024-00905-w.

J. C. J. Luza and C. Rodriguez, “Predictive Attributes in Machine Learning for University Academic Performance: A Feature Engineering Approach,” in IEEE International Conference on Computational Intelligence and Communication Networks, IEEE, 2024, pp. 443–456. doi: 10.1109/CICN63059.2024.10847424.

D. Hooshyar and Y. Yang, “Problems With SHAP and LIME in Interpretable AI for Education: A Comparative Study of Post-Hoc Explanations and Neural-Symbolic Rule Extraction,” IEEE Access, vol. 12, pp. 137472–137490, 2024, doi: 10.1109/ACCESS.2024.3463948.

J. Faouzi and O. Colliot, “Classic Machine Learning Methods,” in Machine Learning for Brain Disorders, O. Colliot, Ed., New York, NY: Springer US, 2023, pp. 25–75. doi: 10.1007/978-1-0716-3195-9_2.

M. M. H. Suzan and N. A. Samrin, “Students Adaptability Level in Online Education,” Kaggle. [Online]. Available: https://www.kaggle.com/datasets/mdmahmudulhasansuzan/students-adaptability-level-in-online-education

I. O. Muraina, “Ideal Dataset Splitting Ratios in Machine Learning Algorithms: General Concerns for Data Scientists and Data Analysts,” in International Mardin Artuklu Scientific Researches Conference, 2022, pp. 496–505.

T. Wongvorachan, S. He, and O. Bulut, “A Comparison of Undersampling, Oversampling, and SMOTE Methods for Dealing with Imbalanced Classification in Educational Data Mining,” Information, vol. 14, no. 1, 2023, doi: 10.3390/info14010054.

N. A. Azhar, M. S. M. Pozi, A. M. Din, and A. Jatowt, “An Investigation of SMOTE Based Methods for Imbalanced Datasets With Data Complexity Analysis,” IEEE Trans. Knowl. Data Eng., vol. 35, no. 7, pp. 6651–6672, 2023, doi: 10.1109/TKDE.2022.3179381.

Z. L. Chia, M. Ptaszynski, F. Masui, G. Leliwa, and M. Wroczynski, “Machine Learning and feature engineering-based study into sarcasm and irony classification with application to cyberbullying detection,” Inf. Process. Manag., vol. 58, Art. no. 102600, 2021, doi: 10.1016/j.ipm.2021.102600.

Y. Guan, F. Wang, and S. Song, “Interpretable machine learning for academic performance prediction: A SHAP-based analysis of key influencing factors,” Innov. Educ. Teach. Int., pp. 1–20, 2025, doi: 10.1080/14703297.2025.2532050.

L. C. Nnadi, Y. Watanobe, M. M. Rahman, and A. M. John-Otumu, “Prediction of Students’ Adaptability Using Explainable AI in Educational Machine Learning Models,” Appl. Sci., vol. 14, no. 12, p. 5141, 2024, doi: 10.3390/app14125141.

B. Etaati, A. Jahangiri, G. Fernandez, M. Tsou, and S. G. Machiani, “Understanding Active Transportation to School Behavior in Socioeconomically Disadvantaged Communities: A Machine Learning and SHAP Analysis Approach,” Sustainability, vol. 16, no. 1, p. 48, 2023, doi: 10.3390/su16010048.

W. Chow, “Improving early dropout detection in undergraduate students: Exploring key predictors through SHAP values,” in Proceedings of the 35th Annual Conference of the Australasian Association for Engineering Education (AAEE 2024), Christchurch, New Zealand, 2024. [Online]. Available: https://search.informit.org/doi/10.3316/informit.T2025032000014092035132356

H. Yang, W. Lee, and J. Kim, “Identification of Key Factors Influencing Teachers’ Self-Perceived AI Literacy: An XGBoost and SHAP-Based Approach,” Appl. Sci., vol. 15, no. 8, 2025, doi: 10.3390/app15084433.

N. Syam and R. Kaul, “Random Forest, Bagging, and Boosting of Decision Trees,” in Machine Learning and Artificial Intelligence in Marketing and Sales, Emerald Publishing Limited, 2021, pp. 139–182. doi: 10.1108/978-1-80043-880-420211006.

H. Liu, X. Chen, and X. Liu, “Factors influencing secondary school students’ reading literacy: An analysis based on XGBoost and SHAP methods,” Front. Psychol., vol. 13, 2022, doi: 10.3389/fpsyg.2022.948612.

S. S. Shanto and A. I. Jony, “Interpretable Ensemble Learning Approach for Predicting Student Adaptability in Online Education Environments,” Knowledge, vol. 5, no. 2, p. 10, 2025, doi: 10.3390/knowledge5020010.

A. Aldino, A. Saputra, A. Nurkholis, and S. Setiawansyah, “Application of Support Vector Machine (SVM) Algorithm in Classification of Low-Cape Communities in Lampung Timur,” Build. Informatics, Technol. Sci., vol. 3, no. 3, Dec. 2021, doi: 10.47065/bits.v3i3.1041.

D. Kurniadi, A. Mulyani, and I. Muliana, “Prediction System for Problem Students using k-Nearest Neighbor and Strength and Difficulties Questionnaire,” J. Online Inform., vol. 6, no. 1, pp. 53–62, 2021, doi: 10.15575/join.v6i1.701.

A. Arista, “Comparison Decision Tree and Logistic Regression Machine Learning Classification Algorithms to determine Covid-19,” Sinkron, vol. 7, no. 1, pp. 59–65, 2022, doi: 10.33395/sinkron.v7i1.11243.

M. Nachouki, E. A. Mohamed, R. Mehdi, and M. Abou Naaj, “Student course grade prediction using the random forest algorithm: Analysis of predictors’ importance,” Trends Neurosci. Educ., vol. 33, Art. no. 100214, 2023, doi: 10.1016/j.tine.2023.100214.

M. S. Mohosheu, F. Abrar Shams, M. A. al Noman, S. R. Abir, and Al-Amin, “ROC Based Performance Evaluation of Machine Learning Classifiers for Multiclass Imbalanced Intrusion Detection Dataset,” in International Conference on Recent Advances and Innovations in Engineering (ICRAIE), 2023, pp. 1–6. doi: 10.1109/ICRAIE59459.2023.10468177.

A. M. Carrington et al., “Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 1, pp. 329–341, 2023, doi: 10.1109/TPAMI.2022.3145392.


Published

2025-09-16

How to Cite

Efendi Y. Machine Learning-Based Classification of Student Adaptability in Online Learning with Feature Engineering. TIERS [Internet]. 2025 Sep. 16 [cited 2025 Sep. 16];6(1):129-43. Available from: https://journal.undiknas.ac.id/index.php/tiers/article/view/6806

Issue

Vol. 6 No. 1 (2025)

Section

Articles