Predicting Student Performance Using Demographic and Attendance Data with Machine Learning

Authors

  • Aman Patel, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India
  • Samir Ambre, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India
  • Vaibhav Gawade, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

DOI:

https://doi.org/10.5281/zenodo.19606501

Keywords:

Machine Learning, Ensemble Methods, Predictive Analytics, Educational Data

Abstract

This study investigates how demographic, academic, and behavioral factors influence student performance, using machine learning techniques. In modern education, success is shaped not only by classroom learning but also by background, habits, and daily behavior. The dataset included variables such as gender, family income, SSC and HSC scores, hometown, computer usage, preparation time, attendance, social media activity (watching short videos or “Reels”), and part-time jobs. These features were analyzed alongside semester GPA and overall results to identify the strongest predictors of academic success. To evaluate these factors, traditional classifiers (Random Forest, KNN, SVM, Naïve Bayes, Logistic Regression) and boosting algorithms (CatBoost, LightGBM, XGBoost) were applied, along with ensemble methods such as Voting, Bagging, and Stacking. Among all models, the Stacking ensemble consistently achieved the highest accuracy (95%), outperforming the other approaches. The analysis showed that attendance, prior exam scores, and preparation time were the most influential predictors, while heavy social media use reduced concentration and performance. Visualizations, including accuracy comparison charts, feature importance plots, and confusion matrices, confirmed these findings. The CatBoost confusion matrix showed reliable classification, with most predictions falling on the diagonal. The study highlights that while demographic factors such as income and hometown contribute to differences in performance, disciplined study habits and consistent attendance can help students overcome these challenges. These insights provide actionable guidance for educators and institutions: schools can promote attendance, encourage effective study routines, and raise awareness about the impact of social media use. By leveraging machine learning, institutions can identify at-risk students early and design targeted interventions to improve outcomes.
Overall, this research demonstrates that student success depends on a combination of background, behavior, and daily habits. By integrating advanced machine learning methods, the study not only achieves high predictive accuracy but also offers practical strategies for supporting students. With proper support, motivation, and discipline, every student has the potential to enhance their performance and reach academic success.
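
As an illustration of the stacking approach named in the abstract, the following is a minimal sketch using scikit-learn, not the authors' implementation. It assumes synthetic data from make_classification as a stand-in for the student dataset, uses only four of the listed base classifiers, and picks Logistic Regression as the meta-learner; the function name run_stacking_demo and all hyperparameters are hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_stacking_demo(seed=0):
    """Train a stacking ensemble on synthetic data; return (accuracy, confusion matrix)."""
    # Synthetic binary-classification data. This is an assumption standing in
    # for the study's features (attendance, prior scores, preparation time, ...).
    X, y = make_classification(
        n_samples=600, n_features=10, n_informative=5, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Base learners: a subset of the traditional classifiers the abstract lists.
    # KNN and SVM are distance/margin based, so they are scaled in a pipeline.
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=seed))),
        ("nb", GaussianNB()),
    ]

    # The meta-learner is fitted on out-of-fold predictions of the base
    # learners (cv=5), which is what distinguishes stacking from voting.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    stack.fit(X_train, y_train)
    y_pred = stack.predict(X_test)

    return accuracy_score(y_test, y_pred), confusion_matrix(y_test, y_pred)


if __name__ == "__main__":
    acc, cm = run_stacking_demo()
    print(f"accuracy: {acc:.3f}")
    print(cm)  # rows are true classes, columns are predicted classes
```

In the confusion matrix, correct predictions lie on the diagonal, so accuracy equals the diagonal sum divided by the total number of test samples. Results on this synthetic data say nothing about the 95% figure reported in the study.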

Author Biographies

  • Aman Patel, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

    M.Sc Computer Science student, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

  • Samir Ambre, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

    M.Sc Computer Science student, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

  • Vaibhav Gawade, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

    M.Sc Computer Science student, Department of Computer Science, MIT Arts, Commerce and Science College, Pune, India

Published

2026-04-16

How to Cite

Predicting Student Performance Using Demographic and Attendance Data with Machine Learning. (2026). JOURNAL UGC-CARE IJCRT (2349-3194) | ISSN Approved Journal, 16(2), 51257-51274. https://doi.org/10.5281/zenodo.19606501