dc.contributor.author: Ndambuki, Peter M
dc.date.accessioned: 2023-02-27T08:25:19Z
dc.date.available: 2023-02-27T08:25:19Z
dc.date.issued: 2021
dc.identifier.uri: https://repository.kcau.ac.ke/handle/123456789/1291
dc.description.abstract: In the present era of data deluge, institutions have accumulated huge amounts of data in their databases. Educational institutions all over the world are no exception, having accumulated large amounts of data, in various forms and formats, in their educational management information system databases. This accumulation has given rise to two research fields, educational data mining and learning analytics, which aim to discover hidden knowledge (insights) that can greatly improve operations in educational institutions. Such hidden knowledge includes, but is not limited to, predicting students' performance, predicting student dropout, and discovering students' interests, which could help avert the student unrest that is common in various institutions. This study seeks to take advantage of this opportunity and develop a model, using a dataset obtained from public secondary schools in Kitui West Constituency, that can be used to predict students' academic performance. There have been attempts by researchers all over the globe to address this problem. Although such studies achieved some level of success, various limitations, discussed in detail in the empirical review, militated against the performance of the earlier models. A desk research methodology was used to extract relevant secondary data from various school departments within Kitui West Constituency. The data was then preprocessed, including feature selection, after which the cleaned dataset was loaded into a staging data lake in Hadoop. Data was queried from the data lake into Python using PySpark, where the data analysis procedures took place. A dataset consisting of the optimal subset of features was used to train four machine learning algorithms: a gradient boosting classifier, a random forest classifier, a decision tree classifier, and a deep neural network classifier.
Overall, the decision tree and random forest classifiers registered the best performance, each with an accuracy of 97%. After stratified k-fold cross-validation, however, the decision tree classifier's performance proved more stable, with an average accuracy of 97% compared with 93% for the random forest classifier. The decision tree classifier was therefore recommended for deployment in predicting students' academic performance, for its reliable accuracy and relatively good precision in predicting the study's target group. The developed model places students into two groups: PASS and FAIL. The aim is to prompt intervention from various stakeholders to reduce dismal performance among public secondary schools in Kitui West Constituency. [en_US]
dc.language.iso: en [en_US]
dc.publisher: KCA University [en_US]
dc.subject: Educational data mining, learning analytics, machine learning, feature selection, desk research, diagnostic research, experimental research, data deluge, minimum redundancy maximum relevance [en_US]
dc.title: A Model For Predicting Students Academic Performance In Public Secondary Schools In Kitui West Constituency [en_US]
dc.type: Thesis [en_US]

