Application of the K-Nearest Neighbor Method and Information Gain to Student Performance Classification

research • 22 Apr 2020


Education is a vital issue in the development of a nation. One way to raise the quality of education is to predict students' academic performance. The method currently in use is ineffective, because evaluation is based solely on educators' assessment of information about students' learning progress. That information alone is not enough to form indicators for evaluating student performance, nor to help students and educators improve learning and teaching. K-Nearest Neighbor is an effective method for classifying student performance, but it runs into problems when the vector dimensionality is large. This study aims to predict students' academic performance using the K-Nearest Neighbor algorithm together with Information Gain feature selection to reduce the vector dimensionality. Several experiments were conducted to obtain an optimal configuration and produce accurate classifications. Over 10 runs with k values from 1 to 10 on the student performance dataset, K-Nearest Neighbor alone reached a best average accuracy of 74.068, while K-Nearest Neighbor combined with Information Gain reached a best average accuracy of 76.553. From these results it can be concluded that Information Gain is able to reduce the vector dimensionality, so that applying K-Nearest Neighbor with Information Gain yields better classification accuracy for student performance than K-Nearest Neighbor alone.
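For illustration, the sketch below (Python with scikit-learn, not the authors' original code) wires together the pipeline the abstract describes: Information Gain feature selection, estimated here as the mutual information between each feature and the class label, followed by K-Nearest Neighbor classification evaluated for k from 1 to 10. The file name student-mat.csv, the pass/fail cutoff on the G3 grade, the choice to keep 10 features, and the 10-fold cross-validation are all assumptions, since the article only reports the resulting accuracies.

```python
# Minimal sketch of Information Gain feature selection + K-Nearest Neighbor,
# evaluated for k = 1..10. Dataset path, target definition, and cutoffs are illustrative.
import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load the UCI student performance data (semicolon-separated) and build a binary target.
data = pd.read_csv("student-mat.csv", sep=";")
y = (data["G3"] >= 10).astype(int)             # pass if final grade >= 10 (assumed cutoff)
X = pd.get_dummies(data.drop(columns=["G3"]))  # one-hot encode categorical attributes

for k in range(1, 11):
    # Information Gain is approximated with mutual information between each feature
    # and the class label; the 10 highest-scoring features are kept (assumed number).
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("select", SelectKBest(score_func=mutual_info_classif, k=10)),
        ("knn", KNeighborsClassifier(n_neighbors=k)),
    ])
    scores = cross_val_score(pipeline, X, y, cv=10, scoring="accuracy")
    print(f"k={k}: mean accuracy = {scores.mean():.3f}")
```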

