Prediction of Teachers' Lateness Factors Coming to School Using C4.5, Random Tree, Random Forest Algorithm

Research, 14 Dec 2020

Abstract: Arriving late to work can happen to anyone, including teachers. Teachers who arrive late at school set a bad example for their students. A study is therefore needed to determine the factors that cause teachers to arrive late. Data mining was chosen to process the available data, using three decision-tree classification algorithms: C4.5, Random Tree, and Random Forest. The performance of all three algorithms was tested, and the best one was determined by accuracy and AUC. The results show that Random Forest with pruning and pre-pruning performed best, with an accuracy of 74.63% and an AUC of 0.743. In this study, teachers who have a vehicle were late more often than those who do not.
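The two evaluation measures named in the abstract, accuracy and AUC, can be illustrated with a minimal sketch. It assumes binary labels (1 = late, 0 = on time) and a classifier that outputs a score per teacher. The function names, the 0.5 decision threshold, and the toy data are hypothetical and are not taken from the study. AUC is computed here by pairwise comparison of positive and negative scores, which may differ from how the authors' tool computes it.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example gets a
    higher score than a randomly chosen negative one; ties count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores for illustration only; not the study's data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]

# Hypothetical threshold: predict "late" when the score is at least 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen threshold, while AUC measures how well the scores rank late teachers above on-time ones regardless of threshold. That difference is one reason to report both when comparing classifiers.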



Advances in Social Science, Education and Humanities Research