The current Governor of DKI Jakarta, although elected in 2017, remains an interesting
subject of discussion and comment. Comments come directly from the news media or through
social media. Twitter is one of the social media platforms most often used to comment on the
elected governor, and such comments can even become a trending topic there. The netizens
who comment are varied: some consistently tweet criticism, some comment positively, and
some only retweet. This research predicts whether active netizens tend to post positive or
negative comments. The algorithms used are Decision Tree, Naïve Bayes, Random Forest, and
an Ensemble. The Twitter data is preprocessed before being modeled in RapidMiner. Four
trials were conducted in RapidMiner, each dividing the data into two parts, testing data and
training data. The splits compared are 10% testing data to 90% training data, 20% testing
data to 80% training data, 30% testing data to 70% training data, and 35% testing data to
65% training data. The average accuracy is 93.15% for the Decision Tree algorithm, 91.55%
for the Naïve Bayes algorithm, 93.41% for the Random Forest algorithm, and 93.42% for the
Ensemble algorithm.
Keywords: Decision Tree, Naïve Bayes, Random Forest, Ensemble, Twitter.
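The evaluation protocol described in the abstract (four holdout splits, one accuracy per split, then an average) can be sketched in plain Python. This is only an illustration under stated assumptions, not the RapidMiner workflow used in the study: the function names `holdout_split`, `accuracy`, `majority_vote`, and `mean_accuracy` are hypothetical, the shuffling seed is arbitrary, and simple majority voting is assumed as the ensemble rule because the abstract does not say which combination scheme was used.

```python
import random
from collections import Counter

# The four testing-data fractions named in the abstract.
TEST_FRACTIONS = [0.10, 0.20, 0.30, 0.35]


def holdout_split(rows, test_fraction, seed=0):
    """Shuffle a copy of rows and return (train, test) for the given fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def majority_vote(*label_lists):
    """Combine per-model label lists by simple majority vote.

    Assumed ensemble rule; the abstract does not specify one.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*label_lists)]


def mean_accuracy(evaluate, rows, fractions=TEST_FRACTIONS):
    """Average evaluate(train, test) over every split in fractions."""
    scores = [evaluate(*holdout_split(rows, f)) for f in fractions]
    return sum(scores) / len(scores)
```

Here `evaluate` stands for any function that trains a model on the training part and returns its accuracy on the testing part, so the same loop could be reused for each of the four algorithms.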
2 TESIS NUSAMANDIRI
Program Studi Ilmu Komputer (S2) STMIK Nusa Mandiri