Comparison of Text Mining Algorithms for Hotel Review Classification





The internet has made information easy to find: through websites and mobile applications, users can get the information they want directly, without having to visit a place in person. Tourism operators worldwide now publish details of the tourism products they offer. This information is very useful because people tend to look for quick information when booking, drawing on other people's reviews on social media, blogs, and websites. Hotel reviews are therefore an important source of information for travelers planning trips. It is not yet known which classification method achieves the highest accuracy on Indonesian-language hotel reviews. This study therefore compares the accuracy of the Naïve Bayes algorithm, Support Vector Machine, Decision Tree (C4.5), and Naïve Bayes with Particle Swarm Optimization feature selection. Of the four methods compared, the Decision Tree (C4.5) algorithm achieved the highest accuracy in classifying Indonesian-language hotel reviews, at 96.94%, followed by Naïve Bayes optimized with Particle Swarm Optimization feature selection at 95.91%, Naïve Bayes at 89.98%, and Support Vector Machine at 89.86%.
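The reported figures can be compared with a little arithmetic. The sketch below is only an illustration: it uses the four accuracies stated in the abstract, while the variable and function names (`reported_accuracy`, `gap`, `pso_gain`, `best_margin`) are hypothetical and not taken from the study. It ranks the methods and computes two gaps in percentage points.

```python
# Accuracies (%) exactly as reported in the abstract; nothing else is assumed.
reported_accuracy = {
    "Decision Tree (C4.5)": 96.94,
    "Naive Bayes + PSO feature selection": 95.91,
    "Naive Bayes": 89.98,
    "Support Vector Machine": 89.86,
}

# Rank the methods from highest to lowest reported accuracy.
ranking = sorted(reported_accuracy.items(), key=lambda kv: kv[1], reverse=True)


def gap(a, b):
    """Difference in percentage points between two methods (hypothetical helper)."""
    return round(reported_accuracy[a] - reported_accuracy[b], 2)


# Gain from adding PSO feature selection to plain Naive Bayes.
pso_gain = gap("Naive Bayes + PSO feature selection", "Naive Bayes")

# Margin of the best method over the runner-up.
best_margin = gap(ranking[0][0], ranking[1][0])

for name, acc in ranking:
    print(f"{name:<40} {acc:.2f}%")
print(f"PSO gain over plain Naive Bayes: {pso_gain:.2f} points")
print(f"Best method margin over runner-up: {best_margin:.2f} points")
```

On these figures, feature selection with Particle Swarm Optimization lifts Naïve Bayes by several percentage points, while the gap between the two leading methods is much smaller.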

Keywords: Hotel Reviews, Naïve Bayes, PSO, SVM, Decision Tree, C4.5


Field of study
Data Mining

