COMPARISON OF NAIVE BAYES AND K-NEAREST NEIGHBOR ALGORITHMS BASED ON PARTICLE SWARM OPTIMIZATION FOR SENTIMENT ANALYSIS OF RESTAURANT REVIEWS Thesis Rizki Aulianita (full paper)

Date

2015-08-03

Author

Abstract

This research falls within text mining. The problem it addresses is the choice of feature selection to improve the accuracy of Naive Bayes and K-Nearest Neighbor, and the comparison of which of the two reaches the higher accuracy for sentiment analysis of restaurant reviews. Both methods are optimized with Particle Swarm Optimization (PSO). Naive Bayes based on PSO yields an accuracy of 83.80% and an AUC of 0.784, while K-Nearest Neighbor based on PSO yields an accuracy of 80.60% and an AUC of 0.860. It can be concluded that applying optimization, PSO in particular, can improve the accuracy of Naive Bayes, and that the PSO-based Naive Bayes model can offer a more accurate and optimal solution to the restaurant review classification problem.

Keywords: k-Nearest Neighbors, Naive Bayes, Particle Swarm Optimization, Text mining
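
The abstract describes wrapping a classifier in PSO-driven feature selection but does not state which PSO variant or tooling was used. The following is a minimal sketch of the general idea only, assuming a binary PSO (sigmoid-mapped velocities) whose fitness is the leave-one-out accuracy of a small multinomial Naive Bayes. The toy reviews, the names `nb_predict`, `fitness`, and `binary_pso`, and all parameter values are hypothetical and are not taken from the thesis.

```python
import math
import random

# Toy labelled reviews (hypothetical data, not from the thesis): 1 = positive.
REVIEWS = [
    ("great food friendly staff", 1),
    ("delicious pasta great service", 1),
    ("friendly waiter delicious dessert", 1),
    ("tasty food great place", 1),
    ("terrible food rude staff", 0),
    ("cold pasta slow service", 0),
    ("rude waiter terrible dessert", 0),
    ("bland food slow place", 0),
]
VOCAB = sorted({w for text, _ in REVIEWS for w in text.split()})


def nb_predict(train, text, mask):
    """Multinomial Naive Bayes with Laplace smoothing over the selected words."""
    feats = {w for w, keep in zip(VOCAB, mask) if keep}
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        docs = [t for t, y in train if y == label]
        words = [w for t in docs for w in t.split() if w in feats]
        score = math.log(len(docs) / len(train))  # class prior
        for w in text.split():
            if w in feats:
                score += math.log((words.count(w) + 1) / (len(words) + len(feats)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def fitness(mask):
    """Leave-one-out accuracy of Naive Bayes restricted to the masked features."""
    if not any(mask):
        return 0.0  # an empty feature set is treated as useless
    hits = 0
    for i, (text, label) in enumerate(REVIEWS):
        train = REVIEWS[:i] + REVIEWS[i + 1:]
        hits += nb_predict(train, text, mask) == label
    return hits / len(REVIEWS)


def binary_pso(n_features, n_particles=10, n_iter=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask; velocity sets the bit probability."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # clamp to keep sigmoid stable
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


mask, score = binary_pso(len(VOCAB))
selected = [word for word, keep in zip(VOCAB, mask) if keep]
print("selected features:", selected)
print("leave-one-out accuracy:", score)
```

Swapping the classifier inside `fitness` for a k-Nearest Neighbor vote would give the second configuration compared in the abstract. On a real corpus the fitness would normally come from cross-validation rather than leave-one-out on eight sentences.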

URI
https://drive.google.com/file/d/1Uo1S7SDoYH0UnL1D7CeRZ_J1df-6zbKv/view?usp=sharing

Field of study
Information Systems
