Informative Tweet Classification of the Earthquake Disaster Situation in Indonesia
Abstract—During significant disasters, Twitter produces a large volume of tweets, some of which carry information that is valuable for planning humanitarian assistance. The purpose of this study is to train models that classify tweets as informative or not informative. Previous research identifies the Support Vector Machine as a suitable algorithm for this problem. Building on that work, this study extends the research model by adding SMOTE upsampling to handle class imbalance and the Gini Index, and by adding a Naïve Bayes algorithm as a point of comparison for classification accuracy. The data consist of tweets about earthquake events that occurred in Indonesia, collected with the RapidMiner application and preprocessed with GataFramework. With the proposed method, the Support Vector Machine algorithm reaches an accuracy of 81.03%, compared with 80.30% for the Naïve Bayes algorithm; judged by accuracy, both fall into the good classification category.
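The abstract names SMOTE upsampling and the Gini Index without giving their formulas. The following is a minimal pure-Python sketch of both ideas, not the RapidMiner or GataFramework implementation used in the study. It assumes that the Gini Index is used to score terms by the reduction in Gini impurity when documents are split on term presence, and that SMOTE interpolates between a minority sample and one of its nearest minority neighbours. The toy tweets, the labels, and the function names `gini_impurity`, `gini_gain`, and `smote` are hypothetical.

```python
import random
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(docs, labels, term):
    """Impurity reduction from splitting documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    weighted = (len(with_term) * gini_impurity(with_term)
                + len(without) * gini_impurity(without)) / len(labels)
    return gini_impurity(labels) - weighted

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        # Sort the remaining minority samples by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

# Hypothetical toy data: tokenized tweets, 1 = informative, 0 = not informative.
docs = [
    {"earthquake", "victims", "aid"},
    {"road", "damaged", "aid"},
    {"pray", "lombok"},
    {"earthquake", "pray"},
    {"scary", "earthquake"},
    {"pray", "stay", "safe"},
]
labels = [1, 1, 0, 0, 0, 0]

# Rank terms by Gini gain; higher means the term separates the classes better.
vocab = sorted(set().union(*docs))
ranked = sorted(vocab, key=lambda t: gini_gain(docs, labels, t), reverse=True)

# Binary bag-of-words vectors, then oversample the minority class to 4 vs 4.
vectors = [[1.0 if t in d else 0.0 for t in vocab] for d in docs]
minority = [v for v, y in zip(vectors, labels) if y == 1]
synthetic = smote(minority, n_new=2)

print("highest-scoring term:", ranked[0])
print("training set size after upsampling:", len(vectors) + len(synthetic))
```

In this toy set the term "aid" appears only in informative tweets, so it receives the highest score, while "earthquake" appears in both classes in the same proportion and scores zero. The balanced vectors could then be passed to any classifier, such as a Support Vector Machine or Naïve Bayes; that step is left out here because the abstract does not specify kernel or model settings.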