Comparison of the Performance of Machine Learning Algorithms for Sarcasm Detection in Bahasa
Perbandingan Kinerja Algoritma Machine Learning Untuk Mendeteksi Kalimat Sarkasme Dalam Bahasa Indonesia
Abstract
Twitter has become a widely used social media platform, and the volume of data it holds has prompted research such as sentiment analysis. Sentiment analysis runs into trouble with sarcastic sentences: a sentence whose sentiment should be negative is classified as positive because of the sarcasm. The purpose of this study is to compare the performance of three machine learning methods, namely Support Vector Machine, Random Forest, and K-Nearest Neighbor, for detecting sarcastic sentences on Twitter. These three methods were chosen because they perform well in text classification. The dataset consists of Indonesian-language tweets collected by crawling Twitter. The results show that the Support Vector Machine method performed best, with a recall of 0.97, a precision of 0.98, and an F1-score of 0.98.
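The abstract compares the classifiers by recall, precision, and F1-score. As a reminder of how these three metrics relate, here is a minimal sketch in plain Python. The function name `precision_recall_f1` and the toy labels are hypothetical and serve only as illustration; they are not the study's data or code, and the abstract does not say whether its scores are per-class or averaged.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: sarcastic tweets correctly flagged as sarcastic.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: non-sarcastic tweets wrongly flagged as sarcastic.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: sarcastic tweets the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (1 = sarcastic, 0 = not sarcastic); not from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Computing the same three numbers for each classifier on a shared held-out test set is one straightforward way to make a comparison like the one reported here.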
Copyright (c) 2022 Mochamad Alfan Rosid, Fajar Muharram, Ghozali Rusyid Affandi
This work is licensed under a Creative Commons Attribution 4.0 International License.