Comparison of the Performance of Machine Learning Algorithms for Sarcasm Detection in Bahasa

Perbandingan Kinerja Algoritma Machine Learning Untuk Mendeteksi Kalimat Sarkasme Dalam Bahasa Indonesia

  • Mochamad Alfan Rosid, Universitas Muhammadiyah Sidoarjo
  • Fajar Muharram, Universitas Muhammadiyah Sidoarjo
  • Ghozali Rusyid Affandi, Universitas Muhammadiyah Sidoarjo
Keywords: Twitter, Support Vector Machine, Random Forest, K-Nearest Neighbor

Abstract

Twitter has become a widely used social media platform, and the volume of data it holds has spurred research such as sentiment analysis. Sentiment analysis struggles with sarcastic sentences: a sentence whose sentiment should be negative is classified as positive because of the sarcasm. The purpose of this study is to compare the performance of three machine learning methods, namely Support Vector Machine, Random Forest, and K-Nearest Neighbor, for detecting sarcastic sentences on Twitter. These three methods were chosen because they perform well in text classification. The dataset consists of Indonesian-language tweets collected by crawling Twitter. The results show that the Support Vector Machine method performed best, with a recall of 0.97, a precision of 0.98, and an F1-score of 0.98.
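
The three metrics reported in the abstract can be illustrated with a minimal sketch. The labels below are invented toy data, not the study's dataset, and the helper name `precision_recall_f1` is hypothetical. The abstract does not state whether its figures are per-class or averaged, so this sketch assumes the simplest case of a single positive class (1 = sarcastic).

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one class from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    # True positives: sarcastic tweets correctly flagged as sarcastic
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: non-sarcastic tweets wrongly flagged as sarcastic
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: sarcastic tweets the classifier missed
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels (1 = sarcastic, 0 = not sarcastic); assumed for illustration only
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=1.00 recall=0.75 f1=0.86
```

In this toy case the classifier never raises a false alarm (precision 1.00) but misses one of four sarcastic tweets (recall 0.75), and F1 falls between the two.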

Published
2022-07-18
How to Cite
Rosid, M. A., Muharram, F., & Affandi, G. R. (2022). Comparison of the Performance of Machine Learning Algorithms for Sarcasm Detection in Bahasa: Perbandingan Kinerja Algoritma Machine Learning Untuk Mendeteksi Kalimat Sarkasme Dalam Bahasa Indonesia. Procedia of Social Sciences and Humanities, 3, 1192-1195. https://doi.org/10.21070/pssh.v3i.253