Publication: Sentiment Analysis from Turkish News Texts with BERT-Based Language Models and Machine Learning Algorithms
Authors
Demir, Engin
Bilgin, M.
Publisher:
Institute of Electrical and Electronics Engineers Inc.
Abstract
Sentiment analysis is a form of text analysis that identifies the emotional class a text expresses. In this study, sentiment analysis was performed on data obtained from Turkish news texts using BERT-based language models and machine learning algorithms. ALBERT, DistilBERT, and RoBERTa were used as the BERT-based language models, and Naive Bayes, Support Vector Machine, and Random Forest were used as the machine learning algorithms. The dataset contains 5000 two-class (positive-negative) sentences, with 90% of the data used for training and 10% for testing. The experimental results show that the language models achieved higher accuracy than the machine learning algorithms. In descending order of accuracy, the language models are DistilBERT, RoBERTa, and ALBERT, at 80%, 80%, and 77%, respectively; the machine learning algorithms are Naive Bayes, Support Vector Machine, and Random Forest, at 71%, 68%, and 68%, respectively.
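The 90%/10% split and the accuracy metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the helper names `split_90_10` and `accuracy`, the fixed random seed, and the placeholder corpus of numbered items are all assumptions made for illustration, and the record does not say whether the original split was shuffled or stratified.

```python
import random


def split_90_10(examples, seed=0):
    """Shuffle a copy of the examples and return (train, test) in a 90/10 split.

    Shuffling with a fixed seed is an assumption; the source does not
    describe how the split was drawn.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 9 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def accuracy(gold, predicted):
    """Return the fraction of predictions that match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Placeholder corpus: 5000 (sentence_id, label) pairs standing in for the
# real Turkish news sentences, which are not part of this record.
corpus = [(i, "positive" if i % 2 == 0 else "negative") for i in range(5000)]
train, test = split_90_10(corpus)
print(len(train), len(test))
```

With 5000 sentences this yields 4500 training and 500 test items, so each percentage point of reported accuracy corresponds to 5 test sentences; an accuracy of 80% means 400 of the 500 test sentences were classified correctly.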
Keywords
Sentiment Analysis, Machine Learning, Language Models, BERT