COMPARISON OF DIFFERENT CLASSIFICATION MODELS FOR SENTIMENT ANALYSIS
DOI:
https://doi.org/10.47344/sdubnts.v63i2.988
Keywords:
Kazakh language, sentiment analysis, Naive Bayes, Random Forest, Support Vector Machine, Logistic Regression, Scikit-learn
Abstract
In this work, we explored sentiment analysis techniques for
text, using product comments in the Kazakh language as an example. To do
this, we used machine learning methods such as Naive Bayes, Random Forest,
Logistic Regression and Support Vector Machine, as well as text processing
tools: CountVectorizer and TfidfVectorizer. We ran
experiments with different model configurations and
vectorizer parameters. To assess the quality of the models, we used
accuracy, precision, recall and F1-score metrics. The findings
indicate that machine learning techniques make it possible
to achieve high accuracy in sentiment analysis of comments. The best results
were obtained with the Support Vector Machine combined with TfidfVectorizer. This
study can be used to further improve systems for sentiment analysis of
comments in the Kazakh language, which can be useful for monitoring public
opinion in various areas, including business.
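
The comparison described in the abstract can be sketched with Scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' actual code: the toy English comments and labels are hypothetical placeholders for the Kazakh product comments, all hyperparameters are library defaults, the linear SVM kernel is an assumption (the abstract does not name one), and macro averaging of precision, recall and F1 is likewise an assumption.

```python
from itertools import product

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical placeholder data (1 = positive, 0 = negative).
# A real experiment would load labelled Kazakh product comments instead.
texts = [
    "great product works well", "very good quality", "love it fast delivery",
    "excellent value good price", "works great very happy", "good product recommend",
    "fast delivery great quality", "happy with this purchase",
    "bad product broke quickly", "very poor quality", "hate it slow delivery",
    "terrible value bad price", "broke quickly very unhappy", "bad product avoid",
    "slow delivery poor quality", "unhappy with this purchase",
]
labels = [1] * 8 + [0] * 8

# Hold out a stratified test set so both classes appear in it.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Factories, so that every pipeline gets fresh, unfitted components.
vectorizers = {
    "CountVectorizer": CountVectorizer,
    "TfidfVectorizer": TfidfVectorizer,
}
classifiers = {
    "Naive Bayes": lambda: MultinomialNB(),
    "Random Forest": lambda: RandomForestClassifier(random_state=0),
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    "Support Vector Machine": lambda: SVC(kernel="linear"),  # kernel is an assumption
}

results = {}
for (vec_name, make_vec), (clf_name, make_clf) in product(
    vectorizers.items(), classifiers.items()
):
    # Vectorizer and classifier are fitted together on the training split only.
    model = make_pipeline(make_vec(), make_clf())
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    results[(vec_name, clf_name)] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for (vec_name, clf_name), scores in results.items():
    summary = ", ".join(f"{name}={value:.2f}" for name, value in scores.items())
    print(f"{vec_name} + {clf_name}: {summary}")
```

Wrapping each vectorizer and classifier in a single pipeline keeps the vocabulary and IDF weights from being learned on the test split. Scores on a toy set this small say nothing about the models themselves; the sketch only shows the shape of the experiment.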