A new statistics-based approach to improve Word2Vec's sentiment classification success

Date

2021

Journal Title

Journal ISSN

Volume Title

Publisher

Selçuk Üniversitesi

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Sentiment classification is the task of predicting the emotion a text conveys by analyzing its content. Studies that estimate the emotion of a sentence or document, rather than the meaning of a single word, have increased in recent years. This study presents statistical approaches as alternatives to the way Word2Vec is currently used in sentiment classification. Currently, to classify sentiment with Word2Vec, the arithmetic mean of the vectors of all words in the document is taken. This study compares the performance of the statistical methods proposed as alternatives to the arithmetic mean, using 5 machine learning methods on 2 data sets. The same experiments were also run with Doc2Vec and BoW, and the results were compared with those of Word2Vec. Among the proposed approaches, the median achieved better results than both the mean and the other two proposed methods. A plausible reason is that the median better represents the central tendency of the distribution. Although Word2Vec-CBOW obtained values similar to those of Skip-gram (SG), it produced more stable results. Word2Vec achieved better results than both Doc2Vec and BoW. Overall, the median has a positive effect on classification success when used with Word2Vec and can serve as an alternative to the mean approach used in the literature.
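The difference between mean and median pooling of word vectors can be illustrated with a minimal sketch. This is not the paper's implementation: the toy three-dimensional vectors, the `pool_document` helper, and the sample document are all hypothetical, and in practice the vectors would come from a trained Word2Vec model. Only the mean and the median are shown, since the abstract does not name the other two proposed methods.

```python
from statistics import mean, median

# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
# The values are made up for illustration; "terrible" is a deliberate outlier.
toy_vectors = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "movie": [0.1, 0.5, 0.4],
    "terrible": [-5.0, 0.3, 0.2],
}


def pool_document(tokens, vectors, method="mean"):
    """Collapse the word vectors of a document into one vector, per dimension."""
    # Skip tokens that have no vector (out-of-vocabulary words).
    rows = [vectors[t] for t in tokens if t in vectors]
    if not rows:
        raise ValueError("document contains no known tokens")
    reducers = {"mean": mean, "median": median}
    if method not in reducers:
        raise ValueError(f"unknown pooling method: {method}")
    reduce = reducers[method]
    # zip(*rows) iterates over dimensions, so each dimension is pooled separately.
    return [reduce(column) for column in zip(*rows)]


doc = ["good", "great", "movie", "terrible", "unknownword"]
mean_vec = pool_document(doc, toy_vectors, "mean")
median_vec = pool_document(doc, toy_vectors, "median")
print("mean:  ", mean_vec)
print("median:", median_vec)
```

In this toy document, the single outlying value in the first dimension pulls the mean far from where most words lie, while the median stays close to them. Either pooled vector could then be passed as the feature vector to a classifier.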

Description

Keywords

Natural Language Processing, Machine Learning, Word Vectors, Sentiment Classification, Deep Learning

Source

Selcuk University Journal of Engineering Sciences

WoS Q Value

Scopus Q Value

Volume

20

Issue

3

Citation

Bilgin, M. (2021). A new statistics-based approach to improve Word2Vec's sentiment classification success. Selcuk University Journal of Engineering Sciences, 20(3), 63-72.