Comparative Evaluation of Lexicons in Performing Sentiment Analysis
Twitter is one of the fastest-growing social media platforms, allowing users to express themselves in short text messages on a wide range of topics. The volume of text produced makes it a useful source for understanding human behaviour, and sentiment analysis is one of the analyses that can be performed on it. Although sentiment analysis has been researched for many years, several difficulties remain, such as handling internet slang, abbreviations, and emoticons, all of which are common in social media. This paper investigates the performance of two lexicons, VADER and TextBlob, in performing sentiment analysis on 7,997 tweets. From these, 300 tweets were randomly selected, and three experts in psychology and human development were asked to classify them manually into three polarities. The study finds that both lexicons achieve an acceptable accuracy: 79% for VADER and 73% for TextBlob. Considering all of the performance scores, VADER emerged as the better lexicon compared to TextBlob. The results of this study are intended to help researchers decide which lexicon to use when performing sentiment analysis on social media texts, including microblogs.
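The evaluation described above, comparing lexicon output with expert labels, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say how the three experts' labels were combined, what the three polarities were called, or which score cut-offs were used. The sketch assumes positive/negative/neutral classes, a majority vote among the experts, and a symmetric threshold of 0.05 on a lexicon score in [-1, 1]. The function names and the toy data are hypothetical.

```python
from collections import Counter


def score_to_polarity(score, threshold=0.05):
    """Map a lexicon score in [-1, 1] to a polarity label.

    The 0.05 cut-off is an assumption for illustration; the abstract
    does not state which thresholds were used.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def majority_label(labels):
    """Return the label chosen by more than half of the annotators, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def accuracy(predicted, gold):
    """Fraction of predictions matching the gold label, skipping items without one."""
    pairs = [(p, g) for p, g in zip(predicted, gold) if g is not None]
    if not pairs:
        raise ValueError("no items with a gold label")
    return sum(p == g for p, g in pairs) / len(pairs)


# Toy data, invented for illustration only (not taken from the study).
lexicon_scores = [0.6, -0.4, 0.0, 0.2, 0.3]
expert_labels = [
    ("positive", "positive", "neutral"),
    ("negative", "negative", "negative"),
    ("neutral", "positive", "neutral"),
    ("negative", "neutral", "positive"),  # no majority, so excluded
    ("negative", "negative", "neutral"),
]

predicted = [score_to_polarity(s) for s in lexicon_scores]
gold = [majority_label(labels) for labels in expert_labels]
print(f"accuracy = {accuracy(predicted, gold):.2f}")
```

In practice the scores would come from each lexicon in turn (for example, a compound-style score from VADER and a polarity score from TextBlob), and the same `accuracy` call would be repeated per lexicon so that the two can be compared on identical gold labels.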