Abstract
Twitter is one of the most popular social media platforms, gathering concise user messages about the topics of the moment. Extracting sentiment from tweets is challenging because of the complexity of natural language, misspellings, and abbreviated word forms. The goal of this article is to present a hybrid feature for Twitter Sentiment Analysis that draws on information specific to this platform. A baseline perspective is presented across different scenarios that take into account preprocessing techniques, data representations, methods, and evaluation measures. Several features are also described in detail: the hashtag-based feature, the fused feature, and the raw text feature. These perspectives are highlighted to show the importance and impact that the analysis of tweets has on social studies and on society in general. We conducted several experiments covering all these features at two levels of tweet granularity (word and bigram) on the Sanders dataset. The results indicate that the best polarity classification performance is produced by the fused feature, although overall the raw feature is better than the other approaches. Therefore, domain-specific features (in this case Twitter hashtags) carry information that can be an important, and so far underexploited, factor in the polarity classification task.
Citation
Limboi S., Diosan L., Hybrid Features for Twitter Sentiment Analysis, International Conference on Artificial Intelligence and Soft Computing