Event Details

StretchVADER – A Rule-based Technique to Improve Sentiment Intensity Detection using Stretched Words and Fine-Grained Sentiment Analysis

Presenter: Muhammad Naveed Jokhio
Supervisor:

Date: Mon, January 8, 2024
Time: 14:00
Place: ZOOM - Please see below.

Zoom Details:

Join Zoom Meeting

https://uvic.zoom.us/j/6881672978?pwd=clJ0ckRjUWZrcjdGTmRuQ2dPb01Xdz09

Meeting ID: 688 167 2978

Password: 645201

One tap mobile

+16475580588,,6881672978# Canada

+17789072071,,6881672978# Canada

Dial by your location

        +1 647 558 0588 Canada

        +1 778 907 2071 Canada

Meeting ID: 688 167 2978

Note: Please log in to Zoom via SSO and your UVic Netlink ID

ABSTRACT

When someone in a horror movie shouts “HEEEELLLPPPPPPPPP”, or someone replies to your joke with a huge “HAHAHAHAHAHAHAHAHAHAHA”, the elongation is known as word stretching. Word stretching is not only an integral part of spoken language but is also found in many texts. Though it is rare in formal writing, it is used frequently on social media. Word stretching emphasizes the meaning of the underlying word, changes the context, and affects the sentiment intensity of the sentence.
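
For illustration only, here is a minimal Python sketch of one way stretched words could be flagged automatically, using a simple repeated-character rule (the detection rules used in this work may differ):

    import re

    # Heuristic (an assumption, not necessarily the rule set used in this work):
    # treat a word as "stretched" if any character repeats three or more times
    # in a row, since standard English spelling rarely does this.
    STRETCH_RE = re.compile(r"(\w)\1{2,}")

    def find_stretched_words(text):
        """Return the tokens in `text` that contain a stretched character run."""
        return [tok for tok in text.split() if STRETCH_RE.search(tok)]

    print(find_stretched_words("HEEEELLLPPPPPPPPP that was sooo scary"))
    # -> ['HEEEELLLPPPPPPPPP', 'sooo']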

In this work, a rule-based, fine-grained approach to sentiment analysis named StretchVADER is introduced that extends the capabilities of the rule-based approach VADER. StretchVADER improves sentiment intensity detection by using textual features such as stretched words and smileys to calculate a StretchVADER Score (SVS). This score is also used to label the dataset. Stretched words and smileys appear in many tweets, e.g. in 28.5% of a dataset randomly extracted from Twitter. A dataset containing detailed features related to stretched words and smileys is also generated and annotated using the SVS. Finally, Machine Learning (ML) models are evaluated on two data encoding techniques, TF-IDF and Word2Vec. The results show that the XGBoost algorithm with 1500 gradient-boosted trees and TF-IDF encoding achieved higher accuracy, precision, recall and F1-score than the other ML models, namely 91.24%, 91.11%, 91.24% and 91.08%, respectively.
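
As a rough sketch of the two stages described above: the exact SVS formula is not given here, so the stretch adjustment below is a placeholder assumption, while the classifier mirrors the reported TF-IDF + XGBoost configuration with 1500 gradient-boosted trees, run on toy data only.

    import re

    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    from sklearn.feature_extraction.text import TfidfVectorizer
    from xgboost import XGBClassifier

    STRETCH_RE = re.compile(r"(\w)\1{2,}")
    analyzer = SentimentIntensityAnalyzer()

    def svs_like_score(text, boost=0.1):
        """Placeholder for the StretchVADER Score: the VADER compound score,
        nudged toward its own sign once per stretched word (smiley handling
        omitted; the actual SVS rules may differ)."""
        compound = analyzer.polarity_scores(text)["compound"]
        n_stretched = sum(1 for tok in text.split() if STRETCH_RE.search(tok))
        sign = 1.0 if compound >= 0 else -1.0
        return max(-1.0, min(1.0, compound + sign * boost * n_stretched))

    # Label a tiny toy corpus with the placeholder score, then train the
    # TF-IDF + XGBoost setup named above (1500 gradient-boosted trees).
    corpus = [
        "That movie was sooooo good hahaha",
        "This is terrrrrible, I hate it",
        "Pretty average day, nothing special",
        "I looooove this song!!!",
    ]
    labels = [1 if svs_like_score(t) >= 0 else 0 for t in corpus]

    X = TfidfVectorizer().fit_transform(corpus)
    clf = XGBClassifier(n_estimators=1500)
    clf.fit(X, labels)
    print(clf.predict(X))

In practice the SVS-labelled dataset would be split into training and test sets before reporting the accuracy, precision, recall and F1-score figures quoted above.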