Textual Affect: A Computational Study Towards the Detection, Utility, and Algorithmic Fairness of Emotions in Text
Abstract
Emotions are at the core of what makes us human and are highly useful in modeling human behavior. Research in Affective Computing develops computational systems capable of understanding and expressing emotions by adapting to human emotional states across heterogeneous modalities such as text, visuals, and audio. With the boom of online social media and micro-blogging platforms, text remains the most commonly used modality for expressing and sharing emotions. This Thesis presents a computational study of emotions in text (Textual Affect), exploring three distinct and significant facets of textual affective computing. The first facet is the detection of textual emotions, specifically from the readers’ perspective, i.e., Readers’ Emotion Detection. The second facet studies how textual affect can be utilized to improve the performance of a downstream task; to this end, it considers fake news detection, a significant application, within the crucial domain of health. The third facet considers the algorithmic fairness perspective of textual affective computing, a recent and demanding area of research related to ethics in Artificial Intelligence. In this direction, the study attempts to identify the existence of affective bias, if any, in textual affective computing systems developed using large pre-trained language models.
The first facet, Readers’ Emotion Detection, develops a novel deep learning-based model, REDAffectiveLM, to predict readers’ emotion profiles from short-text documents. The proposed model combines a transformer-based pre-trained language model with an affect-enriched Bi-LSTM+Attention network to leverage both contextual and affect-enriched representations. To conduct the study, two Readers’ Emotion News datasets are procured, along with a benchmark dataset. An extensive set of performance evaluations shows that the proposed model significantly outperforms baselines belonging to various categories. The study also presents behavior evaluation experiments over the affect-enriched Bi-LSTM+Attention network, which show that affect enrichment helps to identify key terms responsible for readers’ emotion detection, thereby improving the prediction.
The second facet considers the utility of textual affect for detecting fake news in the health domain and presents evidence that emotion-cognizant representations are significantly better suited to the task. The study proposes a novel methodology to develop emotion-amplified text representations by leveraging an external emotion lexicon. To conduct the study, a dataset containing fake and legitimate health news articles is procured. Evaluations analyze the utility of emotion-amplified representations over raw text representations for identifying health-related fake news in various supervised and unsupervised scenarios. The experiments show consistent and notable empirical gains over a range of technique types and parameter settings, establishing the utility of the emotional information in news articles, an often overlooked aspect, for misinformation identification in the health domain.
The third facet is a novel direction of inquiry to identify the existence of Affective Bias, if any, in large pre-trained language model-based textual emotion detection models.
That is, the study intends to unveil any biased association of emotions such as anger, fear, joy, and sadness with any particular gender, racial, or religious group. The study first analyzes the large-scale corpora used to pre-train and fine-tune the pre-trained language models for imbalanced affect distributions, or imbalanced affect associations with any particular social group, in order to identify corpus-level affective bias. It then conducts an extensive set of class-based and intensity-based evaluations, using synthetic and non-synthetic bias evaluation corpora, to identify prediction-level affective bias. Together, the results reveal the existence of affective bias with respect to gender, race, and religion at both the corpus level and the prediction level of large pre-trained language models.
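As a rough illustration of what an intensity-based, prediction-level evaluation on a synthetic corpus might look like, the sketch below fills sentence templates with terms for different social groups and compares the mean predicted emotion intensity per group. It is not the procedure used in the Thesis: the templates, the group terms, the function names, and the deliberately biased `toy_anger_scorer` (a stand-in for a fine-tuned emotion model) are all assumptions made for this example.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical templates and group terms, invented for illustration only.
TEMPLATES: List[str] = [
    "{person} feels furious about the decision.",
    "The conversation with {person} was irritating.",
]
GROUP_TERMS: Dict[str, List[str]] = {
    "female": ["my sister", "this woman"],
    "male": ["my brother", "this man"],
}


def toy_anger_scorer(sentence: str) -> float:
    """Stand-in for an emotion model; returns an anger intensity in [0, 1].

    It is deliberately biased (adds 0.1 for the words "brother" or "man")
    so that the audit below has something to detect.
    """
    words = sentence.lower().rstrip(".").split()
    score = 0.5 if ("furious" in words or "irritating" in words) else 0.0
    if "brother" in words or "man" in words:
        score += 0.1
    return min(score, 1.0)


def group_mean_intensity(
    scorer: Callable[[str], float],
    templates: List[str],
    group_terms: Dict[str, List[str]],
) -> Dict[str, float]:
    """Mean predicted intensity per group over all template/term combinations."""
    return {
        group: mean(
            scorer(template.format(person=term))
            for template in templates
            for term in terms
        )
        for group, terms in group_terms.items()
    }


def intensity_gap(group_means: Dict[str, float]) -> float:
    """Largest difference in mean intensity between any two groups."""
    return max(group_means.values()) - min(group_means.values())


if __name__ == "__main__":
    means = group_mean_intensity(toy_anger_scorer, TEMPLATES, GROUP_TERMS)
    print(means, intensity_gap(means))
```

Because the sentences in each pair differ only in the group term, a non-zero gap points to the scorer treating the groups differently. A real audit would replace the toy scorer with the model under test and would need a significance test over many more templates.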