
A Deep Ensemble Approach of Anger Detection From Audio-Textual Conversation


dc.contributor.advisor Bayzid, Dr. Md. Shamsuzzoha
dc.contributor.author Nahar, Mahjabin
dc.date.accessioned 2023-08-08T04:27:42Z
dc.date.available 2023-08-08T04:27:42Z
dc.date.issued 2022-05-15
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6427
dc.description.abstract Anger detection from conversations has many real-life applications, including improving interpersonal communication, providing customer service, and enhancing workplace performance. Despite its numerous applications across domains, anger is one of the least studied basic human emotions. Most existing work on anger detection deals with audio-only data, even though text transcriptions can be obtained directly from spoken conversations. In this thesis, we propose novel deep learning-based approaches for offline and online anger detection from audio-textual data obtained from real-life conversations. Offline anger detection detects anger in a pre-collected audio-textual conversation, while online anger detection predicts anger in the subsequent utterances of a conversation from the previous utterances. For offline anger detection, we introduce an ensemble approach that combines handcrafted acoustic features, SincNet-based raw waveform features, and BERT-based textual features in a mid-level fusion scheme within an attention-based CNN architecture. In addition, the model includes a gender classifier to incorporate gender information into offline anger detection. For online anger detection, we propose a transformer-based technique that combines audio and textual features in a mid-level fusion scheme, utilizing an ensemble-based downstream classifier. We demonstrate the efficacy of our proposed approaches on two data sets: the Bengali call-center data set and the IEMOCAP data set. Experimental results show that our proposed approaches outperform the state-of-the-art baselines by a significant margin. For offline anger recognition, our model achieves an F1 score of 85.5% on the Bengali call-center data set and 91.4% on the IEMOCAP data set. For online anger recognition, our model yields an F1 score of 66.9% on the Bengali call-center data set and 67.7% on the IEMOCAP data set. Additionally, we vary utterance parameters, such as the numbers of input and output utterances, and observe their effect on anger detection performance. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE) en_US
dc.subject Machine learning en_US
dc.title A Deep Ensemble Approach of Anger Detection From Audio-Textual Conversation en_US
dc.type Thesis-MSc en_US
dc.contributor.id 1017052003 en_US
dc.identifier.accessionNumber 119097
dc.contributor.callno 006.31/MAH/2022 en_US
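
To make the fusion scheme described in the abstract concrete, the PyTorch sketch below shows one plausible reading of the offline model: three modality branches (handcrafted acoustic features, a raw-waveform front end, and a transcript embedding) projected into a shared space, self-attention across the modality embeddings as the mid-level fusion step, and an auxiliary gender head alongside the anger head. This is a minimal illustration, not the thesis's actual architecture: all layer sizes are assumed, a plain Conv1d stands in for SincNet, and pre-computed BERT sentence embeddings are taken as input.

```python
# Illustrative sketch of mid-level audio-textual fusion for anger detection.
# All dimensions and layer choices are assumptions; Conv1d is a stand-in for
# SincNet, and text_emb is assumed to be a pre-computed BERT embedding.
import torch
import torch.nn as nn

class MidFusionAngerNet(nn.Module):
    def __init__(self, n_acoustic=88, text_dim=768, d=128):
        super().__init__()
        # Branch 1: handcrafted acoustic features (fixed-size vector per utterance).
        self.acoustic = nn.Sequential(nn.Linear(n_acoustic, d), nn.ReLU())
        # Branch 2: raw-waveform front end (plain Conv1d standing in for SincNet).
        self.wave = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=251, stride=16), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, d), nn.ReLU(),
        )
        # Branch 3: pre-computed BERT sentence embedding of the transcript.
        self.text = nn.Sequential(nn.Linear(text_dim, d), nn.ReLU())
        # Mid-level fusion: self-attention over the three modality embeddings.
        self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.anger_head = nn.Linear(d, 2)   # angry vs. not angry
        self.gender_head = nn.Linear(d, 2)  # auxiliary gender classifier

    def forward(self, acoustic_feats, waveform, text_emb):
        z = torch.stack([
            self.acoustic(acoustic_feats),
            self.wave(waveform),
            self.text(text_emb),
        ], dim=1)                      # (batch, 3 modalities, d)
        fused, _ = self.attn(z, z, z)  # attention across modalities
        pooled = fused.mean(dim=1)     # pool fused modality embeddings
        return self.anger_head(pooled), self.gender_head(pooled)

model = MidFusionAngerNet()
anger_logits, gender_logits = model(
    torch.randn(4, 88),        # handcrafted acoustic feature vectors
    torch.randn(4, 1, 48000),  # 3 s of 16 kHz raw audio per utterance
    torch.randn(4, 768),       # BERT [CLS] embeddings of transcripts
)
print(anger_logits.shape, gender_logits.shape)  # torch.Size([4, 2]) twice
```

For the online variant, the abstract suggests replacing this per-utterance fusion with a transformer encoder over a window of past fused utterance embeddings, whose output feeds an ensemble of downstream classifiers that predict anger in the upcoming utterances.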

