DSpace Repository

Ensemble approach with insightful features for spoiler detection

dc.contributor.advisor Islam, Dr. Md. Monirul
dc.contributor.author Noor, Sabah Binte
dc.date.accessioned 2019-02-17T04:46:12Z
dc.date.available 2019-02-17T04:46:12Z
dc.date.issued 2018-03-21
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/5119
dc.description.abstract Suspense is an important element for absorbing an audience into a story. Revealing plot twists, the climax, or the ending early may eliminate that suspense and therefore impair the audience's enjoyment. Any content that contains such critical information about a work of fiction is considered a spoiler. Given the heavy use of the internet and smartphones, it has become impossible to avoid spoilers posted on popular social networks. The aim of this study is to develop an effective machine learning model to detect spoilers in text. Extracting relevant features that efficiently represent the content of a text is one of the major challenges of this problem. Therefore, we employ syntactically related word pairs, along with the traditional bag-of-words, in our feature extraction technique. Naturally, the number of spoilers in a dataset is significantly lower than the number of spoiler-free texts. To tackle this imbalance in the data distribution, we propose a novel distribution-based amalgam minority oversampling technique (DAMOT). It oversamples the dataset with a combination of original and synthetic minority instances, based on the distribution over their classes. We also employ the AdaBoost algorithm to enhance the performance of our model. Our proposed models have been tested extensively on IMDb (Internet Movie Database) reviews, and DAMOT, combined with our feature extraction technique, outperformed the baseline methods by a significant margin, yielding more balanced results across different performance metrics. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering en_US
dc.subject Machine learning en_US
dc.title Ensemble approach with insightful features for spoiler detection en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0413052049 en_US
dc.identifier.accessionNumber 116816
dc.contributor.callno 006.31/SAB/2018 en_US
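
The abstract describes oversampling the minority (spoiler) class with a combination of original and synthetic instances. The record does not give the details of DAMOT, so the following is only a minimal sketch of that general idea and not the thesis's algorithm: it pads a minority class with some duplicated originals and some points interpolated between random pairs of originals (a SMOTE-like step swapped in for illustration). The function name `oversample_minority`, the `synthetic_ratio` parameter, and the toy feature vectors are all assumptions made for this example.

```python
import random


def oversample_minority(minority, target_size, synthetic_ratio=0.5, seed=0):
    """Grow `minority` (a list of numeric feature vectors) to `target_size` rows.

    Hypothetical illustration only: new rows are a mix of duplicated
    originals and synthetic points interpolated between two originals.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority instances to interpolate")
    rng = random.Random(seed)

    n_new = max(0, target_size - len(minority))
    n_synth = round(n_new * synthetic_ratio)  # share filled with synthetic rows
    n_copy = n_new - n_synth                  # remainder filled with duplicates

    # Duplicated original instances, drawn with replacement.
    copies = [list(rng.choice(minority)) for _ in range(n_copy)]

    # Synthetic instances on the line segment between two distinct originals.
    synthetic = []
    for _ in range(n_synth):
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])

    return [list(v) for v in minority] + copies + synthetic


# Toy minority class with three 2-dimensional feature vectors (made-up data).
minority = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
balanced = oversample_minority(minority, target_size=10)
```

In this sketch `synthetic_ratio` fixes the split between duplicated and synthetic rows by hand; according to the abstract, DAMOT instead bases the combination on the distribution over the classes.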

