Abstract:
Literature, as an imitation of human behavior, portrays a picture of society. Literary analysis interprets literature through critical thinking from multiple perspectives. Analyzing the writing styles of authors and articles is key to supporting stylometry tasks such as author attribution and genre identification. Over the years, rich feature sets, including stylometric features, bag-of-words, and n-grams, have been widely used to perform such literary analysis. However, the effectiveness of these features depends largely on the linguistic aspects of a particular language and the characteristics of the datasets, so techniques based on these feature sets do not give the desired results across domains. In addition, social structures and real-world incidents often influence contemporary literary fiction. Existing research in literary fiction analysis, however, explains these phenomena from a mostly non-technical perspective, through the critical analysis of stories. Character networks (or graphs) can be particularly suitable in this scenario for retrieving information from fiction to address various high-level problems.
In this study, we perform literary analysis from both perspectives: solving stylometry tasks and incorporating character networks. We are the first to utilize character interaction graphs to answer a wide range of social questions regarding the influence of contemporary society on literary fiction. Our study involves constructing character interaction graphs from fiction, extracting graph features, and exploiting these features to resolve these queries. Experimental evaluation of influential Bengali fiction spanning more than half a century demonstrates that character interaction graphs can be highly effective for certain types of assessment and information retrieval from literary fiction. We also propose a novel word2vec graph-based model of a story that captures both the context and the structure of the story. Using these word2vec graph-based features, we develop a classification technique to perform several stylometry tasks: author attribution, genre detection, and stylochronometry. Our detailed experimental study with a comprehensive set of literary writings by famous authors of Bengali literature shows the effectiveness of this method over traditional feature-based approaches.
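
To make the idea of a character interaction graph concrete, the following is a minimal sketch and not the construction used in this study. It assumes that two characters interact whenever their names occur in the same sentence, and it derives two simple graph features (weighted degree and edge density). The function names, the sentence-splitting rule, and the toy story with its character names are all hypothetical and serve only as illustration.

```python
from collections import Counter
from itertools import combinations
import re


def build_interaction_graph(text, characters):
    """Count, for each pair of characters, the sentences that name both.

    Assumption: sentence-level co-occurrence stands in for an interaction.
    """
    edges = Counter()
    # Naive sentence split on '.', '!', '?' and the Bengali danda (U+0964).
    for sentence in re.split(r"[.!?\u0964]+", text):
        present = sorted(c for c in characters if c in sentence)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


def graph_features(edges, characters):
    """Return the weighted degree of each character and the edge density."""
    degree = {c: 0 for c in characters}
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    n = len(characters)
    possible = n * (n - 1) // 2
    density = len(edges) / possible if possible else 0.0
    return {"weighted_degree": degree, "density": density}


# Invented toy story and cast, used only to exercise the functions.
story = "Rina met Kamal. Kamal and Rina ran to the field. Mitu called Rina!"
cast = ["Rina", "Kamal", "Mitu"]
graph = build_interaction_graph(story, cast)
features = graph_features(graph, cast)
```

In this toy example, the character named in the most shared sentences receives the highest weighted degree. Features of this kind could then be passed to a classifier or compared across stories, although the features actually used in the study may differ.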