Abstract:
With the rapid advancement of information and communication technology, people increasingly connect with each other through social media, sharing content such as texts, images, and posts. As sharing thoughts, feelings, and opinions on social media has become an indispensable part of daily life, these platforms have also made people significantly more likely to become victims of cyberbullying than before. Social distancing in the wake of the COVID-19 pandemic has contributed to a notable rise in cyberbullying victimization on social media. This work proposes a hybrid deep-learning classifier that combines a self-attention layer with a BiLSTM to distinguish bully from non-bully texts in the Bangla language across different social media platforms. We collected and labeled our dataset from Facebook, YouTube, Twitter, TikTok, and other platforms. Context-based data augmentation is applied to improve the performance of the model. We experiment with existing algorithms for sentiment analysis tasks, including SVM, Random Forest, Naive Bayes, LSTM, GRU, and BERT, and present a comparative analysis of these models against our proposed hybrid model. This research combines prominent feature extraction techniques: count vectorizer, TF-IDF, and transformer-based contextual word embeddings. The experimental results show that our proposed hybrid model outperforms all previous work on cyberbullying detection in Bangla, achieving 89.3% accuracy.
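As an illustration of the count-vectorizer and TF-IDF features named above, the following is a minimal pure-Python sketch, not the implementation used in this work. It assumes whitespace tokenization and one common TF-IDF variant (term frequency normalized by document length, unsmoothed logarithmic inverse document frequency); libraries often use smoothed variants instead. The function names `count_vectors` and `tfidf_vectors` and the toy English sentences are hypothetical and chosen only for readability.

```python
import math
from collections import Counter


def count_vectors(docs):
    """Return (vocabulary, raw term-count vectors) for whitespace-tokenized docs."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / doc length
    and idf = ln(N / df). This is an assumed variant, not the paper's."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for vec in counts:
        total = sum(vec) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(vec, idf)])
    return vocab, vectors


# Toy, made-up sentences for illustration only (not from the dataset).
toy_docs = ["you are kind", "you are stupid", "stupid stupid comment"]
vocab, counts = count_vectors(toy_docs)
_, weights = tfidf_vectors(toy_docs)
```

In this sketch a term that appears in fewer documents receives a larger weight than an equally frequent term that appears in many, which is the property that separates TF-IDF from raw counts.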