dc.description.abstract |
Although a number of research studies have been done on natural language processing (NLP) in different areas such as Example-based Machine Translation (EBMT) and Phrase-based Machine Translation, for different language pairs such as English to Bengali, very few research studies have been done on Bengali to English translation. Popular and widely available translators such as Google Translator perform reasonably well when translating among popular languages such as English, French, or Spanish; however, they make elementary mistakes when translating languages that are newly introduced to the system, such as Bengali and Arabic.
Google Translator uses a Neural Machine Translation (NMT) approach with a Recurrent Neural Network (RNN) to build its multilingual translation system. Prior to NMT, Google Translator used a Statistical Machine Translation (SMT) approach. However, these approaches depend solely on the availability of a large parallel corpus for the language pair being translated. As a result, most of the NLP research studies found so far have been performed with English as the base or source language, and a good number of widely spoken languages remain nearly unexplored. Bengali, the eighth most widely used language in the world, is one of the prominent examples among those languages. Therefore, in this study, we explore improved translation from Bengali to English. To do so, we study both the rule-based translator and the data-driven machine translators (NMT and SMT) in isolation, and in combination using different approaches of blending between them. More specifically, first, we implement some basic grammatical rules along with identification of names as subjects and optimization of Bengali verbs in our rule-based translator. Next, we integrate our rule-based translator with each of the data-driven machine translators (NMT and SMT) separately using different approaches. Finally, we perform rigorous experimentation over different datasets to compare the different approaches in terms of translation accuracy, time complexity, and space complexity. |
en_US |