Towards detecting, extracting, and parsing the address information from Bangla signboard: a deep learning-based approach


dc.contributor.advisor	Ali, Dr. Mohammed Eunus
dc.contributor.author	Murad, Hasan
dc.date.accessioned	2024-01-13T03:41:31Z
dc.date.available	2024-01-13T03:41:31Z
dc.date.issued	2023-06-22
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6528
dc.description.abstract	Retrieving textual information from natural scene images is an active research area in the field of computer vision with numerous practical applications. Detecting text regions and extracting text from signboards is a challenging problem due to special characteristics like reflecting lights, uneven illumination, or shadows found in real-life natural scene images. With the advent of deep learning-based methods, different sophisticated techniques have been proposed for text detection and text recognition from the natural scene. Though a significant amount of effort has been devoted to extracting natural scene text for resourceful languages like English, little has been done for low-resource languages like Bangla. In this research work, we have proposed an end-to-end system with deep learning-based models for efficiently detecting, recognizing, correcting, and parsing address information from Bangla signboards. We have created manually annotated datasets and synthetic datasets to train signboard detection, address text detection, address text recognition, address text correction, and address text parser models. We have conducted a comparative study among different CTC-based and Encoder-Decoder model architectures for Bangla address text recognition. Moreover, we have designed a novel address text correction model using a sequence-to-sequence transformer-based network to improve the performance of Bangla address text recognition model by post-correction. Finally, we have developed a Bangla address text parser using the state-of-the-art transformer-based pre-trained language model.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering (CSE)	en_US
dc.subject	Optical character recognition	en_US
dc.title	Towards detecting, extracting, and parsing the address information from Bangla signboard: a deep learning-based approach	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	1018052017	en_US
dc.identifier.accessionNumber	119453
dc.contributor.callno	006.424/HAS/2023	en_US