Bangla voice command recognition with context specific optimization

Nafis Sadeq

URI: http://lib.buet.ac.bd:8080/xmlui/handle/123456789/5761

Date: 2021-02-27

Abstract:

Voice command recognition commonly involves an Automatic Speech Recognition (ASR) system with context-specific optimization. Developing an ASR system requires corpus resources such as a phoneme list, a text corpus, a word dictionary, a phonetic dictionary, and a speech corpus. These resources are used to train speech recognition models. The performance of a speech recognition system can be improved further by exploiting user-specific and device-specific context. For a smartphone user, context information includes contact names, installed apps, songs, media files, location, recent search history, and the content of the screen the user is viewing. Because this information changes frequently, the contextual model should be updated on the fly on the device. Traditional speech recognition systems usually consist of several individual components, such as an acoustic model, a language model, and a pronunciation dictionary, so context-specific optimization can be achieved by tuning a single component such as the language model. Recently, end-to-end speech recognition architectures have proven very effective in many speech recognition tasks. Incorporating context-specific optimization into these end-to-end architectures requires a different approach.

In this work, we focus on Bangla voice command recognition. We develop an ASR system for voice command recognition and further improve its performance using context-specific optimization. We develop each linguistic resource in a way that accounts for the language-specific characteristics of Bangla, and we enrich our speech corpus with both domain-specific and domain-independent speech data. We experiment with both traditional and end-to-end speech recognition architectures, and we propose a novel approach for context-specific optimization of voice commands. We also explore several other approaches for improving ASR performance, such as synthetic speech corpus development and semi-supervised speech recognition.

Files in this item

Name: Full Thesis.pdf

Size: 6.318Mb

Format: PDF

This item appears in the following Collection(s)

Dissertations/Theses - Department of Computer Science and Engineering
Postgraduate dissertations (theses) of the Department of Computer Science and Engineering (CSE)
