Abstract:
Sound event detection (SED) in medical environments is crucial for extracting valuable information from diverse sound events such as coughing, sneezing, sniffling, speech, gasping, and snoring. These events carry vital information for diagnosis, monitoring, and prevention. By utilizing sound events, healthcare professionals can make informed decisions and provide optimal care. Due to the success of Transformer encoder architectures for sound event detection, they seem to be a prudent choice for detecting audio events in hospital settings. However, applying Transformers to medical audio event detection faces two significant challenges. Firstly, there is a severe scarcity of medical audio data, making it difficult to train Transformer models effectively. Secondly, SED models must be computationally efficient to be deployable in resource-limited medical environments. Unfortunately, Transformers have high computational complexity due to the attention mechanism they employ. To tackle these obstacles, this thesis introduces the Audio Spectrogram Fourier Network (ASFNet), a novel attention-free Transformer encoder specifically designed for sound event detection in medical environments. ASFNet replaces the attention operation with a simplified Fast Fourier Transform. By employing this technique, ASFNet surpasses other methods, achieving an average mean average precision (mAP) of 0.474 with a 16.76% relative improvement. ASFNet achieves this performance with fewer model parameters and a smaller model size, making it a highly efficient and effective solution for detecting medical audio events.
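To make the attention-free design concrete, the following is a minimal sketch of an FNet-style encoder block in which the self-attention sublayer is replaced by a Fourier transform over the token and feature axes. The layer names, dimensions, and dropout value are illustrative assumptions and do not reproduce the exact ASFNet architecture described in the thesis.

```python
# Sketch of an attention-free Transformer encoder block (FNet-style).
# Token mixing is done with a 2D FFT instead of learned attention weights.
import torch
import torch.nn as nn


class FourierMixingBlock(nn.Module):
    """Transformer encoder block with the attention sublayer replaced by a 2D FFT."""

    def __init__(self, dim: int, hidden_dim: int, dropout: float = 0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # FFT over the sequence and feature axes; keep only the real part.
        mixed = torch.fft.fft2(x).real
        x = self.norm1(x + mixed)
        return self.norm2(x + self.ffn(x))


# Example: a batch of 4 spectrogram-patch sequences, 100 tokens of width 256 (assumed sizes).
block = FourierMixingBlock(dim=256, hidden_dim=1024)
out = block(torch.randn(4, 100, 256))  # -> torch.Size([4, 100, 256])
```

Because the FFT has no learned parameters, such a block keeps only the feed-forward and normalization weights, which is consistent with the smaller model size reported above.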
Furthermore, speech privacy is a critical consideration in medical audio event detection. It is important to separate speech data from audio recordings to protect the privacy of patients when collecting the dataset. While audio source separation techniques can separate the speech signals of different speakers, we need to differentiate speech from other medical audio events of the same speaker. Therefore, a custom dataset was prepared and a Wave-U-Net model was trained to separate speech data from medical audio events during data acquisition. Wave-U-Net achieves an overall source-to-distortion ratio (SDR) of 11.829 dB, indicating near-perfect source separation.
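For reference, the SDR metric quoted above can be computed, in its simplest form, as the energy ratio between the reference source and the residual distortion. The snippet below is a simplified sketch of that definition and omits the projection steps of the full BSS-Eval formulation, which the thesis may use; the signal length and noise level are arbitrary.

```python
import numpy as np


def simple_sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Source-to-distortion ratio in dB between a reference signal and its estimate.

    Simplified definition: 10 * log10(||reference||^2 / ||estimate - reference||^2).
    """
    distortion = estimate - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / (np.sum(distortion ** 2) + 1e-12))


# Example with a synthetic reference and a slightly noisy estimate.
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)              # one second of audio at 16 kHz (assumed)
est = ref + 0.1 * rng.standard_normal(16000)  # estimate with 10% residual noise
print(f"SDR: {simple_sdr(ref, est):.2f} dB")  # roughly 20 dB
```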
Therefore, the combination of ASFNet and Wave-U-Net has the potential to play a significant role in developing speech-privacy-conscious and resource-efficient medical sound event detection or monitoring systems.