DSpace Repository

Leveraging frequency domain attributes for multi-modal medical image segmentation using convolutional neural networks


dc.contributor.advisor Hasan Al Banna, Dr. Taufiq
dc.contributor.author Shams Nafisa Ali
dc.date.accessioned 2026-01-05T04:27:36Z
dc.date.available 2026-01-05T04:27:36Z
dc.date.issued 2025-05-28
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/7232
dc.description.abstract Recent advances in deep learning have significantly enhanced medical image segmentation. As medical data becomes increasingly diverse and complex, the need for architectures that can generalize across modalities and anatomical structures has become paramount. While CNNs, Transformers, and their hybrid architectures have addressed issues such as limited receptive fields and redundant feature representations, most models remain confined to the spatial domain, overlooking the frequency domain's rich structural cues. Some recent studies have explored spectral information at the feature level; however, frequency-domain integration at the supervision level remains largely untapped. To this end, we propose Phi-SegNet, a CNN-based architecture that incorporates phase-aware cues at both the architectural and optimization levels. The network integrates Bi-Feature Mask Former (BFMF) modules that blend neighboring encoder features to reduce semantic gaps, and Reverse Fourier Attention (RFA) blocks that refine decoder outputs using phase-regularized embeddings. A dedicated phase-aware loss aligns these embeddings with structural priors, forming a closed feedback loop that emphasizes boundary precision. Evaluated on five public datasets spanning ultrasound, X-ray, histopathology, MRI, and colonoscopy, Phi-SegNet consistently achieves state-of-the-art performance, particularly excelling in fine-grained boundary segmentation tasks. On average, across these five datasets, Phi-SegNet achieves a relative improvement of 1.54% ± 1.26% in IoU and 1.10% ± 0.69% in F1-score over the next best-performing model for each dataset. Additionally, under generalized training using a unified dataset comprising all five modalities, as well as in cross-dataset generalization scenarios involving unseen datasets from the known domain, Phi-SegNet exhibits robust and superior performance, highlighting its adaptability and modality-agnostic design. These findings demonstrate the potential of leveraging spectral priors in both learning and supervision, offering a new direction toward generalized, universal, and anatomically precise segmentation frameworks. en_US
dc.language.iso en en_US
dc.publisher Department of Biomedical Engineering, BUET en_US
dc.subject Diagnostic imaging en_US
dc.title Leveraging frequency domain attributes for multi-modal medical image segmentation using convolutional neural networks en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0422182001 en_US
dc.identifier.accessionNumber 120727
dc.contributor.callno 616.0754/SHA/2025 en_US
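
The abstract's supervision-level contribution is a dedicated phase-aware loss that aligns embeddings with structural priors. As a rough illustration of how a Fourier-phase alignment term can be written, below is a minimal PyTorch sketch that penalizes disagreement between the phase spectrum of a predicted mask and that of the ground truth. The function name, the sigmoid squashing, and the 1 - cos formulation are illustrative assumptions, not the loss actually defined in the thesis.

    import torch

    def phase_aware_loss(pred_logits, target_mask):
        """Illustrative sketch of a Fourier-phase alignment loss
        (an assumption, not Phi-SegNet's exact formulation)."""
        # 2D FFT over the spatial dimensions of (B, C, H, W) tensors.
        pred_spec = torch.fft.fft2(torch.sigmoid(pred_logits))
        target_spec = torch.fft.fft2(target_mask)

        # Phase angles of the complex spectra; phase carries most of an
        # image's structural and boundary information.
        pred_phase = torch.angle(pred_spec)
        target_phase = torch.angle(target_spec)

        # Wrap-aware penalty: 1 - cos(delta) treats -pi and +pi as the
        # same angle and is smooth everywhere.
        return (1.0 - torch.cos(pred_phase - target_phase)).mean()

In practice such a term would be weighted and added to a standard region loss (e.g. Dice or cross-entropy), which matches the abstract's description of phase cues acting at the optimization level alongside the architectural BFMF and RFA modules.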


Files in this item

There are no files associated with this item.
