DSpace Repository

Leveraging frequency domain attributes for multi-modal medical image segmentation using convolutional neural networks


dc.contributor.advisor Hasan Al Banna, Dr. Taufiq
dc.contributor.author Shams Nafisa Ali
dc.date.accessioned 2026-01-05T04:27:36Z
dc.date.available 2026-01-05T04:27:36Z
dc.date.issued 2025-05-28
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/7232
dc.description.abstract Recent advances in deep learning have significantly enhanced medical image segmentation. As medical data becomes increasingly diverse and complex, the need for architectures that can generalize across modalities and anatomical structures has become paramount. While CNNs, Transformers, and their hybrid architectures have addressed issues such as limited receptive fields and redundant feature representations, most models remain confined to the spatial domain, overlooking the frequency domain's rich structural cues. Some recent studies have explored spectral information at the feature level; however, frequency-domain integration at the supervision level remains largely untapped. To this end, we propose Phi-SegNet, a CNN-based architecture that incorporates phase-aware cues at both the architectural and optimization levels. The network integrates Bi-Feature Mask Former (BFMF) modules that blend neighboring encoder features to reduce semantic gaps, and Reverse Fourier Attention (RFA) blocks that refine decoder outputs using phase-regularized embeddings. A dedicated phase-aware loss aligns these embeddings with structural priors, forming a closed feedback loop that emphasizes boundary precision. Evaluated on five public datasets spanning ultrasound, X-ray, histopathology, MRI, and colonoscopy, Phi-SegNet consistently achieves state-of-the-art performance, particularly excelling in fine-grained boundary segmentation tasks. On average, across these five datasets, Phi-SegNet achieves a relative improvement of 1.54% ± 1.26% in IoU and 1.10% ± 0.69% in F1-score over the next best-performing model for each dataset. Additionally, under generalized training using a unified dataset comprising all five modalities, as well as in cross-dataset generalization scenarios involving unseen datasets from the known domain, Phi-SegNet exhibits robust and superior performance, highlighting its adaptability and modality-agnostic design. These findings demonstrate the potential of leveraging spectral priors in both learning and supervision, offering a new direction toward generalized, universal, and anatomically precise segmentation frameworks. en_US
dc.language.iso en en_US
dc.publisher Department of Biomedical Engineering, BUET en_US
dc.subject Diagnostic imaging en_US
dc.title Leveraging frequency domain attributes for multi-modal medical image segmentation using convolutional neural networks en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0422182001 en_US
dc.identifier.accessionNumber 120727
dc.contributor.callno 616.0754/SHA/2025 en_US
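
The abstract's supervision-level contribution is a dedicated phase-aware loss that aligns embeddings with structural priors. As a rough illustration of how a Fourier-phase alignment term can be written, below is a minimal PyTorch sketch that penalizes disagreement between the phase spectrum of a predicted mask and that of the ground truth. The function name, the sigmoid squashing, and the 1 - cos formulation are illustrative assumptions, not the loss actually defined in the thesis.

    import torch

    def phase_aware_loss(pred_logits, target_mask):
        """Illustrative sketch of a Fourier-phase alignment loss
        (an assumption, not Phi-SegNet's exact formulation)."""
        # 2D FFT over the spatial dimensions of (B, C, H, W) tensors.
        pred_spec = torch.fft.fft2(torch.sigmoid(pred_logits))
        target_spec = torch.fft.fft2(target_mask)

        # Phase angles of the complex spectra; phase carries most of an
        # image's structural and boundary information.
        pred_phase = torch.angle(pred_spec)
        target_phase = torch.angle(target_spec)

        # Wrap-aware penalty: 1 - cos(delta) treats -pi and +pi as the
        # same angle and is smooth everywhere.
        return (1.0 - torch.cos(pred_phase - target_phase)).mean()

In practice such a term would be weighted and added to a standard region loss (e.g. Dice or cross-entropy), which matches the abstract's description of phase cues acting at the optimization level alongside the architectural BFMF and RFA modules.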


Files in this item

There are no files associated with this item.
