Multi-label classification of human protein subcellular locations from microscopy images using convolutional neural networks

Mitra, Avijit

URI: http://lib.buet.ac.bd:8080/xmlui/handle/123456789/5737

Date: 2021-01-16

Abstract:

Proteins are the ‘doers’ of all living organisms. The subcellular localization of human proteins plays an important role in inferring their structures and functions in our cells. Owing to recent advances in molecular imaging techniques, analyzing image data to determine protein subcellular locations is more important than ever. At the same time, image data is becoming a widely used alternative to conventional 1D protein amino acid sequence data. Automated classification of human protein subcellular localization can accelerate biomedical research tasks and the diagnosis of diseases, reducing time and manual effort. Although using deep convolutional neural networks (DCNNs) to classify images is a straightforward approach, our task comes with multiple challenges. First, there are 28 distinct labels, and a single image can carry more than one of them. Second, there is a strong class imbalance in the dataset, with some labels appearing in less than 0.3% of the data. Lastly, the classification must be performed across a wide range of different human cells. We aim to overcome these challenges through several approaches. In this work, our principal goal is to present an end-to-end system for classifying mixed-pattern protein subcellular localization from confocal microscopy images using convolutional neural networks. We report the outcomes of several experimental setups on a highly imbalanced dataset and investigate their effectiveness. We also demonstrate that oversampling outperforms cost-sensitive learning in handling the data imbalance problem. In addition, we show that an ensemble of models always benefits our task. Using these observations, we achieved a public macro F1 score of 0.574 and a private macro F1 score of 0.515 on the dataset of the Kaggle competition Human Protein Atlas Image Classification.
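
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a macro F1 score can be computed for multi-label predictions: F1 is calculated separately for each label and the per-label scores are averaged with equal weight, so rare labels count as much as frequent ones. The function name `macro_f1`, the toy data with 3 labels instead of 28, and the choice to score a label with no true or predicted positives as 0 are assumptions made for illustration; this is not the thesis's code or the competition's official scorer.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-label F1 over binary indicator rows.

    Each row is one image; each column is one label (1 = present).
    """
    n_labels = len(y_true[0])
    scores: List[float] = []
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Assumption: a label with no true and no predicted positives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Hypothetical toy data: 4 images, 3 labels; the third label is rare and missed.
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [1, 0, 0]]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy case the per-label F1 values are 1, 2/3 and 0, so missing the single rare label pulls the macro average down sharply, which is why class imbalance matters so much under this metric.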

Files in this item

Name: Full Thesis.pdf

Size: 2.533Mb

Format: PDF

This item appears in the following Collection(s)

Dissertations/Theses - Department of Electrical and Electronic Engineering
Post graduate dissertations (Theses) of Electrical and Electronic Engineering (EEE)
