DSpace Repository

Analysis and interpretability of machine learning models to classify thyroid disease

dc.contributor.advisor Hossen Asiful Mustafa, Dr.
dc.contributor.author Sumya Akter
dc.date.accessioned 2024-12-17T03:35:42Z
dc.date.available 2024-12-17T03:35:42Z
dc.date.issued 2023-10-30
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6915
dc.description.abstract Thyroid disease classification plays a crucial role in the early diagnosis and effective treatment of thyroid disorders. Machine learning (ML) techniques have demonstrated remarkable potential in this domain, offering accurate and efficient diagnostic tools. Most real-life datasets are imbalanced, which hampers the overall performance of classifiers. Existing data balancing techniques operate on the whole dataset at once, which sometimes causes overfitting or underfitting. In addition, the complexity of some machine learning models, often referred to as "black boxes," raises concerns about their interpretability and clinical applicability. This thesis presents a comprehensive study focused on the analysis and interpretability of various machine learning models for classifying thyroid diseases. The work is divided into four stages: i) A data balancing technique that clusters the dataset into several segments using K-means clustering and then performs both oversampling and undersampling on each cluster. The optimal number of clusters is found with the elbow method. ii) A range of ML algorithms, including AdaBoost, Decision Tree (DT), Extreme Gradient Boosting (XGB), Extra Tree Classifier (ETC), K-Nearest Neighbour (KNN), Logistic Regression (LR), Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM), is applied to the thyroid dataset within our proposed pipeline, which improves performance in diagnosing thyroid disease. Model performance is evaluated using standard metrics such as accuracy, precision, recall, F1-score, AUC score, and the ROC curve, highlighting the efficacy of each algorithm in thyroid disease classification. iii) To address the interpretability challenge, the thesis explores techniques for model explanation and feature importance analysis using eXplainable Artificial Intelligence (XAI) tools, both globally and locally. iv) A survey of domain experts supports the XAI explanations to an extent, although some mismatches remain: most features receive the same rank and weight in the XAI tools' results as in the domain experts' opinions. Such insights are essential for building trust in machine learning models among medical practitioners, enabling them to make more informed decisions based on a model's predictions. Experimental results show that our proposed mechanism is efficient in diagnosing thyroid disease and can explain the models effectively. The findings contribute to bridging the gap between the adoption of advanced machine learning techniques and the clinical requirements of transparency and accountability in diagnostic decision-making. en_US
dc.language.iso en en_US
dc.publisher Institute of Information and Communication Technology, BUET en_US
dc.subject Machine learning en_US
dc.title Analysis and interpretability of machine learning models to classify thyroid disease en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0417312039 en_US
dc.identifier.accessionNumber 119675
dc.contributor.callno 006.31/SUM/2023 en_US
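
Stage i) of the abstract (cluster the data with K-means, resample inside each cluster, and use the elbow method to choose the number of clusters) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not say how each cluster is resampled, so the per-cluster target size (the mean class size within the cluster), the use of plain random over- and undersampling, and the function names `kmeans`, `inertia_curve`, and `balance_within_clusters` are all assumptions made for illustration. K-means is written in pure Python to keep the example self-contained.

```python
import random
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns (cluster assignments, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        assign = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    assign = [nearest(p, centers) for p in points]
    inertia = sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))
    return assign, inertia


def inertia_curve(points, ks, seed=0):
    """Inertia for each candidate k; the 'elbow' is read off this curve."""
    return {k: kmeans(points, k, seed=seed)[1] for k in ks}


def balance_within_clusters(points, labels, assign, seed=0):
    """Resample every class inside each cluster to a common size.

    Assumption (not from the thesis): the common size is the mean class
    size in that cluster. Larger classes are randomly undersampled and
    smaller classes are randomly oversampled with replacement.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(lambda: defaultdict(list))
    for p, y, c in zip(points, labels, assign):
        by_cluster[c][y].append(p)

    out_points, out_labels = [], []
    for c in sorted(by_cluster):
        groups = by_cluster[c]
        total = sum(len(g) for g in groups.values())
        target = max(1, round(total / len(groups)))
        for y, group in groups.items():
            if len(group) >= target:
                chosen = rng.sample(group, target)  # undersample
            else:
                extra = rng.choices(group, k=target - len(group))
                chosen = group + extra  # oversample with replacement
            out_points.extend(chosen)
            out_labels.extend([y] * target)
    return out_points, out_labels
```

A typical use would be to compute `inertia_curve(points, range(1, 10))`, pick the value of k where the curve stops dropping sharply, run `kmeans` with that k, and pass the resulting assignments to `balance_within_clusters` before training a classifier. In practice, library routines such as scikit-learn's `KMeans` would likely replace the hand-written clustering here.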

