Analysis and interpretability of machine learning models to classify thyroid disease

dc.contributor.advisor	Hossen Asiful Mustafa, Dr.
dc.contributor.author	Sumya Akter
dc.date.accessioned	2024-12-17T03:35:42Z
dc.date.available	2024-12-17T03:35:42Z
dc.date.issued	2023-10-30
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6915
dc.description.abstract	Thyroid disease classification plays a crucial role in the early diagnosis and effective treatment of thyroid disorders. Machine learning (ML) techniques have demonstrated remarkable potential in this domain, offering accurate and efficient diagnostic tools. Most real-life datasets are imbalanced, which hampers the overall performance of classifiers. Existing data balancing techniques operate on the whole dataset at once, which sometimes causes overfitting or underfitting. Moreover, the complexity of some machine learning models, often referred to as "black boxes," raises concerns about their interpretability and clinical applicability. This thesis presents a comprehensive study focused on the analysis and interpretability of various machine learning models for classifying thyroid diseases. The work is divided into four stages: i) A data balancing technique that clusters the dataset into several segments using K-means clustering and then performs both oversampling and undersampling on each cluster. The optimal number of clusters is found using the elbow method. ii) A range of ML algorithms, including AdaBoost, Decision Tree (DT), Extreme Gradient Boosting (XGB), Extra Tree Classifier (ETC), K-Nearest Neighbour (KNN), Logistic Regression (LR), Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM), are applied to a thyroid dataset within the proposed pipeline, which improves performance in diagnosing thyroid disease. Model performance is evaluated using standard metrics such as accuracy, precision, recall, F1-score, AUC score, and the ROC curve, highlighting the efficacy of each algorithm in thyroid disease classification. iii) To address the interpretability challenge, the thesis explores techniques for model explanation and feature importance analysis using eXplainable Artificial Intelligence (XAI) tools, both globally and locally. iv) A survey was conducted among domain experts, which supports the XAI explanations to an extent, although some mismatches remain. Most features receive the same ranking and weight in the XAI tools' results as in the domain experts' opinions. Such insights are essential for building trust in machine learning models among medical practitioners, enabling them to make more informed decisions based on a model's predictions. Experimental results show that the proposed mechanism is efficient in diagnosing thyroid disease and can explain the models effectively. The findings contribute to bridging the gap between the adoption of advanced machine learning techniques and the clinical requirements of transparency and accountability in diagnostic decision-making.	en_US
dc.language.iso	en	en_US
dc.publisher	Institute of Information and Communication Technology, BUET	en_US
dc.subject	Machine learning	en_US
dc.title	Analysis and interpretability of machine learning models to classify thyroid disease	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	0417312039	en_US
dc.identifier.accessionNumber	119675
dc.contributor.callno	006.31/SUM/2023	en_US
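
Stage i) of the abstract (K-means clustering, per-cluster oversampling and undersampling, elbow method) can be illustrated with a minimal sketch. This is not the thesis's implementation, whose details are not given in this record. It assumes plain random resampling and a per-cluster target equal to the mean class count in that cluster. The names `kmeans`, `inertia_curve`, and `balance_within_clusters` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (cluster assignments, inertia)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))
    return assign, inertia


def inertia_curve(points, ks, seed=0):
    """Inertia for each candidate k; the 'elbow' is read off this curve."""
    return [kmeans(points, k, seed=seed)[1] for k in ks]


def balance_within_clusters(points, labels, k, seed=0):
    """Resample each cluster separately (assumed rule, not from the thesis):
    classes above the cluster's mean class count are randomly undersampled,
    classes below it are randomly oversampled with replacement."""
    rng = random.Random(seed)
    assign, _ = kmeans(points, k, seed=seed)
    out_x, out_y = [], []
    for c in range(k):
        by_class = {}
        for p, y, a in zip(points, labels, assign):
            if a == c:
                by_class.setdefault(y, []).append(p)
        if not by_class:
            continue  # empty cluster, nothing to resample
        target = round(sum(len(v) for v in by_class.values()) / len(by_class))
        for y, members in by_class.items():
            if len(members) > target:
                chosen = rng.sample(members, target)  # undersample
            else:
                extra = rng.choices(members, k=target - len(members))
                chosen = members + extra  # oversample
            out_x.extend(chosen)
            out_y.extend([y] * len(chosen))
    return out_x, out_y
```

In practice a library K-means and a synthetic oversampler would likely replace the hand-written pieces; the sketch only shows that resampling happens inside each cluster rather than over the whole dataset at once.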

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following Collection(s)

Dissertations/Theses - Institute of Information and Communication Technology
Post graduate dissertations (Theses) of Institute of Information and Communication Technology (IICT)