Abstract:
Thyroid disease classification plays a crucial role in the early diagnosis and effective treatment of thyroid disorders. Machine learning (ML) techniques have demonstrated remarkable potential in this domain, offering accurate and efficient diagnostic tools. However, most real-life datasets are imbalanced, which hampers overall classifier performance, and existing data balancing techniques process the whole dataset at once, which can cause overfitting or underfitting. Moreover, the complexity of some machine learning models, often referred to as "black boxes," raises concerns about their interpretability and clinical applicability. This thesis presents a comprehensive study of the analysis and interpretability of various machine learning models for classifying thyroid disease. The work is divided into four stages: i) A data balancing technique that clusters the dataset into several segments using K-means clustering and performs both oversampling and undersampling on each cluster; the optimal number of clusters is found with the elbow method. ii) A range of ML algorithms, including AdaBoost, Decision Tree (DT), Extreme Gradient Boosting (XGB), Extra Tree Classifier (ETC), K-Nearest Neighbour (KNN), Logistic Regression (LR), Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM), are applied to the thyroid dataset within the proposed pipeline, improving the performance of thyroid disease diagnosis. Model performance is evaluated using standard metrics such as accuracy, precision, recall, F1-score, AUC score, and the ROC curve, highlighting the efficacy of each algorithm in thyroid disease classification. iii) Addressing the interpretability challenge, the thesis explores techniques for model explanation and feature importance analysis using eXplainable Artificial Intelligence (XAI) tools, both globally and locally.
iv) A survey of domain experts was conducted; its results largely support the XAI explanations, although some mismatches remain. Most features, however, are ranked the same, and their weights agree between the XAI tools' results and the domain experts' opinions. Such insights are essential for building trust in machine learning models among medical practitioners, enabling them to make more informed decisions based on a model's predictions. Experimental results show that the proposed mechanism is efficient in diagnosing thyroid disease and can explain the models effectively. The findings contribute to bridging the gap between the adoption of advanced machine learning techniques and the clinical requirements of transparency and accountability in diagnostic decision-making.
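The cluster-based balancing step described in stage (i) can be sketched as below. This is a minimal illustration under stated assumptions: the function names (`elbow_k`, `cluster_balance`) and the choice of the per-cluster median class count as the resampling target are illustrative, not the thesis's exact procedure.

```python
# Sketch of stage (i): K-means segmentation with per-cluster over/undersampling.
import numpy as np
from sklearn.cluster import KMeans

def elbow_k(X, k_max=8, random_state=0):
    """Pick k via a simple elbow heuristic: point of greatest curvature
    (largest second difference) on the inertia-vs-k curve."""
    inertias = [KMeans(n_clusters=k, n_init=10, random_state=random_state)
                .fit(X).inertia_ for k in range(1, k_max + 1)]
    curvature = np.diff(inertias, 2)
    return int(np.argmax(curvature)) + 2  # +2 maps the index back to k

def cluster_balance(X, y, k=None, random_state=0):
    """Cluster the data, then resample each class within each cluster to the
    cluster's median class count (oversampling minorities with replacement,
    undersampling majorities without)."""
    rng = np.random.default_rng(random_state)
    k = k or elbow_k(X, random_state=random_state)
    labels = KMeans(n_clusters=k, n_init=10,
                    random_state=random_state).fit_predict(X)
    X_parts, y_parts = [], []
    for c in range(k):
        idx = np.where(labels == c)[0]
        classes, counts = np.unique(y[idx], return_counts=True)
        target = int(np.median(counts))  # illustrative balancing target
        for cls in classes:
            cls_idx = idx[y[idx] == cls]
            chosen = rng.choice(cls_idx, size=target,
                                replace=len(cls_idx) < target)
            X_parts.append(X[chosen])
            y_parts.append(y[chosen])
    return np.vstack(X_parts), np.concatenate(y_parts)
```

The balanced output can then be fed to any of the listed classifiers (AdaBoost, DT, XGB, etc.) in place of the raw imbalanced data; resampling within clusters rather than over the whole dataset keeps synthetic or duplicated samples local to their feature-space region.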