DSpace Repository

Prediction of cervical cancer in Bangladesh using hybrid machine learning algorithms


dc.contributor.advisor Mondal, Dr. Md. Rubaiyat Hossain
dc.contributor.author Khanam, Fahima
dc.date.accessioned 2022-06-28T04:19:58Z
dc.date.available 2022-06-28T04:19:58Z
dc.date.issued 2021-10-10
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6030
dc.description.abstract The aim of this research is to apply machine learning algorithms to the prediction of cervical cancer. Early screening of vulnerable patients is essential to prevent cervical cancer. However, in many developing countries, medical facilities for such screening are scarce. Hence, research is needed in the field of data-driven diagnosis of cervical cancer. In this thesis, a dataset of cervical cancer patients is considered, which includes attributes suitable for Bangladeshi patients. Another objective is to classify the patients in the dataset using a new, efficient hybrid algorithm. First, an existing dataset from the University of California, Irvine (UCI) Machine Learning Repository is considered, which consists of 36 attributes and 858 instances. To overcome the imbalance of the data samples, the borderline Synthetic Minority Over-sampling Technique (SMOTE) is used. Next, a new dataset of cervical cancer patients collected from various hospitals in Bangladesh is introduced. This new dataset consists of 21 attributes and 228 instances. The Recursive Feature Elimination method is applied to both datasets to find the most important attributes contributing to cervical cancer. A number of classifiers, including base, ensemble, and hybrid algorithms, are applied to the datasets. Next, a two-stage hybrid algorithm is proposed in which ExtraTreeClassifier is used in the first stage and a stacking algorithm is used in the second stage. Results show that stacking as a combination of Random Forest, ExtraTreeClassifier, XGBoost, and Bagging achieves the best classification accuracy of 95.3% for the first dataset. For the second dataset, AdaBoost shows the best classification accuracy of 95.6%. The proposed hybrid method offers classification accuracy of 95.9% and 96.2% for the first and second datasets, respectively. Hence, the Bangladeshi dataset and the proposed hybrid algorithm can play an essential role in predicting cervical cancer. en_US
dc.language.iso en en_US
dc.publisher Institute of Information and Communication Technology en_US
dc.subject Diagnostic imaging-Digital techniques-Breast cancer en_US
dc.title Prediction of cervical cancer in Bangladesh using hybrid machine learning algorithms en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0417312045 en_US
dc.identifier.accessionNumber 118614
dc.contributor.callno 616.0754/FAH/2021 en_US
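
The abstract above mentions borderline SMOTE as the remedy for class imbalance. The record does not say which implementation the thesis used, so the following is only a minimal, standard-library sketch of the general idea (oversample only those minority points that sit near the class boundary), not the author's code. The function names, the parameters `m`, `k`, `n_new`, and the toy data are all hypothetical and chosen for illustration; in practice a library such as imbalanced-learn would more likely be used.

```python
import math
import random


def k_nearest(index, X, candidates, k):
    """Indices of the k points among `candidates` closest to X[index] (Euclidean)."""
    others = [j for j in candidates if j != index]
    return sorted(others, key=lambda j: math.dist(X[index], X[j]))[:k]


def borderline_smote(X, y, minority=1, m=5, k=5, n_new=1, seed=0):
    """Return synthetic minority samples generated near the class boundary.

    Hypothetical helper: a simplified take on borderline SMOTE, not the
    thesis implementation.
    """
    rng = random.Random(seed)
    all_idx = range(len(X))
    min_idx = [i for i in all_idx if y[i] == minority]

    # Step 1: a minority point is "in danger" when at least half, but not
    # all, of its m nearest neighbours (from the whole dataset) are majority.
    danger = []
    for i in min_idx:
        neighbours = k_nearest(i, X, all_idx, m)
        n_major = sum(1 for j in neighbours if y[j] != minority)
        if neighbours and len(neighbours) / 2 <= n_major < len(neighbours):
            danger.append(i)

    # Step 2: for each danger point, interpolate towards a randomly chosen
    # neighbour drawn from its k nearest minority points.
    synthetic = []
    for i in danger:
        neighbours = k_nearest(i, X, min_idx, k)
        if not neighbours:
            continue  # a lone minority point has nothing to interpolate with
        for _ in range(n_new):
            j = rng.choice(neighbours)
            gap = rng.random()
            synthetic.append(
                tuple(a + gap * (b - a) for a, b in zip(X[i], X[j]))
            )
    return synthetic


# Toy data (made up for illustration): two minority points close to the
# majority cluster and three minority points far away from it.
X_demo = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (3.0, 0.0),  # majority
    (2.0, 0.0), (2.0, 0.5),                                      # minority, borderline
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),                    # minority, safe
]
y_demo = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

new_points = borderline_smote(X_demo, y_demo, minority=1, m=3, k=1, n_new=2)
print(new_points)
```

With these toy settings only the two borderline minority points are oversampled, so every synthetic point lies on the segment between them; the three well-separated minority points are left alone. That selectivity is what distinguishes the borderline variant from plain SMOTE, which interpolates from every minority point.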

