Risk prediction of loan default using knowledge graph

Alam, Md. Nurul

URI: http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6025

Date: 2022-03-09

Abstract:

Loan default risk, also known as credit risk, is a significant financial challenge for banks and financial institutions because it involves uncertainty about borrowers' ability to meet their contractual obligations. Banks and financial institutions rely on statistical and machine learning methods for loan default prediction to reduce the potential losses on issued loans. These machine learning applications may never achieve their full potential without the semantic context of the data. A knowledge graph is a collection of linked entities and objects that includes semantic information to contextualize them. Knowledge graphs allow machines to incorporate human expertise into their decision-making and provide context for machine learning applications. A knowledge graph can semantically integrate varied data and link knowledge from many areas without altering its original form, enabling organizations to leverage the power of collective intelligence. Furthermore, knowledge graph embedding is now a widely adopted technique for representing knowledge. The embedding preserves the original graph's semantic information and structure, so it can be a useful source of features for a subsequent machine learning classification task. A knowledge graph-based approach can therefore improve a prediction model's performance and interpretability. In this thesis, we present a hybrid approach combining a knowledge graph and machine learning to enhance the performance and rationality of the loan default prediction model. For this purpose, we developed an ontology for the semantic data model. Then, we mapped a publicly available credit dataset onto this semantic data model to construct the knowledge graph. Next, we used knowledge graph embedding methods to capture the knowledge graph's semantic and structural content. Finally, we fed the vectors extracted from the graph embedding as features to a machine learning classifier to forecast loan default.
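The pipeline described above (records mapped to graph triples, triples embedded, embeddings used as classifier features) can be illustrated with a minimal sketch. The abstract does not name the embedding method, so TransE is used here purely as a stand-in. The record fields, relation names, and hyperparameters are hypothetical, and a real implementation would normally also normalize entity vectors and filter out corrupted triples that happen to be true.

```python
import random

# Hypothetical toy loan records; field names and values are illustrative only.
records = [
    {"id": "loan_1", "purpose": "car", "housing": "own", "amount": 5000.0},
    {"id": "loan_2", "purpose": "education", "housing": "rent", "amount": 1200.0},
    {"id": "loan_3", "purpose": "car", "housing": "rent", "amount": 7400.0},
]


def to_triples(rows):
    """Map each categorical field to a (head, relation, tail) triple."""
    triples = []
    for row in rows:
        triples.append((row["id"], "hasPurpose", "purpose:" + row["purpose"]))
        triples.append((row["id"], "hasHousing", "housing:" + row["housing"]))
    return triples


def distance(h, r, t):
    """TransE score: squared L2 distance between h + r and t (lower is better)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def train_transe(triples, dim=4, epochs=200, lr=0.01, margin=1.0, seed=0):
    """Train TransE with a margin ranking loss and tail corruption (assumed setup)."""
    rng = random.Random(seed)
    entities = sorted({x for h, _, t in triples for x in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    ent = {e: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for e in entities}
    rel = {r: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for r in relations}
    for _ in range(epochs):
        for h, r, t in triples:
            t_neg = rng.choice(entities)  # corrupt the tail to get a negative
            if t_neg == t:
                continue
            pos = distance(ent[h], rel[r], ent[t])
            neg = distance(ent[h], rel[r], ent[t_neg])
            if pos + margin - neg <= 0:
                continue  # margin already satisfied, no update needed
            for i in range(dim):
                g_pos = 2 * (ent[h][i] + rel[r][i] - ent[t][i])
                g_neg = 2 * (ent[h][i] + rel[r][i] - ent[t_neg][i])
                # Gradient step on loss = pos + margin - neg
                ent[h][i] -= lr * (g_pos - g_neg)
                rel[r][i] -= lr * (g_pos - g_neg)
                ent[t][i] += lr * g_pos
                ent[t_neg][i] -= lr * g_neg
    return ent, rel


triples = to_triples(records)
ent, rel = train_transe(triples)

# Feature vector per loan: learned embedding plus the raw numeric field.
features = {row["id"]: ent[row["id"]] + [row["amount"]] for row in records}
```

Each entry of `features` could then be passed to any tabular classifier alongside, or in place of, the original columns.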
The experimental results demonstrate that incorporating knowledge graph embeddings as features can boost the performance of conventional machine learning classifiers in predicting loan default risk. We evaluated several machine learning classifiers that exhibited strong performance in the credit default prediction task, using accuracy, precision, recall, F1 score, Matthews correlation coefficient (MCC), and ROC AUC as evaluation metrics. The “XGBoost + KGE” model performed best on all evaluation measures, with a ROC AUC of 0.836 (an increase of around 10.14% over the conventional technique).
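For readers unfamiliar with the evaluation metrics listed above, the following sketch computes them from their standard definitions. The labels and scores are made-up toy values, not results from the thesis, and the 0.5 decision threshold is an assumption.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and MCC from binary labels (1 = default)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 defaults, 5 non-defaults, with illustrative predicted scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

MCC and ROC AUC are often preferred on imbalanced credit data because accuracy alone can look high when the classifier simply predicts the majority class.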

Files in this item

Name: Full Thesis.pdf

Size: 1.040Mb

Format: PDF

This item appears in the following Collection(s)

Dissertations/Theses - Department of Computer Science and Engineering
Post graduate dissertations (Theses) of Computer Science Engineering (CSE)
