Revisiting succinylated lysine residue prediction with carefully selected physicochemical and biochemical properties of amino acids

Shehab, Sarar Ahmed

URI: http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6582

Date: 2022-05-16

Abstract:

Succinylation of lysine residues is a special type of post-translational modification (PTM). It plays a crucial role in balancing cellular processes. Abnormal succinylation can cause cancers, metabolic diseases, inflammation and nervous system diseases. Detecting succinylation sites is therefore of great importance for exploring the function of proteins. However, the experimental methods for detecting succinylation sites are costly, time-consuming and labor-intensive. This calls for computational models with high efficacy. Attention has been given in the literature to developing such models, albeit with only moderate success across different evaluation metrics. In particular, the existing works failed to balance the two metrics, sensitivity and specificity, leaving considerable room for improvement. One important aspect in this context is the biochemical and physicochemical properties of amino acids, which appear to be useful as features for such computational predictors. However, some of the existing computational models did not use the biochemical and physicochemical properties of amino acids, while others used them without considering the inter-dependency among the properties. In this thesis, we revisit the computational prediction of succinylated lysine residues (SLRs) and use a broad spectrum of weaponry to tackle this problem. We first focus on the biochemical and physicochemical properties of amino acids and formulate an optimization problem to find a combination that is more suitable for the problem at hand, considering their inter-dependencies and other factors. In particular, we propose a variant of the genetic algorithm, called IBCGA, to search for suitable combinations thereof for efficient prediction of SLRs. In this context, we leverage the power of Random Forest (RF) and Balanced RF (a variant of RF that handles imbalanced data).
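As a brief illustration of why sensitivity and specificity have to be balanced on imbalanced data, the following minimal sketch computes both metrics from binary labels. It is not taken from the thesis: the function name sensitivity_specificity and the toy labels are assumptions made for this example, with label 1 standing for a succinylated site.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Hypothetical toy data: 2 positive sites among 10 candidates.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10  # a predictor that never reports a site

print(sensitivity_specificity(y_true, always_negative))  # (0.0, 1.0)
```

The always-negative predictor is correct on 8 of 10 candidates, yet it finds no succinylated site at all, which is the kind of imbalance between the two metrics that the abstract refers to.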
We then propose three deep learning architectures, CNN+Bi-LSTM (CBL), Bi-LSTM+CNN (BLC) and their combination (CBL BLC), thereby leveraging the potential of deep neural network architectures for SLR prediction. We also employ different ensembling techniques to improve upon the performance of our models, including heterogeneous ensembling of traditional ML models with deep learning architectures. Finally, we apply differential evolution to tune the threshold of the ensemble classifiers, thereby providing biologists and practitioners with a knob to balance sensitivity and specificity. The combinations of biochemical and physicochemical properties derived through our optimization process achieve better results than those achieved by the combination of all the properties. In this context, one of the best performing combinations consists of only two properties. As for our deep learning architectures,
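The threshold tuning mentioned in the abstract can be sketched roughly as follows. This is a minimal, assumption-laden illustration and not the thesis implementation: the objective (the absolute gap between sensitivity and specificity), the function names balance_gap and tune_threshold, the DE settings and the toy scores are all hypothetical choices made for this example.

```python
import random


def balance_gap(threshold, scores, labels):
    """Assumed objective: |sensitivity - specificity| at a given threshold."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if y == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return abs(sens - spec)


def tune_threshold(scores, labels, pop_size=20, generations=50, f=0.8, seed=0):
    """Minimal differential evolution (rand/1 mutation) over one threshold."""
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(pop_size)]
    cost = [balance_gap(t, scores, labels) for t in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            # Mutate, then clip the trial threshold back into [0, 1].
            trial = min(1.0, max(0.0, pop[a] + f * (pop[b] - pop[c])))
            trial_cost = balance_gap(trial, scores, labels)
            if trial_cost <= cost[i]:  # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


# Hypothetical ensemble scores: 4 positive and 8 negative candidate sites.
scores = [0.9, 0.6, 0.35, 0.2, 0.7, 0.45, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

best_threshold, best_gap = tune_threshold(scores, labels)
print(best_threshold, best_gap, balance_gap(0.5, scores, labels))
```

With a single decision variable, crossover has nothing to mix, so the sketch keeps only mutation and selection. A practitioner who prefers higher sensitivity or higher specificity could replace the assumed objective with a weighted one.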

Files in this item

Name: Full Thesis.pdf

Size: 791.2Kb

Format: PDF

This item appears in the following Collection(s)

Dissertations/Theses - Department of Computer Science and Engineering
Postgraduate dissertations (theses) of Computer Science and Engineering (CSE)
