DSpace Repository

Revisiting succinylated lysine residue prediction with carefully selected physicochemical and biochemical properties of amino acids

dc.contributor.advisor Rahman, Dr. M. Sohel
dc.contributor.author Shehab, Sarar Ahmed
dc.date.accessioned 2024-01-22T05:43:33Z
dc.date.available 2024-01-22T05:43:33Z
dc.date.issued 2022-05-16
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6582
dc.description.abstract Succinylation of lysine residue is a special type of post-translational modification (PTM). It has a crucial role in balancing the processes of cells. Abnormal succinylation can be the cause of cancers, metabolism diseases, inflammation and nervous system diseases. Detecting succinylation sites is of great importance to explore the function of proteins. However, the experimental methods to detect succinylation sites are costly, time-consuming and labor-intensive. This calls for computational models with high efficacy, and attention has been given in the literature to developing such models, albeit with only moderate success in the context of different evaluation metrics. In particular, the existing works failed to balance the two metrics, sensitivity and specificity, leaving a large room for improvements in this context. One important aspect in this context is the biochemical and physicochemical properties of amino acids, which appear to be useful as features for such computational predictors. However, some of the existing computational models did not use the biochemical and physicochemical properties of amino acids, while some others used them without considering the inter-dependency among the properties. In this thesis, we revisit the computational prediction of succinylated lysine residue (SLR) and use a broad spectrum of weaponry to tackle this problem. We first focus on the biochemical and physicochemical properties of amino acids and formulate an optimization problem to find a combination that is more suitable for the problem at hand considering their inter-dependencies and other factors. In particular, we propose a variant of genetic algorithm, called IBCGA, to search for suitable combinations thereof for efficient prediction of SLRs. In this context, we leverage the power of Random Forest (RF) and Balanced RF (a variant of RF to handle imbalanced data).
We then propose three deep learning architectures, CNN+Bi-LSTM (CBL), Bi-LSTM+CNN (BLC) and their combination (CBL BLC), thereby leveraging the potential of deep neural network architectures for SLR prediction. We also employ different ensembling techniques to improve upon the performance of our models, which include heterogeneous ensembling of traditional ML models with deep learning architectures as well. Finally, we apply differential evolution to tune the threshold of ensemble classifiers, thereby providing the biologists and practitioners with a knob to balance the sensitivity and specificity. The combinations of biochemical and physicochemical properties derived through our optimization process achieve better results than the results achieved by the combination of all the properties. In this context, one of the best performing combinations consists of only two properties. As for our deep learning architectures, en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE) en_US
dc.subject Computer simulation en_US
dc.title Revisiting succinylated lysine residue prediction with carefully selected physicochemical and biochemical properties of amino acids en_US
dc.type Thesis-MSc en_US
dc.contributor.id 419052005 en_US
dc.identifier.accessionNumber 119139
dc.contributor.callno 005.369/SHE/2022 en_US
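
The abstract mentions tuning the decision threshold of an ensemble classifier with differential evolution to balance sensitivity and specificity. The sketch below illustrates that general idea only and is not the thesis's implementation. The function names (`sensitivity_specificity`, `tune_threshold`), the objective (maximize the smaller of sensitivity and specificity), the evenly spaced initial population, and the toy scores are all assumptions made for illustration.

```python
import random


def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when predicting 1 for score >= threshold."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if label == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 0:
                tn += 1
            else:
                fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def tune_threshold(scores, labels, pop_size=20, generations=50, f=0.8, seed=0):
    """Search [0, 1] for a threshold with a simple 1-D differential evolution.

    Assumed objective: maximize min(sensitivity, specificity).
    Requires pop_size >= 4 so that three distinct partners exist.
    """
    rng = random.Random(seed)

    def cost(threshold):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        return -min(sens, spec)  # lower cost means better balance

    # Assumption: start from evenly spaced thresholds instead of random ones.
    population = [(i + 0.5) / pop_size for i in range(pop_size)]
    costs = [cost(t) for t in population]

    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            # rand/1 mutation, clipped to the valid threshold range.
            mutant = population[a] + f * (population[b] - population[c])
            trial = min(1.0, max(0.0, mutant))
            # With a single dimension, binomial crossover always takes the
            # mutant, so the trial vector is simply the mutant.
            trial_cost = cost(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_cost <= costs[i]:
                population[i] = trial
                costs[i] = trial_cost

    best = min(range(pop_size), key=costs.__getitem__)
    return population[best], -costs[best]


if __name__ == "__main__":
    # Toy classifier scores: six negatives followed by four positives.
    toy_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.7, 0.6, 0.75, 0.8, 0.9]
    toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
    threshold, balance = tune_threshold(toy_scores, toy_labels)
    print(f"threshold={threshold:.3f} min(sens, spec)={balance:.3f}")
```

In practice the scores would come from the ensemble's predicted probabilities on a validation set, and a different objective (for example, one that weights sensitivity more heavily) could be substituted in `cost` to move the knob in either direction.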

