Improving phylogenomic analysis using naive binning with quartet-based super tree estimation

dc.contributor.advisor	Bayzid, Dr. Md. Shamsuzzoha
dc.contributor.author	Munmun, Farha Akhter
dc.date.accessioned	2025-04-15T06:07:47Z
dc.date.available	2025-04-15T06:07:47Z
dc.date.issued	2024-09-18
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/7038
dc.description.abstract	Estimating a species tree from biomolecular sequences is extremely difficult, especially when confronted with gene tree heterogeneity resulting from incomplete lineage sorting (ILS). Two of the most popular techniques for estimating species trees are combined analysis (CA), which concatenates multiple sequence alignments of different genes into a single supergene alignment and then estimates a tree from this alignment, and summary methods, which compute gene trees from different loci and then combine the inferred gene trees into a species tree. CA can be highly accurate in many cases because the combined gene alignments offer a high level of phylogenetic signal. However, it is agnostic about gene tree discordance (i.e., different genes having different evolutionary histories), which can lead to statistical inconsistency. Summary methods, on the other hand, can explicitly account for gene tree discordance and its underlying biological causes, and thus can be statistically consistent. But they do not perform well when the number of genes is limited and the gene trees are not well estimated (i.e., gene tree estimation errors are prevalent). In this study, we introduce a hybrid pipeline for species tree estimation that combines the strengths of both combined analysis and summary methods. Specifically, we update the process flow of a widely used quartet-based summary method called SVDquartets by combining it with an existing technique called “binning” and a highly accurate quartet amalgamation technique, wQFM. We assessed the performance of our proposed hybrid model on a collection of simulated and real biological datasets that cover a wide range of challenging model conditions with varying numbers of genes, amounts of gene tree estimation error, and levels of gene tree discordance.
Our extensive evaluation studies on both simulated and real biological datasets suggest that this hybrid model could be a promising approach for estimating species trees, especially in the presence of gene tree estimation error due to short gene sequences.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering (CSE), BUET	en_US
dc.subject	Heuristic programming	en_US
dc.title	Improving phylogenomic analysis using naive binning with quartet-based super tree estimation	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	0419052092	en_US
dc.identifier.accessionNumber	119880
dc.contributor.callno	119880	en_US
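
The "binning" step named in the abstract can be illustrated with a minimal sketch. It assumes that naive binning means randomly partitioning the genes into bins of a fixed size and concatenating the alignments within each bin into one supergene alignment; the abstract itself does not spell this out. The function name naive_binning, the bin_size and seed parameters, and the dict-of-dicts alignment layout are hypothetical choices for illustration and are not taken from the thesis. Running SVDquartets and wQFM on the resulting supergenes is outside the scope of this sketch.

```python
import random


def naive_binning(gene_alignments, bin_size, seed=0):
    """Randomly partition genes into bins and concatenate each bin.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Returns a list of (member_genes, supergene) pairs, where supergene
    maps taxon -> concatenated sequence. This layout is an assumption
    made for the example, not the thesis's data format.
    """
    rng = random.Random(seed)
    genes = sorted(gene_alignments)
    rng.shuffle(genes)  # "naive": bin membership ignores any gene tree signal

    # Consecutive slices of the shuffled list; the last bin may be smaller.
    bins = [genes[i:i + bin_size] for i in range(0, len(genes), bin_size)]

    supergenes = []
    for members in bins:
        # Every taxon seen in any gene of the bin gets a row in the supergene.
        taxa = sorted({t for g in members for t in gene_alignments[g]})
        supergene = {t: "" for t in taxa}
        for g in members:
            alignment = gene_alignments[g]
            length = len(next(iter(alignment.values())))
            for t in taxa:
                # Pad with gap characters when a taxon is missing from a gene.
                supergene[t] += alignment.get(t, "-" * length)
        supergenes.append((members, supergene))
    return supergenes
```

Each returned supergene alignment could then be written out (for example as FASTA or NEXUS) and passed to a downstream quartet-based tool. Larger bins give longer supergenes at the cost of mixing genes that may have different histories.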

This item appears in the following Collection(s)

Dissertations/Theses - Department of Computer Science and Engineering
Post graduate dissertations (Theses) of Computer Science Engineering (CSE)