DSpace Repository

Estimating species trees from multi-locus data in the presence of incomplete lineage sorting by maximizing quartet consistency and minimizing deep coalescence

dc.contributor.advisor Bayzid, Dr. Md. Shamsuzzoha
dc.contributor.author Farah, Ishrat Tanzila
dc.date.accessioned 2022-09-27T04:22:38Z
dc.date.available 2022-09-27T04:22:38Z
dc.date.issued 2022-03-01
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/6189
dc.description.abstract Species tree estimation from multi-locus datasets is extremely challenging, especially when gene trees differ across the genome because of incomplete lineage sorting (ILS). Summary methods first estimate gene trees and then combine them into a species tree by optimizing a chosen criterion. In this study, we extend and adapt the concept of phylogenetic terraces to species tree estimation by “summarizing” a set of gene trees, where multiple species trees with distinct topologies may have exactly the same optimality score (e.g., quartet score or extra lineage score). In particular, we investigate the presence and impact of equally optimal trees when species trees are estimated from multi-locus data using ILS-aware summary methods. We analyze two of the most popular ILS-aware optimization criteria: maximizing quartet consistency (MQC) and minimizing deep coalescence (MDC). Methods based on MQC are provably statistically consistent, whereas MDC is not a statistically consistent criterion for species tree estimation. We present a comprehensive comparative study of these two optimality criteria. Our experiments on a collection of datasets simulated under ILS indicate that MDC may yield a quartet consistency score competitive with or identical to that of MQC, yet can be significantly worse than MQC in terms of tree accuracy, which demonstrates the presence and impact of equally optimal species trees. This is the first known study that provides the conditions under which datasets have equally optimal trees in the context of phylogenomic inference using summary methods. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering en_US
dc.subject Phylogenetic-Data processing en_US
dc.title Estimating species trees from multi-locus data in the presence of incomplete lineage sorting by maximizing quartet consistency and minimizing deep coalescence en_US
dc.type Thesis-MSc en_US
dc.contributor.id 1017052063 en_US
dc.identifier.accessionNumber 118650
dc.contributor.callno 576.880285/ISH/2022 en_US
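
The abstract's central observation, that species trees with distinct topologies can share the same quartet score, can be illustrated with a minimal sketch. This is not the thesis's implementation: the nested-tuple tree representation, the function names (`clades`, `quartets`, `quartet_score`), and the toy gene trees are all assumptions made for illustration. Rooted trees are read as unrooted when extracting quartet topologies.

```python
from itertools import combinations

def clades(tree):
    """Return (leaf set of tree, leaf sets of every internal clade)."""
    if isinstance(tree, str):          # a leaf is just a taxon name
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found

def quartets(tree):
    """Map each resolved 4-taxon subset to its induced split, e.g. ab|cd."""
    leaves, all_clades = clades(tree)
    induced = {}
    for four in combinations(sorted(leaves), 4):
        taxa = frozenset(four)
        for clade in all_clades:
            inside = taxa & clade
            if len(inside) == 2:       # the clade separates two taxa from the other two
                induced[taxa] = frozenset([inside, taxa - inside])
                break
    return induced

def quartet_score(species_tree, gene_trees):
    """Count gene-tree quartets whose topology agrees with the species tree."""
    species_quartets = quartets(species_tree)
    return sum(
        1
        for gene_tree in gene_trees
        for taxa, split in quartets(gene_tree).items()
        if species_quartets.get(taxa) == split
    )

# Hypothetical toy input: two gene trees that disagree on the only quartet.
gene_trees = [(("a", "b"), ("c", "d")), (("a", "c"), ("b", "d"))]
species_a = (("a", "b"), ("c", "d"))
species_b = (("a", "c"), ("b", "d"))
species_c = (("a", "d"), ("b", "c"))

for candidate in (species_a, species_b, species_c):
    print(candidate, quartet_score(candidate, gene_trees))
```

In this toy input, `species_a` and `species_b` have different topologies but agree with the same number of gene-tree quartets, so the quartet score alone cannot choose between them; `species_c` agrees with none.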