Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method

dc.contributor.author: Xia, Xuhua
dc.date.accessioned: 2020-09-03T13:39:22Z
dc.date.available: 2020-09-03T13:39:22Z
dc.date.issued: 2018
dc.description.abstract: I analyzed various site pattern combinations in a 4-OTU case to identify sources of starless bias and parameter-estimation bias in likelihood-based phylogenetic methods, and reported three significant contributions. First, the likelihood method is counterintuitive in that it may not generate a star tree with sequences that are equidistant from each other. This behaviour, dubbed starless bias, happens in a 4-OTU tree when there is an excess (i.e., more than expected from a star tree and a substitution model) of conflicting phylogenetic signals supporting the three resolved topologies equally. Special site pattern combinations leading to rejection of a star tree, when sequences are equidistant from each other, were identified. Second, fitting a gamma distribution to model rate heterogeneity over sites is strongly confounded with tree topology, especially in conjunction with the starless bias. I present examples to show dramatic differences in the estimated shape parameter α between a star tree and a resolved tree. There may be no rate heterogeneity over sites (with the estimated α > 10000) when a star tree is imposed, but α < 1 (suggesting strong rate heterogeneity over sites) when an (incorrect) resolved tree is imposed. Thus, the dependence of "rate heterogeneity" on tree topology implies that "rate heterogeneity" is not a sequence-specific feature, cautioning against interpreting a small α to mean that some sites are under strong purifying selection and others are not. Third, because there is no existing (and working) likelihood method for evaluating a star tree with a continuous gamma-distributed rate, I have implemented the method for JC69 in a self-contained R script for a four-OTU tree (star or resolved), in addition to another R script assuming a constant rate over sites. These R scripts should be useful for teaching and exploring likelihood methods in phylogenetics. (en_US)
dc.description.sponsorship: NSERC (en_US)
dc.identifier.doi: 10.3934/genet.2018.4.212 (en_US)
dc.identifier.issn: 2377-1143 (en_US)
dc.identifier.uri: http://hdl.handle.net/10393/40921
dc.identifier.uri: https://doi.org/10.20381/ruor-25147
dc.language.iso: en (en_US)
dc.subject: maximum likelihood (en_US)
dc.subject: molecular phylogenetics (en_US)
dc.subject: rate heterogeneity (en_US)
dc.subject: star-tree paradox (en_US)
dc.subject: starless (en_US)
dc.title: Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method (en_US)
dc.type: Article (en_US)
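
The abstract compares the likelihood of a 4-OTU star tree with that of a resolved tree under JC69. As a rough illustration only, the following Python sketch computes the site-pattern likelihood for both tree shapes under a constant rate over sites. It is not the author's R script: the function names, the branch lengths, and the site-pattern counts are all hypothetical, the gamma-distributed rate is omitted, and no branch-length optimization is attempted.

```python
import itertools
import math

BASES = "ACGT"

def jc69_prob(a, b, t):
    """JC69 probability that base a becomes base b along a branch of
    length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def site_lik_star(pattern, branches):
    """Likelihood of one 4-OTU site pattern on a star tree.
    Sums over the unknown state x at the single central node."""
    total = 0.0
    for x in BASES:
        term = 0.25  # JC69 equilibrium frequency of x
        for s, b in zip(pattern, branches):
            term *= jc69_prob(x, s, b)
        total += term
    return total

def site_lik_resolved(pattern, branches, internal):
    """Likelihood of one site pattern on the resolved tree ((1,2),(3,4)).
    Sums over states x and y at the two internal nodes, which are
    joined by a branch of length `internal`."""
    s1, s2, s3, s4 = pattern
    b1, b2, b3, b4 = branches
    total = 0.0
    for x in BASES:
        left = 0.25 * jc69_prob(x, s1, b1) * jc69_prob(x, s2, b2)
        for y in BASES:
            total += (left * jc69_prob(x, y, internal)
                      * jc69_prob(y, s3, b3) * jc69_prob(y, s4, b4))
    return total

def log_lik(counts, site_lik, *args):
    """Log-likelihood of a data set given as {pattern: count}."""
    return sum(n * math.log(site_lik(p, *args)) for p, n in counts.items())

# Hypothetical branch lengths and site-pattern counts, for illustration only.
branches = (0.1, 0.1, 0.1, 0.1)
counts = {"AAAA": 50, "AACC": 5, "ACAC": 5, "ACCA": 5}

# Sanity check: the 256 site-pattern probabilities should sum to 1.
all_patterns = ["".join(p) for p in itertools.product(BASES, repeat=4)]
print(sum(site_lik_star(p, branches) for p in all_patterns))

print(log_lik(counts, site_lik_star, branches))
print(log_lik(counts, site_lik_resolved, branches, 0.02))
```

With the internal branch set to zero, the resolved-tree likelihood reduces to the star-tree likelihood, so the star tree is the boundary case of each resolved topology in this parameterization.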

Files

Original bundle

Name: AIMS2019_genetics.pdf
Size: 206.55 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.92 KB
Format: Item-specific license agreed upon to submission