Missing data and the accuracy of Bayesian phylogenetics

doi:10.3724/SP.J.1002.2008.08040

Abstract

The effect of missing data on phylogenetic methods is a potentially important issue in our attempts to reconstruct the Tree of Life. If missing data are truly problematic, then it may be unwise to include species in an analysis that lack data for some characters (incomplete taxa) or to include characters that lack data for some species. Given the difficulty of obtaining data from all characters for all taxa (e.g., fossils), missing data might seriously impede efforts to reconstruct a comprehensive phylogeny that includes all species. Fortunately, recent simulations and empirical analyses suggest that missing data cells are not themselves problematic, and that incomplete taxa can be accurately placed as long as the overall number of characters in the analysis is large. However, these studies have so far only been conducted on parsimony, likelihood, and neighbor-joining methods. Although Bayesian phylogenetic methods have become widely used in recent years, the effects of missing data on Bayesian analysis have not been adequately studied. Here, we conduct simulations to test whether Bayesian analyses can accurately place incomplete taxa despite extensive missing data. In agreement with previous studies of other methods, we find that Bayesian analyses can accurately reconstruct the position of highly incomplete taxa (i.e., 95% missing data), as long as the overall number of characters in the analysis is large. These results suggest that highly incomplete taxa can be safely included in many Bayesian phylogenetic analyses.

Key words: accuracy, Bayesian analysis, missing data, phylogenetic analysis

John J. WIENS*; Daniel S. MOEN. Missing data and the accuracy of Bayesian phylogenetics[J]. J Syst Evol, 2008, 46(3): 307-314.

URL: https://www.jse.ac.cn/EN/10.3724/SP.J.1002.2008.08040


