J Syst Evol, 2021, Vol. 59, Issue (5): 1124-1138. DOI: 10.1111/jse.12806

• Research Articles •

Capturing single-copy nuclear genes, organellar genomes, and nuclear ribosomal DNA from deep genome skimming data for plant phylogenetics: A case study in Vitaceae

Bin-Bin Liu1,2,3, Zhi-Yao Ma3, Chen Ren4,5, Richard G. J. Hodel3, Miao Sun6, Xiu-Qun Liu7, Guang-Ning Liu8, De-Yuan Hong1, Elizabeth A. Zimmer3, and Jun Wen3*   

    1 State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
    2 State Key Laboratory of Vegetation and Environmental Change (LVEC), Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
    3 Department of Botany, National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington, DC 20013‐7012, USA
    4 Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China
    5 Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China
    6 Department of Biology—Ecoinformatics and Biodiversity, Aarhus University, 8000 Aarhus C, Denmark
    7 Key Laboratory of Horticultural Plant Biology (Ministry of Education), College of Horticulture and Forestry Science, Huazhong Agricultural University, Wuhan 430070, China
    8 College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China
  • Received: 2021-02-25 Accepted: 2021-07-19 Online: 2021-07-21 Published: 2021-09-01

Abstract: With decreasing costs and the growing availability of newly developed bioinformatics pipelines, next-generation sequencing (NGS) has revolutionized plant systematics in recent years. Genome skimming has been widely used to obtain high-copy fractions of the genome, including plastomes, mitochondrial DNA (mtDNA), and nuclear ribosomal DNA (nrDNA). In this study, we used simulations to evaluate the optimal (minimum) sequencing depth for recovering single-copy nuclear genes (SCNs) from genome skimming data, and the performance of that recovery, by subsampling genome resequencing data in silico to generate 10 data sets with different sequencing coverage. We tested the performance of four data sets (plastome, nrDNA, mtDNA, and SCNs) obtained from genome skimming in phylogenetic analyses of the Vitis clade at the genus level and of Vitaceae at the family level. Our results showed that the optimal minimum sequencing depth for high-quality SCN assembly via genome skimming was about 10× coverage. We show that deep genome skimming (DGS), which requires no bait synthesis or enrichment experiments and benefits from very low sequencing costs, is as effective for capturing large data sets of SCNs as the widely used Hyb-Seq approach, in addition to capturing plastomes, mtDNA, and entire nrDNA repeats. DGS may serve as an efficient and economical alternative to the popular target enrichment/Hyb-Seq approach, and may even be superior to it.

Key words: deep genome skimming, Hyb‐Seq, mitochondrial genes, nuclear ribosomal DNA, single‐copy nuclear genes, Vitaceae
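
The in silico subsampling described in the abstract rests on simple coverage arithmetic: coverage is roughly (number of reads × read length) / genome size. The Python sketch below is illustrative only and is not the authors' pipeline. The function names, the 500 Mb genome size, the 150 bp paired-end read length, and the 40× starting depth are all hypothetical placeholders, since the abstract does not state these values or the subsampling tool used.

```python
import math


def reads_for_coverage(target_coverage, genome_size_bp, read_length_bp, paired=True):
    """Return the number of read pairs (or single reads) needed for a target depth.

    Uses the standard approximation: coverage = sequenced bases / genome size.
    """
    if target_coverage <= 0 or genome_size_bp <= 0 or read_length_bp <= 0:
        raise ValueError("coverage, genome size, and read length must be positive")
    bases_needed = target_coverage * genome_size_bp
    # A read pair contributes two reads' worth of bases.
    bases_per_unit = read_length_bp * (2 if paired else 1)
    return math.ceil(bases_needed / bases_per_unit)


def subsample_fraction(target_coverage, full_coverage):
    """Return the fraction of an existing data set to keep to reach a lower depth."""
    if full_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverages must be positive")
    if target_coverage > full_coverage:
        raise ValueError("cannot subsample to a depth above the original")
    return target_coverage / full_coverage


# Hypothetical inputs (assumptions, not values from the study).
GENOME_SIZE_BP = 500_000_000   # placeholder genome size
READ_LENGTH_BP = 150           # placeholder paired-end read length
FULL_COVERAGE = 40             # placeholder depth of the resequencing data

pairs_at_10x = reads_for_coverage(10, GENOME_SIZE_BP, READ_LENGTH_BP)
fraction_for_10x = subsample_fraction(10, FULL_COVERAGE)
print(pairs_at_10x, fraction_for_10x)
```

A fraction computed this way could then be passed to any read-subsampling tool with a fixed random seed, so that each reduced-coverage data set is reproducible.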