Table of Contents

22 September 2015, Volume 53 Issue 5
Cover illustration: The hierarchical model for multigene sequence data. The model consists of three components: alignments, gene trees, and the species tree. This hierarchical model involves two layers: the sequences-and-genetree layer and the genetree-and-speciestree layer. The model assumes that the gene trees are generated from a coalescent process occurring along the lineages of the species tree, while the sequences are generated from a mutation process occurring on the branches of the gene trees. See Liu et al.
  • Jun Wen, Jianquan Liu, Song Ge, Qiu-Yun (Jenny) Xiang, Elizabeth A. Zimmer
    J Syst Evol. 2015, 53(5): 369-370.
  • Reviews
  • Elizabeth A. Zimmer, Jun Wen
    J Syst Evol. 2015, 53(5): 371-379.
    Single and low copy nuclear genes offer a larger number of, and more rapidly evolving, characters than the chloroplast and nuclear ribosomal gene sequences that have dominated plant phylogenetic studies to date. Until recently, only one or a few low copy nuclear gene markers were included in such studies. Now, the rapid adoption of “next generation sequencing” (NGS) techniques offers simpler and cheaper access to hundreds, not just tens, of coding and noncoding DNA regions. In this review, we describe the most commonly used NGS methods available for accessing nuclear genes and discuss many NGS case studies that have been published in the last two to three years. These approaches include whole genome sequencing to target microsatellites, transcriptome sequencing, Exon-Primed Intron-Crossing sequencing (EPIC), targeted enrichment (or sequence capture), RAD sequencing (RAD-Seq, including genotyping-by-sequencing or GBS), and genome skimming. We also discuss some of the challenges to, and posed by, the NGS approaches.
  • Liang Liu, Shaoyuan Wu, Lili Yu
    J Syst Evol. 2015, 53(5): 380-390.
    Genome-scale sequence data have become increasingly available in phylogenetic studies for understanding the evolutionary histories of species. However, it is challenging to develop probabilistic models that account for the heterogeneity of phylogenomic data. The multispecies coalescent model describes gene trees as independent random variables generated from a coalescence process occurring along the lineages of the species tree. Since the multispecies coalescent model allows gene trees to vary across genes, coalescent-based methods have been widely used to account for heterogeneous gene trees in phylogenomic data analysis. In this paper, we summarize and evaluate the performance of coalescent-based methods for estimating species trees from genome-scale sequence data. We investigate the effects of deep coalescence and mutation on the performance of species tree estimation methods. We found that the coalescent-based methods perform well in estimating species trees for a large number of genes, regardless of the degree of deep coalescence and mutation. The performance of the coalescent methods is negatively correlated with the lengths of internal branches of the species tree.
  • Research Articles
  • Jennifer R. Mandel, Rebecca B. Dikow, Vicki A. Funk
    J Syst Evol. 2015, 53(5): 391-402.
    Next-generation sequencing and phylogenomics hold great promise for elucidating complex relationships among large plant families. Here, we performed targeted capture of low copy sequences followed by next-generation sequencing on the Illumina platform in the large and diverse angiosperm family Compositae (Asteraceae). The family is monophyletic, based on morphology and molecular data, yet many areas of the phylogeny have unresolved polytomies, and interpreting phylogenetic patterns has been historically difficult. In order to outline a method and provide a framework for future phylogenetic studies in the Compositae, we sequenced 23 taxa from across the family in which the relationships were well established, as well as a member of the sister family Calyceraceae. We generated nuclear data from 795 loci and assembled chloroplast genomes from off-target capture reads, enabling the comparison of nuclear and chloroplast genomes for phylogenetic analyses. We also analyzed multi-copy nuclear genes in our data set using a clustering method during orthology detection, and we applied a network approach to these clusters, analyzing all related locus copies. Using these data, we produced hypotheses of phylogenetic relationships employing both a conservative approach (restricted to only loci with one copy per targeted locus) and a multigene approach (including all copies per targeted locus). The methods and bioinformatics workflow presented here provide a solid foundation for future work aimed at understanding gene family evolution in the Compositae as well as providing a model for phylogenomic analyses in other plant mega-families.
  • Hui Ma, Jing Lu, Bing-Bing Liu, Bing-Bing Duan, Xiao-Dong He, Jian-Quan Liu
    J Syst Evol. 2015, 53(5): 403-410.
    With the advance of next-generation sequencing techniques, phylogenetic analyses based on genomic data are becoming frequent in plants. However, the efficacy of different methods used to retrieve orthologs for phylogenetic studies based on transcriptome sequences is rarely compared and examined for closely related species with low genetic differentiation. In this study, we used both low-copy gene-based approaches and mapping-based approaches to construct a phylogeny of Betulaceae. We found that the two approaches showed no distinct differences in phylogeny construction and associated bootstrap support, although mapping-based methods seemed to be superior to the former in all analyses, particularly when the summed length of orthologs was not long enough. Phylogenetic relationships within the family, largely consistent with those previously based on chloroplast DNA and nuclear internal transcribed spacer sequence variations, received high support for all clades and subclades. However, we found that support values for the sister relationship between two polyploid genera were lower than those between diploid groups if only a few orthologs were sampled. This may result from ortholog misidentification due to genome duplications.
  • Morgan R. Gostel, Kiera A. Coy, Andrea Weeks
    J Syst Evol. 2015, 53(5): 411-431.
    Developing effective and cost-efficient multilocus nuclear datasets for angiosperm species is a continuing challenge to the systematics community. Here we describe the development and validation of a novel set of 91 nuclear markers for PCR-based target enrichment. Using microfluidic PCR and Illumina MiSeq, we generated nuclear, subgenomic libraries for 96 species simultaneously and sequenced them for a total cost of ca. $6000 USD. Approximately half of these costs include reusable reagents (primers, barcodes, and custom sequencing primers) and taxon sampling could be increased by an order of magnitude to maximize sequencing depth efficiency. The principal benefit of microfluidic PCR over alternative target enrichment strategies is that it bypasses costly library preparation. After sequencing, we evaluated the ability of the loci to resolve species level relationships within two recently radiated lineages of endemic Madagascan Commiphora Jacq. (Burseraceae) species. Our results demonstrate that (i) effective nuclear markers can be designed for non-model angiosperm taxa from these publicly available datasets; (ii) microfluidic PCR amplification followed by high throughput sequencing can produce highly complete taxon by locus sequence data matrices with minimal resource investment; and (iii) these numerous nuclear phylogenomic markers can improve our understanding of phylogenetic relationships within Commiphora. We provide a synopsis of ongoing activities to enhance this microfluidic PCR-based target enrichment strategy through broader primer assays, multiplexing, and increased efficiency of sequencing depth.
  • Zhe-Chen Qi, Yi Yu, Xiang Liu, Andrew Pais, Thomas Ranney, Ross Whetten, Qiu-Yun (Jenny) Xiang
    J Syst Evol. 2015, 53(5): 432-447.
    Fothergilla (Hamamelidaceae) consists of Fothergilla gardenii (4x) from the coastal plains of the southeastern USA, F. major (6x) from the piedmont and mountains of the same region, and a few allopatric diploid populations of unknown taxonomic status. The objective of this study was to explore the relationships of the polyploid species with the diploid plants. Genotyping by sequencing (GBS) was applied to generate genome-wide molecular markers for phylogenetic and genetic structure analyses of 36 accessions of Fothergilla. Sanger sequencing of three plastid regions and one nuclear region provided data for comparison with GBS-based results. Phylogenetic outcomes were compared using data from different sequencing runs and different software workflows. The different data sets showed substantial differences in inferred phylogenies, but all supported a genetically distinct 6x F. major and two lineages of the diploid populations closely associated with the 4x F. gardenii. We hypothesize that the 4x F. gardenii originated through hybridization between the Gulf coastal 2x and an extinct (or undiscovered) 2x lineage, followed by backcrosses to the Atlantic coastal 2x before chromosome doubling, and that the 6x F. major also originated from the “extinct” 2x lineage. Alternative scenarios are possible but are not as well supported. The origins and divergence of the polyploid species likely occurred during the Pleistocene cycles of glaciation, although fossil evidence indicates the genus might have existed for a much longer time with a wider past distribution. Our study demonstrates the power of combining GBS data with Sanger sequencing in reconstructing the evolutionary network of polyploid lineages.
  • Jin-Mei Lu, Ning Zhang, Xin-Yu Du, Jun Wen, De-Zhu Li
    J Syst Evol. 2015, 53(5): 448-457.
    Studies on chloroplast genomes of ferns and lycophytes are relatively few in comparison with those on seed plants. Although a basic phylogenetic framework of extant ferns is available, relationships among a few key nodes remain unresolved or poorly supported. The primary objective of this study is to explore the phylogenetic utility of large chloroplast gene data in resolving difficult deep nodes in ferns. We sequenced the chloroplast genomes from Cyrtomium devexiscapulae (Koidz.) Ching (eupolypod I) and Woodwardia unigemmata (Makino) Nakai (eupolypod II), and constructed the phylogeny of ferns based on both 48 genes and 64 genes. The trees based on 48 genes and 64 genes are identical in topology, differing only in support values for four nodes, three of which showed higher support values for the 48-gene dataset. Equisetum L. was resolved as the sister to the Psilotales–Ophioglossales clade, and the Equisetales–Psilotales–Ophioglossales clade was sister to the clade of the leptosporangiate and marattioid ferns. The sister relationship between the tree fern clade and polypods was supported by 82% and 100% bootstrap values in the 64-gene and 48-gene trees, respectively. Within polypod ferns, Pteridaceae was sister to the clade of Dennstaedtiaceae and eupolypods with a high support value, and the relationship of Dennstaedtiaceae–eupolypods was strongly supported. With recent parallel advances in the phylogenetics of ferns using nuclear data, chloroplast phylogenomics shows great potential in providing a framework for testing the impact of reticulate evolution in the early evolution of ferns.
  • Erika N. Schwarz, Tracey A. Ruhlman, Jamal S. M. Sabir, Nahid H. Hajrah, Njud S. Alharbi, Abdulrahman L. Al-Malki, C. Donovan Bailey, Robert K. Jansen
    J Syst Evol. 2015, 53(5): 458-468.
    To date, publicly available plastid genomes of legumes have for the most part been limited to the subfamily Papilionoideae. Here we report 13 new plastid genomes of legumes spanning all three subfamilies. The genomes representing Caesalpinioideae and Mimosoideae are highly conserved in gene content and gene order, similar to the ancestral angiosperm genome organization. Genomes within the Papilionoideae, however, have reduced sizes due to deletions in nine intergenic spacers, primarily in the large single copy region. Our study also indicates that rps16 has been independently lost at least five times in legumes, with additional gene and intron losses scattered among the papilionoids. Additionally, genera from two distinct lineages within the papilionoids, Lupinus and Robinia, have a parallel inversion of 36 and 39 kb, respectively. This parallel inversion is novel as it appears to be caused by a 29 bp repeat within two trnS genes. This repeat is present in all available legume plastid genomes, indicating that there is the potential for this inversion to be present in more species. This case of a homoplasious inversion is also evidence that some inversion events may not be reliable phylogenetic markers.
  • Ning Zhang, Jun Wen, Elizabeth A. Zimmer
    J Syst Evol. 2015, 53(5): 469-476.
    The leaf-opposed tendril, a characteristic organ in Vitaceae (grape family), is likely a morphological key innovation for the family. It has been considered homologous to the inflorescence. Expression of floral related genes has been studied extensively in the model species, grapevine (Vitis vinifera), to uncover molecular mechanisms that determine the development of a common uncommitted primordium (or an anlage) into an inflorescence or a tendril. However, to investigate the homology of tendrils and inflorescences in Vitaceae, evidence only from the highly derived grapevine is insufficient. Therefore, gene sequences of four key floral meristem genes, i.e., FUL, AP1, FT, and LEAFY orthologs, were obtained from transcriptome data of 14 Vitaceae species, the grapevine genome, and the outgroup Leea guineensis. Additionally, expression patterns of these four genes were studied in leaves, tendrils, and inflorescences of five phylogenetically distinct Vitaceae species. Expression of the AP1 ortholog was only detected in the tendril and the inflorescence but not in the leaf for all species, indicating that the tendril is more like the inflorescence than the leaf and that the tendrils from these six species including grapevine are likely homologous. Meanwhile, expression of the LEAFY ortholog was found in the inflorescence but not in the tendril and leaf, suggesting that the LEAFY ortholog expression might play a role in determining whether an anlage develops into a tendril or an inflorescence. Based on combined evidence from the expression patterns of these four genes, possible mechanisms underlying the evolution of tendrils are discussed.