J Syst Evol ›› 2015, Vol. 53 ›› Issue (5): 391-402. DOI: 10.1111/jse.12167

• Research Articles •

Using phylogenomics to resolve mega-families: An example from Compositae

Jennifer R. Mandel1,2†*, Rebecca B. Dikow3†, and Vicki A. Funk4   

    1Department of Biological Sciences, University of Memphis, Memphis, USA
    2W. Harry Feinstone Center for Genomic Research, University of Memphis, Memphis, USA
    3Smithsonian Institute for Biodiversity Genomics, Center for Conservation and Evolutionary Genetics, National Zoological Park and Division of Mammals, National Museum of Natural History, Smithsonian Institution, Washington DC, USA
    4Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington DC, USA
    †These authors contributed equally to this work.
  • Received: 2015-03-17 Published: 2015-09-22

Abstract: Next-generation sequencing and phylogenomics hold great promise for elucidating complex relationships among large plant families. Here, we performed targeted capture of low copy sequences followed by next-generation sequencing on the Illumina platform in the large and diverse angiosperm family Compositae (Asteraceae). The family is monophyletic, based on morphology and molecular data, yet many areas of the phylogeny have unresolved polytomies, and interpreting phylogenetic patterns has historically been difficult. To outline a method and provide a framework for future phylogenetic studies in the Compositae, we sequenced 23 taxa from across the family in which the relationships were well established, as well as a member of the sister family Calyceraceae. We generated nuclear data from 795 loci and assembled chloroplast genomes from off-target capture reads, enabling the comparison of nuclear and chloroplast genomes for phylogenetic analyses. We also analyzed multi-copy nuclear genes in our data set using a clustering method during orthology detection, and we applied a network approach to these clusters, analyzing all related locus copies. Using these data, we produced hypotheses of phylogenetic relationships employing both a conservative approach (restricted to loci with only one copy per targeted locus) and a multigene approach (including all copies per targeted locus). The methods and bioinformatics workflow presented here provide a solid foundation for future work aimed at understanding gene family evolution in the Compositae as well as providing a model for phylogenomic analyses in other plant mega-families.

Key words: chloroplast, gene-tree, network, next-generation sequencing, nuclear, phylogenetics