      Phylogenomics Reveals Three Sources of Adaptive Variation during a Rapid Radiation

          Abstract

          Speciation events often occur in rapid bursts of diversification, but the ecological and genetic factors that promote these radiations are still much debated. Using whole transcriptomes from all 13 species in the ecologically and reproductively diverse wild tomato clade (Solanum sect. Lycopersicon), we infer the species phylogeny and patterns of genetic diversity in this group. Despite widespread phylogenetic discordance due to the sorting of ancestral variation, we date the origin of this radiation to approximately 2.5 million years ago and find evidence for at least three sources of adaptive genetic variation that fuel diversification. First, we detect introgression both historically between early-branching lineages and recently between individual populations, at specific loci whose functions indicate likely adaptive benefits. Second, we find evidence of lineage-specific de novo evolution for many genes, including loci involved in the production of red fruit color. Finally, using a “PhyloGWAS” approach, we detect environment-specific sorting of ancestral variation among populations that come from different species but share common environmental conditions. Estimated across the whole clade, small but substantial and approximately equal fractions of the euchromatic portion of the genome are inferred to contribute to each of these three sources of adaptive genetic variation. These results indicate that multiple genetic sources can promote rapid diversification and speciation in response to new ecological opportunity, in agreement with our emerging phylogenomic understanding of the complexity of both ancient and recent species radiations.

          Abstract

          Wild tomatoes contain immense natural trait diversity; this study describes the evolutionary processes that have generated this diversity over only a few million years, drawing from multiple sources of genetic variation.

          Author Summary

          The formation of new and distinct species during evolution often occurs in rapid bursts of diversification in which many species arise within a short time frame. The ecological and genetic factors that promote these radiations are much debated. Here, we examine genome-wide patterns of molecular evolution that accompanied a rapid adaptive radiation among 13 species of wild tomato—the ecologically and reproductively diverse group that gave rise to the domesticated tomato. By analyzing patterns of genetic variation in thousands of expressed genes from multiple populations and species, we identify genome-wide signatures of rapid consecutive speciation events during 2.5 million years of diversification in this group. These signatures include pervasive shared ancestral variation and frequently discordant signals of relatedness among different parts of the genome. Our analyses find evidence for three unique sources of genetic variation that fuel adaptive diversification in this group—postspeciation hybridization, rapid accumulation of new mutations, and recruitment from ancestral variation—and identify specific examples of putatively adaptive loci drawn from each source. Recent analyses of other rapid radiations have also inferred a role for at least one of these mechanisms; our finding of all three simultaneously at work within the same diversifying clade suggests that they might be a universal feature of rapid adaptation to diverse environmental niches.

          Most cited references (66)

          The tomato genome sequence provides insights into fleshy fruit evolution

          Introductory Paragraph Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera 1 and includes annual and perennial plants from diverse habitats. We present a high quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, S. pimpinellifolium 2 , and compare them to each other and to potato (S. tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show >8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato, small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness. Main Text The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies (Supplementary Section 1). The predicted genome size is ~900 Mb, consistent with prior estimates 3 , of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosomes, with most gaps restricted to pericentromeric regions (Fig. 1A; Supplementary Fig. 1). Base accuracy is approximately one substitution error per 29.4 kb and one indel error per 6.4 kb. The scaffolds were linked with two BAC-based physical maps and anchored/oriented using a high-density genetic map, introgression line mapping and BAC fluorescence in situ hybridisation (FISH). The genome of S. pimpinellifolium (accession LA1589) was sequenced and assembled de novo using Illumina short reads, yielding a 739 Mb draft genome (Supplementary Section 3). 
Estimated divergence between the wild and domesticated genomes is 0.6% (5.4M SNPs distributed along the chromosomes (Fig. 1A, Supplementary Fig. 1)). Tomato chromosomes consist of pericentric heterochromatin and distal euchromatin, with repeats concentrated within and around centromeres, in chromomeres and at telomeres (Fig. 1A, Supplementary Fig. 1). Substantially higher densities of recombination, genes and transcripts are observed in euchromatin, while chloroplast insertions (Supplementary Sections 1.22-1.23) and conserved miRNA genes (Supplementary Section 2.9) are more evenly distributed throughout the genome. The genome is highly syntenic with those of other economically important Solanaceae (Fig. 1B). Compared to the genomes of Arabidopsis 4 and sorghum 5 , tomato has fewer high-copy, full-length LTR retrotransposons with older average insertion ages (2.8 versus 0.8 mya) and fewer high-frequency k-mers (Supplementary Section 2.10). This supports previous findings that the tomato genome is unusual among angiosperms by being largely comprised of low-copy DNA 6,7 . The pipeline used to annotate the tomato and potato 8 genomes is described in Supplementary Section 2. It predicted 34,727 and 35,004 protein-coding genes, respectively. Of these, 30,855 and 32,988, respectively, are supported by RNA-Seq data, and 31,741 and 32,056, respectively, show high similarity to Arabidopsis genes (Supplementary section 2.1). Chromosomal organisation of genes, transcripts, repeats and sRNAs is very similar in the two species (Supplementary Figures 2-4). The protein coding genes of tomato, potato, Arabidopsis, rice and grape were clustered into 23,208 gene groups (≥2 members), of which 8,615 are common to all five genomes, 1,727 are confined to eudicots (tomato, potato, grape and Arabidopsis), and 727 are confined to plants with fleshy fruits (tomato, potato and grape) (Supplementary Section 5.1, Supplementary Fig. 5). 
Relative expression of all tomato genes was determined by replicated strand-specific Illumina RNA-Seq of root, leaf, flower (2 stages) and fruit (6 stages) in addition to leaf and fruit (3 stages) of S. pimpinellifolium (Supplementary Table 1). sRNA sequencing data supported the prediction of 96 conserved miRNA genes in tomato and 120 in potato, a number consistent with other plant species (Fig. 1A, Supplementary Figures 1 and 3, Supplementary Section 2.9). Among the 34 miRNA families identified, 10 are highly conserved in plants and similarly represented in the two species, whereas other, less conserved families are more abundant in potato. Several miRNAs, predicted to target TIR-NBS-LRR genes, appeared to be preferentially or exclusively expressed in potato (Supplementary Section 2.9). Supplementary section 4 deals with comparative genomic studies. Sequence alignment of 71 Mb of euchromatic tomato genomic DNA to their potato 8 counterparts revealed 8.7% nucleotide divergence (Supplementary Section 4.1). Intergenic and repeat-rich heterochromatic sequences showed more than 30% nucleotide divergence, consistent with the high sequence diversity in these regions among potato genotypes 8 . Alignment of tomato-potato orthologous regions confirmed 9 large inversions known from cytological or genetic studies and several smaller ones (Fig. 1C). The exact number of small inversions is difficult to determine due to the lack of orientation of most potato scaffolds. 18,320 clearly orthologous tomato-potato gene pairs were identified. Of these, 138 (0.75%) had significantly higher than average non-synonymous (Ka) versus synonymous (Ks) nucleotide substitution rate ratios (ω), suggesting diversifying selection, whereas 147 (0.80%) had significantly lower than average ω, suggesting purifying selection (Supplementary Table 2). 
The proportions of high and low ω between sorghum and maize (Zea mays) are 0.70% and 1.19%, respectively, after 11.9 Myr of divergence 9 , suggesting that diversifying selection may have been stronger in tomato-potato. The highest densities of low-ω genes are found in collinear blocks with average Ks >1.5, tracing to a genome triplication shared with grape (see below) (Fig. 1C, Supplementary Fig. 6, Supplementary Table 3). These genes, which have been preserved in paleo-duplicated locations for more than 100 Myr 10,11 are more constrained than ‘average’ genes and are enriched for transcription factors and genes otherwise related to gene regulation (Supplementary Tables 3-4). Sequence comparison of 32,955 annotated genes in tomato and S. pimpinellifolium revealed 6,659 identical genes and 3,730 with only synonymous changes. A total of 22,888 genes had non-synonymous changes, including gains and losses of stop codons with potential consequences for gene function (Supplementary Tables 5-7). Several pericentric regions, predicted to contain genes, are absent or polymorphic in the broader S. pimpinellifolium germplasm (Supplementary Table 8, Supplementary Fig. 7). Within cultivated germplasm, particularly among the small-fruited cherry tomatoes, several chromosomal segments are more closely related to S. pimpinellifolium than to ‘Heinz 1706’ (Supplementary Figures 8-9), supporting previous observations on recent admixture of these gene pools due to breeding 12 . ‘Heinz 1706’ itself has been reported to carry introgressions from S. pimpinellifolium 13 , traces of which are detectable on chromosomes 4, 9, 11 and 12 (Supplementary Table 9). Comparison of the tomato and grape genomes supports the hypothesis that a whole-genome triplication affecting the rosid lineage occurred in a common eudicot ancestor 11 (Fig. 2B). 
The distribution of Ks between corresponding gene pairs in duplicated blocks suggests that one polyploidisation in the solanaceous lineage preceded the rosid-asterid (tomato-grape) divergence (Supplementary Fig. 10). Comparison to the grape genome also reveals a more recent triplication in tomato and potato. While few individual tomato/potato genes remain triplicated (Supplementary Tables 10-11), 73% of tomato gene models are in blocks that are orthologous to one grape region, collectively covering 84% of the grape gene space. Among these grape genomic regions, 22.5% have one orthologous region in tomato, 39.9% have two, and 21.6% have three, indicating that a whole genome triplication occurred in the Solanum lineage, followed by widespread gene loss. This triplication, also evident in potato (Supplementary Fig. 11) is estimated at 71 (+/-19.4) mya based on Ks of paralogous genes (Supplementary Fig. 10), and therefore predates the ~7.3 mya tomato-potato divergence. Based on alignments to single grape genome segments, the tomato genome can be partitioned into three non-overlapping ‘subgenomes’ (Fig. 2A). The number of euasterid lineages that have experienced the recent triplication remains unclear and awaits complete euasterid I and II genome sequences. Ks distributions show that euasterids I and II, and indeed the rosid-asterid lineages, all diverged from common ancestry at or near the pan-eudicot triplication (Fig. 2B), suggesting that this event may have contributed to formation of major eudicot lineages in a short period of several million years 14 , partially explaining the explosive radiation of angiosperm plants on earth 15 . Supplementary section 5 reports on the analysis of specific gene families. Fleshy fruits (Supplementary Fig. 12) are an important means of attracting vertebrate frugivores for seed dispersal 16 . 
Combined orthology and synteny analyses suggest that both genome triplications added new gene family members that mediate important fruit-specific functions (Fig. 3). These include transcription factors and enzymes necessary for ethylene biosynthesis (RIN, CNR, ACS) and perception (LeETR3/NR, LeETR4) 17 , red light photoreceptors influencing fruit quality (PHYB1/PHYB2) and ethylene- and light-regulated genes mediating lycopene biosynthesis (PSY1/PSY2). Several cytochrome P450 subfamilies associated with toxic alkaloid biosynthesis show contraction or complete loss in tomato and the extant genes show negligible expression in ripe fruits (Supplementary Section 5.4). Fruit texture has profound agronomic and sensory importance and is controlled in part by cell wall structure and composition 18 . More than 50 genes showing differential expression during fruit development and ripening encode proteins involved in modification of wall architecture (Fig. 4A and Supplementary Section 5.7). For example, a family of xyloglucan endotransglucosylase-/hydrolases (XTHs) has expanded both in the recent whole genome triplication and through tandem duplication. One of the triplicated members, SlXTH10, shows differential loss between tomato and potato (Fig. 4A, Supplementary Table 12), suggesting genetically driven specialisation in the remodelling of fruit cell walls. Similar to soybean and potato and in contrast to Arabidopsis, tomato sRNAs map preferentially to euchromatin (Supplementary Fig. 2). sRNAs from tomato flowers and fruits 19 map to 8,416 gene promoters. Differential expression of sRNAs during fruit development is apparent for 2,687 promoters, including those of cell wall-related genes (Fig. 4B) and occurs preferentially at key developmental transitions (e.g. flower to fruit, fruit growth to fruit ripening, Supplementary Section 2.8). The genome sequences of tomato, S. 
pimpinellifolium and potato provide a starting point for comparing gene family evolution and sub-functionalization in the Solanaceae. A striking example is the SELF PRUNING (SP) gene family, which includes the homolog of Arabidopsis FT, encoding the mobile flowering hormone florigen 20 and its antagonist SP, encoding the ortholog of TFL1. Nearly a century ago, a spontaneous mutation in SP spawned the “determinate” varieties that now dominate the tomato mechanical harvesting industry 21 . The genome sequence has revealed that the SP family has expanded in the Solanum lineage compared to Arabidopsis, driven by the Solanum triplication and tandem duplication (Supplementary Fig. 13). In potato, SP3D and SP6A control flowering and tuberisation, respectively 22 , whereas SP3D in tomato, known as SINGLE FLOWER TRUSS, similarly controls flowering, but also drives heterosis for fruit yield in an epistatic relationship with SP 23,24,25 . Interestingly, SP6A in S. lycopersicum is inactivated by a premature stop codon, but remains functionally intact in S. pimpinellifolium. Thus, allelic variation in a subset of SP family genes has played a major role in the generation of both shared and species-specific variation in Solanaceous agricultural traits. The genome sequences of tomato and S. pimpinellifolium also provide a basis for understanding the bottlenecks that have narrowed tomato genetic diversity: the domestication of S. pimpinellifolium in the Americas, the export of a small number of accessions to Europe in the 16th Century, and the intensive breeding that followed. Charles Rick pioneered the use of trait introgression from wild tomato relatives to increase genetic diversity of cultivated tomatoes 26 . Introgression lines exist for seven wild tomato species, including S. pimpinellifolium, in the background of cultivated tomato. 
The genome sequences presented here and the availability of millions of SNPs will allow breeders to revisit this rich trait reservoir and identify domestication genes, providing biological knowledge and empowering biodiversity-based breeding. Methods Summary A total of 21 Gb of Roche/454 Titanium shotgun and matepair reads and 3.3 Gb of Sanger paired-end reads, including ~200,000 BAC and fosmid end sequence pairs, were generated from the ‘Heinz 1706’ inbred line (Supplementary Sections 1.1-1.7), assembled using both Newbler and CABOG and integrated into a single assembly (Supplementary Sections 1.17-1.18). The scaffolds were anchored using two BAC-based physical maps, one high density genetic map, overgo hybridization and genome-wide BAC FISH (Supplementary Sections 1.8-1.16 and 1.19). Over 99.9% of BAC/fosmid end pairs mapped consistently on the assembly and over 98% of EST sequences could be aligned to the assembly (Supplementary Section 1.20). Chloroplast genome insertions in the nuclear genome were validated using a matepair method and the flanking regions were identified (Supplementary Sections 1.22-1.24). Annotation was carried out using a pipeline based on EuGene that integrates de novo gene prediction, RNA-Seq alignment and rich function annotation (Supplementary Section 2). To facilitate interspecies comparison, the potato genome was re-annotated using the same pipeline. LTR retrotransposons were detected de novo with the LTR-STRUC program and dated by the sequence divergence between left and right solo LTR (Supplementary Section 2.10). The genome of S. pimpinellifolium was sequenced to 40x depth using Illumina paired end reads and assembled using ABySS (Supplementary Section 3). The tomato and potato genomes were aligned using LASTZ (Supplementary Section 4.1). Identification of triplicated regions was done using BLASTP, in-house generated scripts and three way comparisons between tomato, potato and S. 
pimpinellifolium using MCscan (Supplementary Sections 4.2-4.4). Specific gene families/groups (genes for ascorbate, carotenoid and jasmonate biosynthesis, cytochrome P450s, genes controlling cell wall architecture, hormonal and transcriptional regulators, resistance genes) were subjected to expert curation/analysis (Supplementary Section 5). PHYML and MEGA were used to reconstruct phylogenetic trees and MCSCAN was used to infer gene collinearity (Supplementary Section 5.2).
            Widespread parallel evolution in sticklebacks by repeated fixation of Ectodysplasin alleles.

            Major phenotypic changes evolve in parallel in nature by molecular mechanisms that are largely unknown. Here, we use positional cloning methods to identify the major chromosome locus controlling armor plate patterning in wild threespine sticklebacks. Mapping, sequencing, and transgenic studies show that the Ectodysplasin (EDA) signaling pathway plays a key role in evolutionary change in natural populations and that parallel evolution of stickleback low-plated phenotypes at most freshwater locations around the world has occurred by repeated selection of Eda alleles derived from an ancestral low-plated haplotype that first appeared more than two million years ago. Members of this clade of low-plated alleles are present at low frequencies in marine fish, which suggests that standing genetic variation can provide a molecular basis for rapid, parallel evolution of dramatic phenotypic change in nature.
              Testing for ancient admixture between closely related populations.

              One enduring question in evolutionary biology is the extent of archaic admixture in the genomes of present-day populations. In this paper, we present a test for ancient admixture that exploits the asymmetry in the frequencies of the two nonconcordant gene trees in a three-population tree. This test was first applied to detect interbreeding between Neandertals and modern humans. We derive the analytic expectation of a test statistic, called the D statistic, which is sensitive to asymmetry under alternative demographic scenarios. We show that the D statistic is insensitive to some demographic assumptions such as ancestral population sizes and requires only the assumption that the ancestral populations were randomly mating. An important aspect of D statistics is that they can be used to detect archaic admixture even when no archaic sample is available. We explore the effect of sequencing error on the false-positive rate of the test for admixture, and we show how to estimate the proportion of archaic ancestry in the genomes of present-day populations. We also investigate a model of subdivision in ancestral populations that can result in D statistics that indicate recent admixture.
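
              The asymmetry that this abstract describes can be illustrated with a simple site-pattern count. The sketch below is a simplified illustration, not the authors' implementation: it assumes one sequence per population, ordered as (P1, P2, P3, outgroup), treats the outgroup allele as ancestral, and uses the hypothetical function name `patterson_d`. It counts the two discordant patterns, usually called ABBA and BABA, and reports D = (ABBA - BABA) / (ABBA + BABA). Values near zero are what symmetric sorting of ancestral variation would produce, and a strong excess of one pattern is the signal the test looks for.

```python
from typing import Iterable, Tuple

Site = Tuple[str, str, str, str]  # alleles at one site: (P1, P2, P3, outgroup)


def patterson_d(sites: Iterable[Site]) -> Tuple[int, int, float]:
    """Count ABBA and BABA site patterns and return (abba, baba, D).

    Assumption for this sketch: a single sequence per population, with the
    outgroup allele taken as the ancestral state "A" and the P3 allele as
    the derived state "B".
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p3 == out:
            # P3 carries the ancestral allele, so the site is uninformative.
            continue
        if p1 == out and p2 == p3:
            abba += 1  # pattern A B B A: P2 shares the derived allele with P3
        elif p2 == out and p1 == p3:
            baba += 1  # pattern B A B A: P1 shares the derived allele with P3
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return abba, baba, d


# Toy alignment columns (made-up data, for illustration only).
example_sites = [
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("G", "A", "G", "A"),  # BABA
    ("A", "A", "G", "A"),  # derived allele only in P3: ignored
    ("G", "G", "G", "A"),  # derived allele in P1, P2 and P3: ignored
]
print(patterson_d(example_sites))
```

              In practice the statistic is usually computed from allele frequencies across many loci, and its significance is assessed by resampling blocks of the genome; neither step is shown in this sketch.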

                Author and article information

                Contributors
                Role: Academic Editor
                Journal
                PLoS Biology (PLoS Biol)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1544-9173, 1545-7885
                Published: 12 February 2016
                Volume: 14, Issue: 2, Article: e1002379
                Affiliations
                [1] Department of Biology, Indiana University, Bloomington, Indiana, United States of America
                [2] Department of Plant Pathology, Physiology and Weed Science, Virginia Tech, Blacksburg, Virginia, United States of America
                [3] School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America
                Massey University, NEW ZEALAND
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: LCM MWH DCH JBP. Performed the experiments: JBP DCH. Analyzed the data: JBP MWH. Wrote the paper: LCM JBP MWH.

                Article
                Manuscript number: PBIOLOGY-D-15-02351
                DOI: 10.1371/journal.pbio.1002379
                PMC ID: 4752443
                PubMed ID: 26871574
                0885fbba-c6e9-4cde-8463-a378226be68d
                © 2016 Pease et al

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.

                History
                Received: 11 August 2015
                Accepted: 14 January 2016
                Page count
                Figures: 4, Tables: 0, Pages: 24
                Funding
                This work was supported by National Science Foundation grant Division of Environmental Biology-1136707. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Processes > Introgression
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Molecular Biology Assays and Analysis Techniques > Phylogenetic Analysis
                Research and Analysis Methods > Molecular Biology Techniques > Molecular Biology Assays and Analysis Techniques > Phylogenetic Analysis
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Systematics > Phylogenetics
                Biology and Life Sciences > Taxonomy > Evolutionary Systematics > Phylogenetics
                Computer and Information Sciences > Data Management > Taxonomy > Evolutionary Systematics > Phylogenetics
                Biology and Life Sciences > Genetics > Genetic Loci
                Biology and Life Sciences > Agriculture > Crop Science > Crops > Fruits > Tomatoes
                Biology and Life Sciences > Organisms > Plants > Fruits > Tomatoes
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Processes > Speciation
                Biology and Life Sciences > Genetics > Genomics > Plant Genomics
                Biology and Life Sciences > Biotechnology > Plant Biotechnology > Plant Genomics
                Biology and Life Sciences > Plant Science > Plant Biotechnology > Plant Genomics
                Biology and Life Sciences > Genetics > Plant Genetics > Plant Genomics
                Biology and Life Sciences > Plant Science > Plant Genetics > Plant Genomics
                Biology and Life Sciences > Biogeography > Phylogeography
                Ecology and Environmental Sciences > Biogeography > Phylogeography
                Earth Sciences > Geography > Biogeography > Phylogeography
                Biology and Life Sciences > Evolutionary Biology > Population Genetics > Phylogeography
                Biology and Life Sciences > Genetics > Population Genetics > Phylogeography
                Biology and Life Sciences > Population Biology > Population Genetics > Phylogeography
                Custom metadata
                VCF and MVF data and script files are deposited in the Dryad repository (http://dx.doi.org/10.5061/dryad.182dv). Raw reads (FASTQ files) are available from the NCBI BioProject PRJNA305880.
