      Large scale genome skimming from herbarium material for accurate plant identification and phylogenomics

      research-article


          Abstract

          Background

          Herbaria are valuable sources of extensive curated plant material that are now accessible to genetic studies because of advances in high-throughput, next-generation sequencing methods. As an applied assessment of large-scale recovery of plastid and ribosomal genome sequences from herbarium material for plant identification and phylogenomics, we sequenced 672 samples covering 21 families, 142 genera and 530 named and proposed named species. We explored the impact of parameters such as sample age, DNA concentration and quality, read depth and fragment length on plastid assembly error. We also tested the efficacy of DNA sequence information for identifying plant samples using 45 specimens recently collected in the Pilbara.
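Read depth, one of the parameters examined above, follows from simple arithmetic: total sequenced bases divided by the length of the target genome. The sketch below illustrates that relationship only. The function name and the example numbers (read count, read length, plastome size) are assumptions chosen for illustration and are not taken from the article.

```python
def mean_read_depth(num_reads: int, read_length: int, genome_length: int) -> float:
    """Expected mean depth = total sequenced bases / target genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_reads * read_length / genome_length


# Hypothetical example: 40,000 plastid-derived reads of 125 bp
# mapped to a 150,000 bp plastome.
depth = mean_read_depth(40_000, 125, 150_000)
print(f"mean depth: {depth:.1f}x")
```

In practice only a fraction of the reads in a genome-skimming library derive from the plastid, so `num_reads` here should be the count of reads assigned to the target, not the size of the whole library.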

          Results

          Genome skimming was effective at producing genomic information at large scale. Substantial sequence information on the chloroplast genome was obtained from 96.1% of samples, and complete or near-complete sequences of the nuclear ribosomal RNA gene repeat were obtained from 93.3% of samples. We were able to extract sequences for the core DNA barcode regions rbcL and matK from 96% and 93.3% of samples, respectively. Read quality and DNA fragment length had significant effects on sequencing outcomes, and error correction of reads proved essential. Assembly problems were specific to certain taxa with low GC and high repeat content (Goodenia, Scaevola, Cyperus, Bulbostylis, Fimbristylis), suggesting biological rather than technical explanations. The structure of related genomes was needed to guide the assembly of repeats that exceeded the read length. DNA-based matching proved highly effective and showed that the efficacy for species identification declined in the order cpDNA >> rDNA > matK >> rbcL.
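The abstract does not describe how DNA-based matching was carried out. As a rough illustration of the general idea only, the sketch below assigns a query sequence to the reference with the highest k-mer Jaccard similarity. The similarity measure, the function names, the value of k, and the toy sequences are all assumptions made for this example; they are not the authors' method or data.

```python
def kmers(seq: str, k: int = 4) -> set:
    """Return the set of overlapping k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query: str, references: dict, k: int = 4) -> tuple:
    """Return (name, score) of the reference most similar to the query."""
    q = kmers(query, k)
    scores = {name: jaccard(q, kmers(seq, k)) for name, seq in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy reference set with made-up sequences and hypothetical labels.
references = {
    "species_A": "ATGTCACCACAAACAGAGACTAAAGC",
    "species_B": "ATGGAGGAATTCCAAAGATATTTACA",
}

name, score = best_match("ATGTCACCACAAACAGAGACTAAAGC", references)
print(name, score)
```

Longer markers give such a comparison more characters to work with, which is at least consistent with the reported ranking in which whole cpDNA outperforms single barcode regions.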

          Conclusions

          We showed that a large-scale approach to genome sequencing using herbarium specimens produces high-quality complete cpDNA and rDNA sequences as a source of data for DNA barcoding and phylogenomics.


                Author and article information

                Contributors
                paul.nevill@curtin.edu.au
                xiao.zhong@research.uwa.edu.au
                julian.tonti-filippini@uwa.edu.au
                margaret.byrne@dbca.wa.gov.au
                michael.hislop@dbca.wa.gov.au
                kevin.thiele@uwa.edu.au
                stephen.vanleeuwen@dbca.wa.gov.au
                laura.boykin@uwa.edu.au
                ian.small@uwa.edu.au
                Journal
                Plant Methods
                BioMed Central (London)
                ISSN: 1746-4811
                Published: 4 January 2020
                Volume: 16
                Issue: 1
                Affiliations
                [1] ISNI 0000 0004 0375 4078, GRID grid.1032.0, Australian Research Council Centre for Mine Site Restoration, School of Molecular and Life Sciences, Curtin University, GPO Box U1987, Perth, WA 6102, Australia
                [2] ISNI 0000 0004 1936 7910, GRID grid.1012.2, School of Biological Sciences, The University of Western Australia, Crawley, WA 6009, Australia
                [3] Kings Park and Botanic Garden, Fraser Ave, Kings Park, WA 6005, Australia
                [4] ISNI 0000 0004 1936 7910, GRID grid.1012.2, Australian Research Council Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley, WA 6009, Australia
                [5] ISNI 0000 0004 1936 7910, GRID grid.1012.2, School of Molecular Sciences, The University of Western Australia, Crawley, WA 6009, Australia
                [6] Biodiversity and Conservation Science, Department of Biodiversity, Conservation and Attractions, Locked Bag 104, Bentley Delivery Centre, Bentley, WA 6983, Australia
                [7] ISNI 0000 0004 0375 4078, GRID grid.1032.0, School of Molecular and Life Sciences, Curtin University, GPO Box U1987, Perth, WA 6102, Australia
                Author information
                http://orcid.org/0000-0003-2060-906X
                Article
                534
                DOI: 10.1186/s13007-019-0534-5
                PMCID: PMC6942304
                PMID: 31911810
                ed9ee479-89b9-4138-aa42-4517edbf909f
                © The Author(s) 2020

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 14 July 2019
                Accepted: 27 November 2019
                Funding
                Funded by: Bioplatforms Australia
                Funded by: Fortescue Metals Group Ltd
                Funded by: Department of Biodiversity, Conservation and Attractions (FundRef http://dx.doi.org/10.13039/501100002287)
                Funded by: University of Western Australia
                Funded by: Australian Research Council Industrial Transformation Training Centre for Mine Site Restoration
                Award ID: ICI150100041
                Categories
                Methodology

                Plant science & Botany
                chloroplast, genome skimming, herbarium specimens, next-generation sequencing, Pilbara, plant DNA barcoding, plastid genome
