      Is Open Access

      A bioinformatic platform to integrate target capture and whole genome sequences of various read depths for phylogenomics

      research-article


          Abstract

          The increasing availability of short‐read whole genome sequencing (WGS) provides unprecedented opportunities to study ecological and evolutionary processes. Although loci of interest can be extracted from WGS data and combined with target sequence data, this requires suitable bioinformatic workflows. Here, we test different assembly and locus extraction strategies and implement them in secapr, a pipeline that processes short‐read data into multilocus alignments for phylogenetics and molecular ecology analyses. We integrate the processing of data from low‐coverage WGS (<30×) and target sequence capture into a flexible framework, while optimizing de novo contig assembly and loci extraction. Specifically, we test different assembly strategies by contrasting their ability to recover loci from targeted butterfly protein‐coding genes, using four data sets: WGS data at three average coverages (10×, 5× and 2×) and a data set for which these loci were enriched prior to sequencing via target sequence capture. Using the resulting de novo contigs, we account for potential errors within contigs and infer phylogenetic trees to evaluate the ability of each assembly strategy to recover species relationships. We demonstrate that using multiple k‐mer sizes simultaneously for assembly results in the highest yield of extracted loci from de novo assembled contigs, while data sets derived from sequencing read depths as low as 5× recover the expected species relationships in phylogenetic trees. By making the tested assembly approaches available in the secapr pipeline, we hope to inspire future studies to incorporate complementary data and make an informed choice on the optimal assembly strategy.
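The coverage figures in the abstract (10×, 5× and 2×) follow the usual definition of average read depth: total sequenced bases divided by genome size. The sketch below only illustrates that arithmetic and the fraction of reads one would keep to reach a lower target depth. It is not the authors' procedure and not part of secapr; the function names, genome size, read length and read count are hypothetical values chosen for the example.

```python
def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average read depth = total sequenced bases / genome size (in bp)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def subsample_fraction(current_cov: float, target_cov: float) -> float:
    """Fraction of reads to keep so the expected depth drops to target_cov.

    Capped at 1.0: subsampling cannot raise coverage.
    """
    if current_cov <= 0:
        raise ValueError("current_cov must be positive")
    return min(1.0, target_cov / current_cov)


if __name__ == "__main__":
    # Hypothetical sample: 80 million 150 bp reads on a 400 Mb genome.
    cov = mean_coverage(n_reads=80_000_000, read_length=150, genome_size=400_000_000)
    print(f"starting coverage: {cov:.1f}x")
    for target in (10, 5, 2):
        frac = subsample_fraction(cov, target)
        print(f"keep {frac:.3f} of reads for ~{target}x")
```

The resulting fraction could then be passed to whichever read-subsampling tool a workflow already uses.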


                Author and article information

                Contributors
                degusp00@prf.jcu.cz
                Journal
                Mol Ecol
                10.1111/(ISSN)1365-294X
                MEC
                Molecular Ecology
                John Wiley and Sons Inc. (Hoboken)
                0962-1083
                1365-294X
                31 October 2021
                December 2021
                Volume: 30
                Issue: 23, WHOLE GENOME SEQUENCING IN MOLECULAR ECOLOGY (doiID: 10.1111/mec.v30.23)
                Pages: 6021-6035
                Affiliations
                [ 1 ] Biology Centre of the Czech Academy of Sciences Institute of Entomology České Budějovice Czech Republic
                [ 2 ] Faculty of Science University of South Bohemia České Budějovice Czech Republic
                [ 3 ] Department of Biological and Environmental Sciences University of Gothenburg Gothenburg Sweden
                [ 4 ] Gothenburg Global Biodiversity Centre Gothenburg Sweden
                [ 5 ] Department of Biology University of Fribourg Fribourg Switzerland
                [ 6 ] Swiss Institute of Bioinformatics Fribourg Switzerland
                [ 7 ] Royal Botanic Gardens, Kew Richmond UK
                [ 8 ] Department of Plant Sciences University of Oxford Oxford UK
                Author notes
                [*] Correspondence

                Pedro de Gusmão Ribeiro, Biology Centre of the Czech Academy of Sciences, Institute of Entomology, České Budějovice, Czech Republic and Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic.

                Email: degusp00@prf.jcu.cz

                Author information
                https://orcid.org/0000-0001-5964-1978
                https://orcid.org/0000-0003-2341-2705
                https://orcid.org/0000-0002-2885-4919
                Article
                MEC16240
                10.1111/mec.16240
                9298010
                34674330
                155e72f2-a39a-4e4d-830d-f26d9ce3d29f
                © 2021 The Authors. Molecular Ecology published by John Wiley & Sons Ltd.

                This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

                History
                Revised: 24 September 2021
                Received: 30 November 2020
                Accepted: 16 October 2021
                Page count
                Figures: 6, Tables: 2, Pages: 0, Words: 10330
                Funding
                Funded by: Swedish Research Council
                Award ID: 2017‐04980
                Award ID: 2019‐04739
                Funded by: Swedish Foundation for Strategic Research
                Funded by: Royal Botanic Gardens, Kew
                Funded by: Grant Agency of the Czech Republic
                Award ID: GJ20‐18566Y
                Funded by: Marie Skłodowska‐Curie Fellowship of the European Commission
                Award ID: MARIPOSAS‐704035
                Categories
                Special Issue
                Methodological Approaches and Advances for WGS

                Ecology
                de novo assembly, loci extraction, low‐coverage whole genome sequencing, secapr, target sequence capture
