      Highly contiguous assemblies of 101 drosophilid genomes

      research-article
      eLife
      eLife Sciences Publications, Ltd
      Drosophila, Drosophilidae, comparative genomics, genome assembly, nanopore, long reads, D. melanogaster, Other

          Abstract

          Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be efficiently generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of genome assemblies for 101 lines of 93 drosophilid species encompassing 14 species groups and 35 sub-groups. The genomes are highly contiguous and complete, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. We show that Nanopore-based assemblies are highly accurate in coding regions, particularly with respect to coding insertions and deletions. These assemblies, along with a detailed laboratory protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution at the scale of hundreds of species.
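          The contig N50 reported in the abstract is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `contig_n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the paper): 28 Mb in five contigs.
toy_lengths = [12_000_000, 9_000_000, 4_000_000, 2_000_000, 1_000_000]
print(contig_n50(toy_lengths))  # 9000000
```

          In this toy case the two longest contigs (12 Mb and 9 Mb) already cover more than half of the 28 Mb total, so the N50 is 9 Mb. In practice the lengths would be read from an assembly FASTA rather than typed in by hand.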

                Author and article information

                Contributors
                Role: Reviewing Editor
                Role: Senior Editor
                Journal
                eLife
                eLife Sciences Publications, Ltd
                ISSN: 2050-084X
                Published: 19 July 2021
                Volume: 10
                Article: e66405
                Affiliations
                [1] Department of Biology, Stanford University, Stanford, United States
                [2] Department of Genetics, University of North Carolina, Chapel Hill, United States
                [3] Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital, Seattle, United States
                [4] Department of Evolution and Ecology, University of California Davis, Davis, United States
                [5] School of Natural Sciences, Bangor University, Bangor, United Kingdom
                [6] Biology Department, University of North Carolina, Chapel Hill, United States
                [7] Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
                [8] Molecular and Cellular Biology Program, University of Washington, Seattle, United States
                [9] Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
                [10] Faculty of Biology, University of Belgrade, Belgrade, Serbia
                [11] University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
                [12] School of Ecology and Environmental Science, Yunnan University, Kunming, China
                [13] Hokkaido University Museum, Hokkaido University, Sapporo, Japan
                [14] Biological Laboratory, Sapporo College, Hokkaido University of Education, Sapporo, Japan
                [15] Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan
                [16] Department of Biology, University of Kentucky, Lexington, United States
                [17] Department of Biology, Indiana University, Bloomington, United States
                [18] Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
                [19] Institute of Entomology, Biology Centre, Academy of Sciences of the Czech Republic, Prague, Czech Republic
                [20] Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Stowers Institute for Medical Research, Kansas City, United States
                [21] School of Life Science, University of Nevada, Las Vegas, United States
                University of California, Davis, United States
                University of Michigan, United States
                University of California, Davis, United States
                Harvard University, United States
                Author notes
                [†] These authors contributed equally to this work.

                Author information
                https://orcid.org/0000-0002-5025-1292
                https://orcid.org/0000-0002-0673-9418
                https://orcid.org/0000-0003-3609-5702
                http://orcid.org/0000-0003-3954-2416
                http://orcid.org/0000-0002-4826-0464
                http://orcid.org/0000-0002-6433-622X
                http://orcid.org/0000-0002-1637-0933
                http://orcid.org/0000-0002-0053-1982
                http://orcid.org/0000-0003-0158-1858
                http://orcid.org/0000-0002-8391-7417
                https://orcid.org/0000-0003-1448-4678
                http://orcid.org/0000-0001-5224-0741
                https://orcid.org/0000-0002-3664-9130
                Article
                Publisher ID: 66405
                DOI: 10.7554/eLife.66405
                PMCID: PMC8337076
                PMID: 34279216
                © 2021, Kim et al

                This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

                History
                Received: 11 January 2021
                Accepted: 16 July 2021
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000057, National Institute of General Medical Sciences; Award ID: F32GM135998
                Funded by: FundRef http://dx.doi.org/10.13039/100000057, National Institute of General Medical Sciences; Award ID: R35GM118165
                Funded by: FundRef http://dx.doi.org/10.13039/100000062, National Institute of Diabetes and Digestive and Kidney Diseases; Award ID: K01DK119582
                Funded by: FundRef http://dx.doi.org/10.13039/100000001, National Science Foundation; Award ID: DEB-1457707
                Funded by: FundRef http://dx.doi.org/10.13039/100000057, National Institute of General Medical Sciences; Award ID: R01GM121750
                Funded by: FundRef http://dx.doi.org/10.13039/100000057, National Institute of General Medical Sciences; Award ID: R01GM125715
                Funded by: FundRef http://dx.doi.org/10.13039/100006785, Google; Award ID: Google Cloud Research Credits
                Funded by: FundRef http://dx.doi.org/10.13039/100000057, National Institute of General Medical Sciences; Award ID: R35GM122592
                Funded by: FundRef http://dx.doi.org/10.13039/100000057, National Institute of General Medical Sciences; Award ID: R35GM119816
                Funded by: FundRef http://dx.doi.org/10.13039/100008732, Uehara Memorial Foundation; Award ID: 201931028
                Funded by: FundRef http://dx.doi.org/10.13039/501100004564, Ministry of Education, Science and Technological Development of the Republic of Serbia; Award ID: 451-03-68/2020-14/200178
                Funded by: FundRef http://dx.doi.org/10.13039/501100004564, Ministry of Education, Science and Technological Development of the Republic of Serbia; Award ID: 451-03-68/2020-14/200007
                Funded by: FundRef http://dx.doi.org/10.13039/501100001809, National Natural Science Foundation of China; Award ID: 32060112
                Funded by: FundRef http://dx.doi.org/10.13039/501100001691, Japan Society for the Promotion of Science; Award ID: JP18K06383
                Funded by: FundRef http://dx.doi.org/10.13039/501100007601, Horizon 2020 - Research and Innovation Framework Programme; Award ID: 765937-CINCHRON
                Funded by: FundRef http://dx.doi.org/10.13039/501100001824, Czech Science Foundation; Award ID: 19-13381S
                Funded by: FundRef http://dx.doi.org/10.13039/501100001691, Japan Society for the Promotion of Science; Award ID: JP19H03276
                Funded by: FundRef http://dx.doi.org/10.13039/100000001, National Science Foundation; Award ID: 1345247
                The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
                Categories
                Tools and Resources
                Evolutionary Biology
                Genetics and Genomics
                Custom metadata
                One hundred one high-quality drosophilid genomes are released, along with low-cost assembly workflows, as an open community resource for studying genetics, ecology, and evolution in this important model system.

                Life sciences
