      A highly contiguous genome assembly of the bat hawkmoth Hyles vespertilio (Lepidoptera: Sphingidae)

      research-article

          Abstract

          Background

          Adapted to different ecological niches, moth species belonging to the Hyles genus exhibit a spectacular diversity of larval color patterns. These species diverged ∼7.5 million years ago, making this rather young genus an interesting system to study a wide range of questions including the process of speciation, ecological adaptation, and adaptive radiation.

          Results

          Here we present a high-quality genome assembly of the bat hawkmoth Hyles vespertilio, the first reference genome of a member of the Hyles genus. We generated 51× Pacific Biosciences long reads with an average read length of 8.9 kb. Pacific Biosciences reads longer than 4 kb were assembled into contigs, resulting in a 651.4-Mb assembly consisting of 530 contigs with an N50 value of 7.5 Mb. The circular mitochondrial contig has a length of 15,303 bp. The H. vespertilio genome is very repeat-rich and exhibits a higher repeat content (50.3%) than other Bombycoidea species such as Bombyx mori (45.7%) and Manduca sexta (27.5%). We developed a comprehensive gene annotation workflow to obtain consensus gene models from different evidence including gene projections, protein homology, transcriptome data, and ab initio predictions. The resulting gene annotation is highly complete with 94.5% of BUSCO genes being completely present, which is higher than the BUSCO completeness of the B. mori (92.2%) and M. sexta (90%) annotations.
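For readers unfamiliar with the N50 contiguity statistic quoted above, the following is a minimal sketch of how it is conventionally computed: sort the contig lengths in descending order and report the length of the contig at which the running total first reaches half of the total assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from this assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of positive contig lengths (conventional definition)."""
    # Walk contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to >= 50% of the assembly defines N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for positive lengths")


# Toy example (hypothetical lengths): total = 300, half = 150.
# 100 + 80 = 180 >= 150, so the N50 is 80.
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)
```

Under this definition, an N50 of 7.5 Mb means that contigs of at least 7.5 Mb together account for at least half of the 651.4-Mb assembly.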

          Conclusions

          Our gene annotation strategy has general applicability to other genomes, and the H. vespertilio genome provides a valuable molecular resource to study a range of questions in this genus, including phylogeny, incomplete lineage sorting, speciation, and hybridization. A genome browser displaying the genome, alignments, and annotations is available at https://genome-public.pks.mpg.de/cgi-bin/hgTracks?db=HLhylVes1.

                Author and article information

                Journal
                GigaScience
                Oxford University Press
                ISSN: 2047-217X
                Published: 23 January 2020
                Volume: 9
                Issue: 1
                Article: giaa001
                Affiliations
                [1] Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, 01307 Dresden, Germany
                [2] Center for Systems Biology Dresden, Pfotenhauerstr. 108, 01307 Dresden, Germany
                [3] Max Planck Institute for the Physics of Complex Systems, Nöthnitzer Str. 38, 01187 Dresden, Germany
                [4] Senckenberg Natural History Collections Dresden, Königsbrücker Landstr. 159, 01109 Dresden, Germany
                [5] Department of Entomology, Max Planck Institute for Chemical Ecology, Hans-Knoell-Str. 8, 07745 Jena, Germany
                Author notes
                Correspondence address. Michael Hiller, Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, 01307 Dresden, Germany. E-mail: hiller@mpi-cbg.de
                Correspondence address. Anna K. Hundsdoerfer, Senckenberg Natural History Collections Dresden, Königsbrücker Landstr. 159, 01109 Dresden, Germany. E-mail: anna.hundsdoerfer@senckenberg.de

                Joint first authorship. *Joint senior authorship.

                Author information
                http://orcid.org/0000-0003-3024-1449
                http://orcid.org/0000-0001-5594-4154
                Article
                Article ID: giaa001
                DOI: 10.1093/gigascience/giaa001
                PMC ID: 6977585
                PubMed ID: 31972020
                ScienceOpen ID: 42eaef31-8ff3-463e-b90f-626bb42f549e
                © The Author(s) 2020. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 18 October 2019
                Revised: 20 December 2019
                Accepted: 08 January 2020
                Page count
                Pages: 10
                Funding
                Funded by: Federal Ministry of Education and Research 10.13039/501100002347
                Award ID: 01IS18026C
                Funded by: Deutsche Forschungsgemeinschaft 10.13039/501100001659
                Award ID: HI 1423/3-1
                Award ID: HU 1561/5-1
                Award ID: RE 603/25-1
                Categories
                Data Note

                Keywords: genome assembly, PacBio long reads, hawkmoth–silk moth comparison, gene annotation
