      An improved reference of the grapevine genome reasserts the origin of the PN40024 highly homozygous genotype

      research-article


          Abstract

          The genome sequence of the diploid and highly homozygous Vitis vinifera genotype PN40024 serves as the reference for many grapevine studies. Despite several improvements to the PN40024 genome assembly, its current version PN12X.v2 is quite fragmented and only represents the haploid state of the genome with mixed haplotypes. In fact, being nearly homozygous, this genome contains several heterozygous regions that are yet to be resolved. Taking the opportunity of improvements that long-read sequencing technologies offer to fully discriminate haplotype sequences, an improved version of the reference, called PN40024.v4, was generated. Through incorporating long genomic sequencing reads to the assembly, the continuity of the 12X.v2 scaffolds was highly increased with a total number decreasing from 2,059 to 640 and a reduction in N bases of 88%. Additionally, the full alternative haplotype sequence was built for the first time, the chromosome anchoring was improved and the number of unplaced scaffolds was reduced by half. To obtain a high-quality gene annotation that outperforms previous versions, a liftover approach was complemented with an optimized annotation workflow for Vitis. Integration of the gene reference catalogue and its manual curation have also assisted in improving the annotation, while defining the most reliable estimation of 35,230 genes to date. Finally, we demonstrated that PN40024 resulted from 9 selfings of cv. “Helfensteiner” (cross of cv. “Pinot noir” and “Schiava grossa”) instead of a single “Pinot noir”. These advances will help maintain the PN40024 genome as a gold-standard reference, also contributing toward the eventual elaboration of the grapevine pangenome.
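          The scaffold counts quoted in the abstract (2,059 in 12X.v2, 640 in PN40024.v4) can be turned into a relative reduction for comparison with the 88% figure given for N bases. The following is only a minimal arithmetic sketch: the helper name `relative_reduction` and the variable names are illustrative assumptions, not part of the authors' workflow.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Scaffold counts as stated in the abstract (hypothetical variable names).
scaffolds_12x_v2 = 2059
scaffolds_pn40024_v4 = 640

reduction = relative_reduction(scaffolds_12x_v2, scaffolds_pn40024_v4)
print(f"Scaffold count reduced by {reduction:.1%}")  # prints 68.9%
```

          By this calculation the scaffold count fell by roughly 69%, which is a smaller relative change than the 88% reduction the abstract reports for N bases.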


                Author and article information

                Contributors
                Role: Editor
                Journal
                G3 (Bethesda)
                Genetics
                g3journal
                G3: Genes|Genomes|Genetics
                Oxford University Press (US)
                2160-1836
                May 2023
                26 March 2023
                Volume: 13
                Issue: 5
                Article ID: jkad067
                Affiliations
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Bielefeld 33615, Germany
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Bielefeld 33615, Germany
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                Unidad de Hortofruticultura, Centro de Investigación y Tecnología Agroalimentaria de Aragón (CITA), Zaragoza 50059, Spain
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna 46908, Valencia, Spain
                Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna 46908, Valencia, Spain
                Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna 46908, Valencia, Spain
                Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
                Dipartimento di Biotecnologie, Università degli Studi di Verona, Verona 37134, Italy
                Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
                USDA ARS NEA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
                SVQV, INRAE—University of Strasbourg, Colmar 68000, France
                Author notes
                Corresponding author: Email: camille.rustenholz@inrae.fr

                Amandine Velt and Bianca Frommer contributed equally.

                Conflicts of interest: The author(s) declare no conflict of interest.

                Author information
                https://orcid.org/0000-0003-2368-839X
                https://orcid.org/0000-0002-5792-0102
                https://orcid.org/0000-0003-2501-548X
                https://orcid.org/0000-0002-1062-4576
                https://orcid.org/0000-0003-2712-1892
                https://orcid.org/0000-0001-6089-3048
                https://orcid.org/0000-0002-3265-4012
                https://orcid.org/0000-0002-1641-9274
                https://orcid.org/0000-0002-9196-1813
                https://orcid.org/0000-0003-1623-3039
                https://orcid.org/0000-0003-2756-4028
                https://orcid.org/0000-0002-7499-5368
                https://orcid.org/0000-0002-9571-0747
                https://orcid.org/0000-0002-8125-3821
                https://orcid.org/0000-0001-5355-3408
                Article
                jkad067
                10.1093/g3journal/jkad067
                10151409
                36966465
                0426be3a-84e5-469c-8eda-bc1e86b5f08a
                © The Author(s) 2023. Published by Oxford University Press on behalf of the Genetics Society of America.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 22 January 2023
                : 20 March 2023
                : 04 April 2023
                Page count
                Pages: 14
                Funding
                Funded by: INRAE, doi 10.13039/501100022077;
                Funded by: Biologie et Amélioration des Plantes;
                Funded by: German Network for Bioinformatics Infrastructure, doi 10.13039/501100018929;
                Funded by: European Cooperation in Science and Technology, doi 10.13039/501100000921;
                Categories
                Genome Report
                AcademicSubjects/SCI01180
                AcademicSubjects/SCI01140

                Genetics
                vitis vinifera,genotype pn40024,reference genome,long reads,improved annotation