      The phased telomere-to-telomere reference genome of Musa acuminata, a main contributor to banana cultivars

      data-paper

          Abstract

          Musa acuminata is a main wild contributor to banana cultivars. Here, we report a haplotype-resolved, telomere-to-telomere reference genome of M. acuminata assembled from PacBio HiFi reads, Nanopore ultra-long reads, and Hi-C data. The sizes of the two haploid assemblies were estimated to be 469.83 Mb and 470.21 Mb, respectively. Multiple assessments confirmed the contiguity (contig N50: 16.53 Mb and 18.58 Mb; LAI: 20.18 and 19.48), completeness (BUSCOs: 98.57% and 98.57%), and correctness (QV: 45.97 and 46.12) of the genome. Repetitive sequences accounted for about half of the genome. In total, 40,889 and 38,269 protein-coding genes were annotated in the two haploid assemblies, respectively, of which 9.56% and 3.37% were newly predicted. Genome comparison identified a large reciprocal translocation involving 3 Mb and 10 Mb from chromosomes 01 and 04 within M. acuminata. This reference genome of M. acuminata provides a valuable resource for further understanding subgenome evolution in Musa species and for the precise genetic improvement of banana.
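
          The abstract reports contiguity as contig N50 and correctness as QV without defining either metric. The sketch below illustrates the conventional definitions only: N50 as the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total, and QV as a Phred-scaled error rate (QV = -10 log10 of the per-base error probability). The function names and the toy contig lengths are hypothetical, and treating the reported QV as Phred-scaled is an assumption; the record does not state which tools or settings the authors used.

```python
import math


def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability.

    Assumes the usual definition QV = -10 * log10(error probability).
    """
    return 10 ** (-qv / 10)


if __name__ == "__main__":
    # Hypothetical contig lengths (in Mb), not taken from the article.
    toy_contigs = [2, 3, 4, 5, 6]
    print("toy N50:", n50(toy_contigs))

    # QV values as reported in the abstract for the two haploid assemblies.
    for qv in (45.97, 46.12):
        print(f"QV {qv} -> error rate {qv_to_error_rate(qv):.2e}")
```

          Under that Phred assumption, a QV of 45.97 corresponds to about 2.5 × 10^-5 errors per base, or roughly one error in every 40,000 bases.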

                Author and article information

                Contributors
                huirun.huang@scbg.ac.cn
                Journal
                Scientific Data (Sci Data)
                Publisher: Nature Publishing Group UK (London)
                ISSN: 2052-4463
                Published: 16 September 2023
                Volume: 10
                Article number: 631
                Affiliations
                [1] Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650 China (GRID grid.9227.e, ISNI 0000 0001 1957 3309)
                [2] South China National Botanical Garden, Guangzhou, 510650 China
                [3] University of Chinese Academy of Sciences, Beijing, 100049 China (https://ror.org/05qbk4x57)
                [4] National Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China (GRID grid.410727.7, ISNI 0000 0001 0526 1937)
                [5] School of Marine Sciences and Biotechnology, Guangxi University for Nationalities, Nanning, 530008 China (https://ror.org/0495efn48)
                [6] National Key Laboratory of Tropical Crop Breeding, Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou, 571101 China (GRID grid.453499.6, ISNI 0000 0000 9835 1415)
                Author information
                http://orcid.org/0000-0003-4175-9358
                http://orcid.org/0000-0003-0780-2973
                http://orcid.org/0000-0002-4656-5627
                Article
                2546
                DOI: 10.1038/s41597-023-02546-9
                10505225
                37716992
                985b6309-e020-480e-813f-d9e22f6d6b7c
                © Springer Nature Limited 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 6 June 2023
                Accepted: 5 September 2023
                Funding
                Funded by: National Natural Science Foundation of China (FundRef https://doi.org/10.13039/501100001809)
                Award ID: 32070237, 31261140366
                Funded by: Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDB31000000
                Categories
                Data Descriptor
                structural variation, comparative genomics, DNA sequencing, natural variation in plants
