      Is Open Access

      Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)

      research-article


          Abstract

          Genome assembly of short reads from large plant genomes remains a challenge in computational biology despite major developments in next-generation sequencing. Several draft assemblies of sequenced plant genomes have been reported recently. The reported draft genome assemblies of Cajanus cajan differ in genome completeness and contain large numbers of repeats, gaps, and segmental duplications. Draft assemblies with portions of the genome missing are shorter than the original reference genome. These assemblies have low map accuracy, which affects downstream functional annotation and the prediction of gene components that crop researchers need. Genome coverage, i.e., the number of sequenced raw reads mapped onto a given location of the genome, is an important indicator of completeness and assembly quality in draft assemblies. The present work aimed to improve the coverage of the reported de novo sequenced draft genomes (GCA_000340665.1 and GCA_000230855.2) of pigeonpea, a legume widely cultivated in India. The two recently sequenced assemblies, A1 and A2, comprised 72% and 75% of the estimated coverage of the genome, respectively. We employed an assembly reconciliation approach to compare the draft assemblies and merge them, filling the gaps with an algorithm that size-sorts the mate-pair library, to generate a high-quality, near-complete assembly with enhanced contiguity. The majority of gaps within scaffolds were filled with right-sized mate-pair reads. The improved assembly had fewer gaps than the reported draft assemblies, resulting in an improved genome coverage of 82.4%. Map accuracy of the improved assembly was evaluated using various quality metrics and by checking for the presence of specific trait-related functional genes. The paired-end and mate-pair local libraries we employed helped reduce gaps, repeats, and other sequence errors, resulting in longer scaffolds than in the two draft assemblies. We report the prediction of putative host resistance genes against Fusarium wilt disease and evaluated their performance under both wet-laboratory and field phenotypic conditions.
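
          As a rough illustration of the gap and contiguity measures mentioned in the abstract, the Python sketch below counts gaps in scaffold sequences, computes N50, and estimates what fraction of an estimated genome size is represented by non-gap bases. It is not the authors' pipeline. It assumes that gaps are marked as runs of N, a common convention in scaffold FASTA files, and the function names, toy sequences, and the 32-base "genome size" are hypothetical.

```python
import re
from typing import Dict, List

# Assumption: gaps inside scaffolds are encoded as runs of N (or n).
GAP_RE = re.compile(r"[Nn]+")


def gap_stats(scaffolds: Dict[str, str]) -> Dict[str, int]:
    """Return total bases, number of gaps, and gap bases over all scaffolds."""
    gap_count = 0
    gap_bases = 0
    for seq in scaffolds.values():
        for match in GAP_RE.finditer(seq):
            gap_count += 1
            gap_bases += match.end() - match.start()
    total_bases = sum(len(seq) for seq in scaffolds.values())
    return {
        "total_bases": total_bases,
        "gap_count": gap_count,
        "gap_bases": gap_bases,
    }


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    if not lengths:
        return 0
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


def assembled_fraction(scaffolds: Dict[str, str], estimated_genome_size: int) -> float:
    """Non-gap bases divided by a (hypothetical) estimated genome size."""
    if estimated_genome_size <= 0:
        raise ValueError("estimated_genome_size must be positive")
    stats = gap_stats(scaffolds)
    return (stats["total_bases"] - stats["gap_bases"]) / estimated_genome_size


if __name__ == "__main__":
    # Toy scaffolds, invented for illustration only.
    toy = {"s1": "ACGTNNNNACGT", "s2": "ACGTAC", "s3": "NNAC"}
    print(gap_stats(toy))
    print(n50([len(seq) for seq in toy.values()]))
    print(assembled_fraction(toy, estimated_genome_size=32))
```

          Comparing these numbers before and after gap filling gives a simple view of whether a merged assembly has fewer gaps and longer scaffolds than its inputs; dedicated tools such as QUAST report these and many further metrics.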


                Author and article information

                Contributors
                Journal
                Frontiers in Genetics (Front. Genet.)
                Frontiers Media S.A.
                ISSN: 1664-8021
                15 December 2020
                Volume 11, Article 607432
                Affiliations
                [1] Indian Council for Agricultural Research (ICAR)-National Bureau of Plant Genetic Resources, New Delhi, India
                [2] Directorate of Education, Rani Lakshmi Bai Central Agricultural University, Jhansi, India
                [3] School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
                Author notes

                Edited by: Yuriy L. Orlov, I.M. Sechenov First Moscow State Medical University, Russia

                Reviewed by: Subhojit Datta, Indian Council of Agricultural Research, India; Vishal Acharya, Institute of Himalayan Bioresource Technology (CSIR), India

                These authors have contributed equally to this work

                This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics

                Article
                DOI: 10.3389/fgene.2020.607432
                PMCID: PMC7770131
                PMID: 33384719
                e5031897-6e53-4185-bcf5-1dd43aabd04e
                Copyright © 2020 Marla, Mishra, Maurya, Singh, Wankhede, Kumar, Yadav, Subbarao, Singh and Kumar.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 17 September 2020
                Accepted: 23 November 2020
                Page count
                Figures: 4, Tables: 4, Equations: 0, References: 38, Pages: 12, Words: 0
                Categories
                Genetics
                Original Research

                Genetics
                assembly improvement, reconciliation, mate-pairs, disease resistance, pigeonpea genome
