      Is Open Access

      DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing

      research-article


          Abstract

          Long-read single-molecule sequencing technologies, such as PacBio circular consensus sequencing (CCS) and nanopore sequencing, are advantageous for detecting DNA 5-methylcytosine in CpGs (5mCpGs), especially in repetitive genomic regions. However, existing methods for detecting 5mCpGs using PacBio CCS have limited accuracy and robustness. Here, we present ccsmeth, a deep-learning method to detect DNA 5mCpGs using CCS reads. To train ccsmeth, we sequence polymerase-chain-reaction-treated and M.SssI-methyltransferase-treated DNA of one human sample using PacBio CCS. Using long (≥10 kb) CCS reads, ccsmeth achieves 0.90 accuracy and 0.97 Area Under the Curve on 5mCpG detection at single-molecule resolution. At the genome-wide site level, ccsmeth achieves >0.90 correlations with bisulfite sequencing and nanopore sequencing using only 10× reads. Furthermore, we develop a Nextflow pipeline, ccsmethphase, to detect haplotype-aware methylation using CCS reads, and sequence a Chinese family trio to validate it. ccsmeth and ccsmethphase can be robust and accurate tools for detecting DNA 5-methylcytosines.
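To make the site-level comparison in the abstract concrete, the following is a minimal sketch of how per-read 5mCpG calls could be aggregated into per-site methylation frequencies and then correlated with a reference such as bisulfite sequencing. It is not ccsmeth's implementation: the function names `site_frequencies` and `pearson`, the `min_coverage` parameter, and all data values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import sqrt


def site_frequencies(read_calls, min_coverage=1):
    """Aggregate per-read calls into per-site methylation frequencies.

    read_calls: iterable of (chrom, pos, is_methylated) tuples, one per
    read covering a CpG site (a hypothetical input layout).
    Sites covered by fewer than min_coverage reads are dropped.
    """
    counts = defaultdict(lambda: [0, 0])  # site -> [methylated, total]
    for chrom, pos, is_methylated in read_calls:
        entry = counts[(chrom, pos)]
        entry[0] += int(is_methylated)
        entry[1] += 1
    return {
        site: methylated / total
        for site, (methylated, total) in counts.items()
        if total >= min_coverage
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy per-read calls (made-up values, not data from the article).
calls = [
    ("chr1", 100, True), ("chr1", 100, True),
    ("chr1", 100, False), ("chr1", 100, True),
    ("chr1", 200, False), ("chr1", 200, False),
    ("chr1", 300, True), ("chr1", 300, False),
]
# Toy reference frequencies, standing in for bisulfite-derived values.
reference = {("chr1", 100): 0.8, ("chr1", 200): 0.1, ("chr1", 300): 0.5}

freqs = site_frequencies(calls, min_coverage=2)
shared = sorted(set(freqs) & set(reference))  # compare only common sites
r = pearson([freqs[s] for s in shared], [reference[s] for s in shared])
print(f"sites compared: {len(shared)}, Pearson r = {r:.3f}")
```

In a real comparison the coverage threshold matters: with only a few reads per site, frequencies take coarse values (for example 0, 0.5 or 1 at 2× coverage), which tends to lower the correlation with a deeply sequenced reference.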

          Abstract

          Existing methods for detecting DNA methylation (5mC) from PacBio CCS reads have limited accuracy and robustness. Here, the authors develop a deep learning tool, ccsmeth, and a Nextflow pipeline, ccsmethphase, for genome-wide 5mCpG detection and phasing with high accuracy from human CCS reads.


                Author and article information

                Contributors
                xiaochuanle@126.com
                luofeng@clemson.edu
                jxwang@mail.csu.edu.cn
                Journal
                Nature Communications (Nat Commun)
                Nature Publishing Group UK (London)
                ISSN: 2041-1723
                Published: 8 July 2023
                Volume: 14
                Article number: 4054
                Affiliations
                [1] School of Computer Science and Engineering, Central South University, Changsha, 410083 China (GRID grid.216417.7, ISNI 0000 0001 0379 7164)
                [2] Xiangjiang Laboratory, Changsha, 410205 China
                [3] Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083 China (GRID grid.216417.7, ISNI 0000 0001 0379 7164)
                [4] Bioinformatics Center, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, 410000 China (GRID grid.452223.0, ISNI 0000 0004 1757 7615)
                [5] Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, 410000 China (GRID grid.216417.7, ISNI 0000 0001 0379 7164)
                [6] State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, #7 Jinsui Road, Tianhe District, Guangzhou, China (GRID grid.12981.33, ISNI 0000 0001 2360 039X)
                [7] School of Computing, Clemson University, Clemson, SC 29634-0974 USA (GRID grid.26090.3d, ISNI 0000 0001 0665 0280)
                Author information
                http://orcid.org/0000-0002-0801-7574
                http://orcid.org/0009-0005-9967-4244
                http://orcid.org/0000-0003-3335-9303
                http://orcid.org/0000-0002-4680-0682
                http://orcid.org/0000-0002-4813-2403
                http://orcid.org/0000-0003-1516-0480
                Article
                Publisher ID: 39784
                DOI: 10.1038/s41467-023-39784-9
                PMCID: PMC10329642
                PMID: 37422489
                c053276b-580f-4912-8117-450d11165ef8
                © The Author(s) 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 20 November 2022
                Accepted: 22 June 2023
                Funding
                Funded by: United States Department of Agriculture | National Institute of Food and Agriculture (NIFA), FundRef https://doi.org/10.13039/100005825
                Award ID: 2017-70016-26051
                Funded by: NSF | BIO | Division of Biological Infrastructure (DBI), FundRef https://doi.org/10.13039/100000153
                Award ID: ABI-1759856
                Categories
                Article
                Custom metadata
                © Springer Nature Limited 2023

                Uncategorized
                computational models, epigenomics, computational biology and bioinformatics, computational platforms and environments, data processing
