      Statistical Quantification of Methylation Levels by Next-Generation Sequencing

      PLoS ONE
      Public Library of Science


          Abstract

          Background/Aims

          Recently, next-generation sequencing-based technologies have enabled DNA methylation profiling at high resolution and low cost. Methyl-Seq and Reduced Representation Bisulfite Sequencing (RRBS) are two such technologies that interrogate methylation levels at CpG sites throughout the entire human genome. With the rapid reduction in sequencing costs, these technologies will enable epigenotyping of large cohorts for phenotypic association studies. Existing quantification methods for sequencing-based methylation profiling are simplistic and do not account for the noise arising from the random sampling nature of sequencing and from various experimental artifacts. Therefore, there is a need to investigate the statistical issues related to the quantification of methylation levels for these emerging technologies, with the goal of developing an accurate quantification method.

          Methods

          In this paper, we propose two methods for Methyl-Seq quantification. The first method, the Maximum Likelihood estimate, is both conceptually intuitive and computationally simple. However, this estimate is biased at extreme methylation levels and does not provide variance estimation. The second method, based on a Bayesian hierarchical model, allows variance estimation of methylation levels and provides a flexible framework for adjusting for technical bias in the sequencing process.
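The contrast between a point estimate and an estimate that carries its own variance can be illustrated with a minimal sketch. It rests on a deliberately simplified assumption: reads at a single site are treated as binomial draws, with `methylated` out of `depth` reads indicating methylation. This is not the model used in the paper (the abstract does not state the likelihood, and the conjugate model below is single-site, not hierarchical). The function names and the Beta(1, 1) prior are illustrative choices only.

```python
def ml_estimate(methylated, depth):
    """Maximum-likelihood estimate under the assumed binomial model."""
    if depth <= 0 or not 0 <= methylated <= depth:
        raise ValueError("need depth > 0 and 0 <= methylated <= depth")
    return methylated / depth


def beta_posterior(methylated, depth, a=1.0, b=1.0):
    """Posterior mean and variance under an assumed Beta(a, b) prior.

    The Beta prior is conjugate to the binomial, so the posterior is
    Beta(a + methylated, b + depth - methylated).
    """
    if depth < 0 or not 0 <= methylated <= depth:
        raise ValueError("need depth >= 0 and 0 <= methylated <= depth")
    post_a = a + methylated
    post_b = b + depth - methylated
    total = post_a + post_b
    mean = post_a / total
    variance = post_a * post_b / (total ** 2 * (total + 1))
    return mean, variance


# A site with 0 methylated reads out of 10 (hypothetical counts).
ml = ml_estimate(0, 10)
mean, variance = beta_posterior(0, 10)
print(f"ML estimate: {ml:.3f}")
print(f"Posterior mean: {mean:.3f}, posterior variance: {variance:.4f}")
```

In this toy setting, the ML estimate sits exactly on the boundary when no methylated reads are observed and gives no indication of its uncertainty, whereas the posterior summary is pulled away from the boundary and comes with a variance.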

          Results

          We compare the previously proposed binary method, the Maximum Likelihood (ML) method, and the Bayesian method. In both simulation and real-data analysis of Methyl-Seq data, the Bayesian method offers the most accurate quantification, and the ML method is only slightly less accurate. Both of our proposed methods outperform the original binary method in Methyl-Seq. In addition, we applied these quantification methods to simulated data and show that, with a sequencing depth per cleavage site above a threshold of 40–300 (the threshold varies across tissue samples), Methyl-Seq offers quantification consistency comparable to that of microarrays.
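To give a rough sense of why sequencing depth matters for consistency, the sketch below computes the standard error of an estimated proportion under pure binomial sampling. This is an assumption made for illustration only: it ignores the experimental artifacts and technical bias that the paper addresses, and it is not the simulation used by the authors. The function name and the example level of 0.5 are hypothetical.

```python
import math


def binomial_standard_error(level, depth):
    """Standard error of an estimated proportion under binomial sampling."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    if depth <= 0:
        raise ValueError("depth must be positive")
    return math.sqrt(level * (1.0 - level) / depth)


# Sampling noise at a true level of 0.5 for a few read depths.
for depth in (10, 40, 300):
    se = binomial_standard_error(0.5, depth)
    print(f"depth {depth:>3}: standard error {se:.3f}")
```

Because the standard error shrinks with the square root of depth, moving from 40 to 300 reads reduces sampling noise by a factor of about 2.7 in this simplified setting.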

          Most cited references (21)

          Principles and challenges of genomewide DNA methylation analysis.

          Methylation of cytosine bases in DNA provides a layer of epigenetic control in many eukaryotes that has important implications for normal biology and disease. Therefore, profiling DNA methylation across the genome is vital to understanding the influence of epigenetics. There has been a revolution in DNA methylation analysis technology over the past decade: analyses that previously were restricted to specific loci can now be performed on a genome-scale and entire methylomes can be characterized at single-base-pair resolution. However, there is such a diversity of DNA methylation profiling techniques that it can be challenging to select one. This Review discusses the different approaches and their relative merits and introduces considerations for data analysis.

            Genomic mapping by fingerprinting random clones: a mathematical analysis.

            Results from physical mapping projects have recently been reported for the genomes of Escherichia coli, Saccharomyces cerevisiae, and Caenorhabditis elegans, and similar projects are currently being planned for other organisms. In such projects, the physical map is assembled by first "fingerprinting" a large number of clones chosen at random from a recombinant library and then inferring overlaps between clones with sufficiently similar fingerprints. Although the basic approach is the same, there are many possible choices for the fingerprint used to characterize the clones and the rules for declaring overlap. In this paper, we derive simple formulas showing how the progress of a physical mapping project is affected by the nature of the fingerprinting scheme. Using these formulas, we discuss the analytic considerations involved in selecting an appropriate fingerprinting scheme for a particular project.

              Finding the fifth base: genome-wide sequencing of cytosine methylation.

              Complete sequences of myriad eukaryotic genomes, including several human genomes, are now available, and recent dramatic developments in DNA sequencing technology are opening the floodgates to vast volumes of sequence data. Yet, despite knowing for several decades that a significant proportion of cytosines in the genomes of plants and animals are present in the form of methylcytosine, until very recently the precise locations of these modified bases have never been accurately mapped throughout a eukaryotic genome. Advanced "next-generation" DNA sequencing technologies are now enabling the global mapping of this epigenetic modification at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes.

                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 15 June 2011
                Volume 6, issue 6, e21034
                Affiliations
                [1 ]Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
                [2 ]HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, United States of America
                Max Planck Institute for Evolutionary Anthropology, Germany
                Author notes

                Conceived and designed the experiments: DZ NY. Performed the experiments: GW. Analyzed the data: GW DZ. Contributed reagents/materials/analysis tools: DA. Wrote the paper: GW DZ DA.

                Article
                Manuscript number: PONE-D-11-01716
                DOI: 10.1371/journal.pone.0021034
                PMC ID: 3115964
                PubMed ID: 21698242
                Wu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 20 January 2011
                Accepted: 17 May 2011
                Page count
                Pages: 12
                Categories
                Research Article
                Biology
                Computational Biology
                Genomics
                Genome Sequencing
                Genetics
                Epigenetics
                DNA modification
