      Detection and accurate false discovery rate control of differentially methylated regions from whole genome bisulfite sequencing

      1 , 2 , 3 , 1
      Biostatistics
      Oxford University Press (OUP)

          Abstract

          With recent advances in sequencing technology, it is now feasible to measure DNA methylation at tens of millions of sites across the entire genome. In most applications, biologists are interested in detecting differentially methylated regions, composed of multiple sites with differing methylation levels among populations. However, current computational approaches for detecting such regions do not provide accurate statistical inference. A major challenge in reporting uncertainty is that a genome-wide scan is involved in detecting these regions, which needs to be accounted for. A further challenge is that sample sizes are limited due to the costs associated with the technology. We have developed a new approach that overcomes these challenges and assesses uncertainty for differentially methylated regions in a rigorous manner. Region-level statistics are obtained by fitting a generalized least squares regression model with a nested autoregressive correlated error structure for the effect of interest on transformed methylation proportions. We develop an inferential approach, based on a pooled null distribution, that can be implemented even when as few as two samples per population are available. Here, we demonstrate the advantages of our method using both experimental data and Monte Carlo simulation. We find that the new method improves the specificity and sensitivity of lists of regions and accurately controls the false discovery rate.
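The region-level statistic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain AR(1) correlation between adjacent sites within each sample (rather than the nested structure the abstract mentions), a fixed hypothetical correlation `rho`, and an arcsine-type transform of the proportions. The function names `ar1_correlation` and `region_statistic` are illustrative only.

```python
import numpy as np


def ar1_correlation(n_sites, rho):
    """AR(1) correlation matrix: corr(i, j) = rho ** |i - j|."""
    idx = np.arange(n_sites)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def region_statistic(props, group, rho=0.5):
    """GLS Wald-type statistic for a group effect within one candidate region.

    props: (n_samples, n_sites) methylation proportions in [0, 1]
    group: (n_samples,) 0/1 population labels
    rho:   assumed correlation between adjacent sites (hypothetical value)
    """
    props = np.asarray(props, dtype=float)
    group = np.asarray(group, dtype=float)
    n_samples, n_sites = props.shape

    # Arcsine-type transform of the proportions (an assumption of this sketch).
    # ravel() flattens in sample-major order: all sites of sample 0, then sample 1, ...
    y = np.arcsin(2.0 * props - 1.0).ravel()

    # Design matrix: one intercept per site plus a shared group effect (last column).
    site_dummies = np.tile(np.eye(n_sites), (n_samples, 1))
    effect = np.repeat(group, n_sites)[:, None]
    X = np.hstack([site_dummies, effect])

    # Block-diagonal correlation: sites are correlated within a sample,
    # samples are treated as independent.
    V = np.kron(np.eye(n_samples), ar1_correlation(n_sites, rho))
    V_inv = np.linalg.inv(V)

    # Generalized least squares estimate and its estimated covariance.
    xtvx = X.T @ V_inv @ X
    beta = np.linalg.solve(xtvx, X.T @ V_inv @ y)
    resid = y - X @ beta
    dof = y.size - X.shape[1]
    sigma2 = (resid @ V_inv @ resid) / dof
    cov_beta = sigma2 * np.linalg.inv(xtvx)

    # Standardized group effect for the region.
    return beta[-1] / np.sqrt(cov_beta[-1, -1])
```

In the approach the abstract outlines, statistics of this kind would then be compared against a pooled null distribution to assess significance; that step is not shown here.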

          Most cited references (21)

          Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies.

          During the past 5 years, high-throughput technologies have been successfully used by epidemiology studies, but almost all have focused on sequence variation through genome-wide association studies (GWAS). Today, the study of other genomic events is becoming more common in large-scale epidemiological studies. Many of these, unlike the single-nucleotide polymorphism studied in GWAS, are continuous measures. In this context, the exercise of searching for regions of interest for disease is akin to the problems described in the statistical 'bump hunting' literature. New statistical challenges arise when the measurements are continuous rather than categorical, when they are measured with uncertainty, and when both biological signal, and measurement errors are characterized by spatial correlation along the genome. Perhaps the most challenging complication is that continuous genomic data from large studies are measured throughout long periods, making them susceptible to 'batch effects'. An example that combines all three characteristics is genome-wide DNA methylation measurements. Here, we present a data analysis pipeline that effectively models measurement error, removes batch effects, detects regions of interest and attaches statistical uncertainty to identified regions. We illustrate the usefulness of our approach by detecting genomic regions of DNA methylation associated with a continuous trait in a well-characterized population of newborns. Additionally, we show that addressing unexplained heterogeneity like batch effects reduces the number of false-positive regions. Our framework offers a comprehensive yet flexible approach for identifying genomic regions of biological interest in large epidemiological studies using quantitative high-throughput methods.

            metilene: fast and sensitive calling of differentially methylated regions from bisulfite sequencing data

            The detection of differentially methylated regions (DMRs) is a necessary prerequisite for characterizing different epigenetic states. We present a novel program, metilene, to identify DMRs within whole-genome and targeted data with unrivaled specificity and sensitivity. A binary segmentation algorithm combined with a two-dimensional statistical test allows the detection of DMRs in large methylation experiments with multiple groups of samples in minutes rather than days using off-the-shelf hardware. metilene outperforms other state-of-the-art tools for low coverage data and can estimate missing data. Hence, metilene is a versatile tool to study the effect of epigenetic modifications in differentiation/development, tumorigenesis, and systems biology on a global, genome-wide level. Whether in the framework of international consortia with dozens of samples per group, or even without biological replicates, it produces highly significant and reliable results.

              Differential methylation analysis for BS-seq data under general experimental design.

              DNA methylation is an epigenetic modification with important roles in many biological processes and diseases. Bisulfite sequencing (BS-seq) has emerged recently as the technology of choice to profile DNA methylation because of its accuracy, genome coverage and higher resolution. Current statistical methods to identify differential methylation mainly focus on comparing two treatment groups. With an increasing number of experiments performed under a general and multiple-factor design, particularly in reduced representation bisulfite sequencing, there is a need to develop more flexible, powerful and computationally efficient methods.

                Author and article information

                Journal
                Biostatistics
                Oxford University Press (OUP)
                1465-4644
                1468-4357
                July 2019
                July 01 2019
                February 22 2018
                Volume: 20
                Issue: 3
                Pages: 367-383
                Affiliations
                [1 ]Department of Biostatistics & Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA, USA and Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, USA
                [2 ]Novartis, Inorbit Mall Rd, Silpa Gram Craft Village, HITEC City, Hyderabad, Telangana, India
                [3 ]The Statistics Department, Hebrew University, Mount Scopus, Jerusalem, Israel
                Article
                10.1093/biostatistics/kxy007
                6587918
                29481604
                © 2018

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
