      Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned

      research-article


          Abstract

          It is well known, but frequently overlooked, that low- and high-throughput molecular data may contain batch effects, i.e., systematic technical variation. Confounding of experimental batches with the variable(s) of interest is especially concerning, as a batch effect may then be interpreted as a biologically significant finding. An integral step toward reducing false discovery in molecular data analysis is to inspect for batch effects and to account for this signal if present. In a 30-sample pilot Illumina Infinium HumanMethylation450 (450k array) experiment, we identified two sources of batch effects: row and chip. Here, we demonstrate two approaches taken to process the 450k data in which an R function, ComBat, was applied to adjust for the non-biological signal. In the “initial analysis,” the application of ComBat to an unbalanced study design resulted in 9,612 and 19,214 significant (FDR < 0.05) DNA methylation differences, despite none being present prior to correction. Because this dramatic change was suspicious, we undertook a “revised processing” that included changes to our analysis as well as a greater number of samples, and that successfully reduced batch effects without introducing false signal. Our work supports conclusions made by an article previously published in this journal: though the ultimate antidote to batch effects is thoughtful study design, every DNA methylation microarray analysis should inspect, assess and, if necessary, account for batch effects. The analysis experience presented here can serve as a reminder to the broader community to establish research questions a priori, ensure that they match the study design, and encourage communication between technicians and analysts.
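
          The abstract's warning about unbalanced designs can be made concrete with a simple check run before any batch correction. The sketch below is illustrative only and is not the authors' code: it cross-tabulates a batch variable against the variable of interest and flags batches in which a group is entirely absent, which is one symptom of the confounding described above. The function names, the chip labels, and the case/control labels are all hypothetical and do not come from the study.

```python
from collections import Counter


def batch_by_group_table(batches, groups):
    """Count the samples in each (batch, group) cell of the design."""
    if len(batches) != len(groups):
        raise ValueError("batches and groups must have the same length")
    counts = Counter(zip(batches, groups))
    batch_levels = sorted(set(batches))
    group_levels = sorted(set(groups))
    return {
        b: {g: counts.get((b, g), 0) for g in group_levels}
        for b in batch_levels
    }


def design_warnings(table):
    """Return (batch, missing_groups) pairs for batches lacking a group."""
    warnings = []
    for batch, row in table.items():
        missing = [g for g, n in row.items() if n == 0]
        if missing:
            warnings.append((batch, missing))
    return warnings


# Hypothetical sample sheet: 12 samples on 3 chips (NOT the study's data).
example_batches = ["chip1"] * 4 + ["chip2"] * 4 + ["chip3"] * 4
example_groups = ["case"] * 5 + ["control"] * 7

example_table = batch_by_group_table(example_batches, example_groups)
for batch, row in example_table.items():
    print(batch, row)
for batch, missing in design_warnings(example_table):
    print(f"{batch}: no samples from {missing}; batch and group are confounded")
```

          In this made-up layout, chip1 holds only cases and chip3 only controls, so chip and group cannot be separated. An empty cell is a prompt to revisit the design or interpret corrected results cautiously. It is not a formal test of balance.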


                Author and article information

                Journal
                Frontiers in Genetics (Front. Genet.)
                Frontiers Media S.A.
                1664-8021
                16 March 2018
                2018; 9: 83
                Affiliations
                [1] BC Children’s Hospital Research Institute, Vancouver, BC, Canada
                [2] Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
                [3] Department of Obstetrics and Gynaecology, University of British Columbia, Vancouver, BC, Canada
                Author notes

                Edited by: Patrick McGowan, University of Toronto, Canada

                Reviewed by: Jeffrey Mark Craig, Murdoch Children’s Research Institute, Australia; Jorg Tost, Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA), France

                *Correspondence: E. M. Price, magdaprice@ 123456gmail.com

                This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Genetics

                Article
                10.3389/fgene.2018.00083
                5864890
                29616078
                b3e2d33d-a56e-4188-87b6-292d685801ef
                Copyright © 2018 Price and Robinson.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 09 December 2017
                Accepted: 27 February 2018
                Page count
                Figures: 2, Tables: 0, Equations: 0, References: 42, Pages: 7, Words: 0
                Funding
                Funded by: Canadian Institutes of Health Research 10.13039/501100000024
                Award ID: FRN49520
                Categories
                Genetics
                Perspective

                Genetics
                dna methylation, 450k array, illumina, batch correction, batch effects, combat, ewas
