Establishing an adjusted p-value threshold to control the family-wide type 1 error in genome wide association studies

Abstract

Background

By assaying hundreds of thousands of single nucleotide polymorphisms (SNPs), genome-wide association studies (GWAS) allow for a powerful, unbiased review of the entire genome to localize common genetic variants that influence health and disease. Although it is widely recognized that some correction for multiple testing is necessary to control the family-wide Type 1 error in genetic association studies, it is not clear which method to use. One simple approach is to perform a Bonferroni correction using all n SNPs across the genome; however, this approach is highly conservative and would "overcorrect" for SNPs that are not truly independent. Many SNPs fall within regions of strong linkage disequilibrium (LD), known as "blocks", and should not be considered "independent".
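To make the multiple-testing problem concrete, the following is a minimal sketch of the plain Bonferroni correction, not the authors' code. The family-wide alpha of 0.05, the illustrative panel size of 500,000 SNPs, and both function names are assumptions chosen for the example, and the family-wide error formula assumes fully independent tests.

```python
def familywise_error_rate(per_test_alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n_tests independent tests."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wide error at or below family_alpha."""
    return family_alpha / n_tests


# Assumed values for illustration only.
family_alpha = 0.05
n_snps = 500_000

uncorrected = familywise_error_rate(family_alpha, n_snps)
threshold = bonferroni_threshold(family_alpha, n_snps)
corrected = familywise_error_rate(threshold, n_snps)

print(f"family-wide error without correction: {uncorrected:.3f}")
print(f"Bonferroni per-SNP threshold:         {threshold:.2e}")
print(f"family-wide error with correction:    {corrected:.4f}")
```

Because the threshold shrinks in proportion to the number of tests, counting correlated SNPs as if they were independent makes the per-SNP threshold stricter than it needs to be.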

Results

We proposed to approximate the number of "independent" SNPs by counting 1 SNP per LD block, plus all SNPs outside of blocks (interblock SNPs). We examined the effective number of independent SNPs for GWAS panels. In the CEPH Utah (CEU) population, by considering the interdependence of SNPs, we could reduce the total number of effective tests within the Affymetrix and Illumina SNP panels from 500,000 and 317,000 to 67,000 and 82,000 "independent" SNPs, respectively. For the Affymetrix 500 K and Illumina 317 K GWAS SNP panels we recommend using 10^-5, 10^-7 and 10^-8, and for the Phase II HapMap CEPH Utah and Yoruba populations we recommend using 10^-6, 10^-7 and 10^-9, as "suggestive", "significant" and "highly significant" p-value thresholds to properly control the family-wide Type 1 error.
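The counting rule described above (one SNP per LD block, plus every interblock SNP) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the block labels, the toy data, the function names, and the family-wide alpha of 0.05 are all hypothetical, and how LD blocks are defined in the first place is outside the scope of the sketch.

```python
from typing import Optional, Sequence


def effective_snp_count(block_ids: Sequence[Optional[str]]) -> int:
    """Count one test per LD block plus one test per interblock SNP.

    block_ids[i] is a hypothetical label for the LD block containing SNP i,
    or None when the SNP lies outside every block.
    """
    blocks = {b for b in block_ids if b is not None}
    interblock = sum(1 for b in block_ids if b is None)
    return len(blocks) + interblock


def ld_adjusted_threshold(family_alpha: float, block_ids: Sequence[Optional[str]]) -> float:
    """Bonferroni threshold computed over the effective number of SNPs."""
    return family_alpha / effective_snp_count(block_ids)


# Toy panel: 8 SNPs, two LD blocks ("A", "B") and three interblock SNPs.
toy_panel = ["A", "A", "A", None, "B", "B", None, None]

n_eff = effective_snp_count(toy_panel)  # 2 blocks + 3 interblock SNPs = 5
print(f"SNPs genotyped: {len(toy_panel)}, effective tests: {n_eff}")
print(f"naive threshold:       {0.05 / len(toy_panel):.4f}")
print(f"LD-adjusted threshold: {ld_adjusted_threshold(0.05, toy_panel):.4f}")
```

As a rough check on scale, with an assumed family-wide alpha of 0.05, 67,000 effective tests would give a threshold of about 7.5 × 10^-7, compared with 1 × 10^-7 for 500,000 tests. The abstract does not state which alpha levels underlie the recommended thresholds, so these figures are illustrative only.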

Conclusion

By approximating the effective number of independent SNPs across the genome we are able to "correct" for a more accurate number of tests and therefore develop "LD adjusted" Bonferroni corrected p-value thresholds that account for the interdependence of SNPs on well-utilized commercially available SNP "chips". These thresholds will serve as guides to researchers trying to decide which regions of the genome should be studied further.

Author and article information

Journal

Journal ID (nlm-ta): BMC Genomics

Title: BMC Genomics

Publisher: BioMed Central

ISSN (Electronic): 1471-2164

Publication date (Collection): 2008

Publication date (Electronic): 31 October 2008

Volume: 9

Page: 516

Affiliations

[1] Statistical Genetics Section, Inherited Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, MD, USA

Article

Publisher ID: 1471-2164-9-516

DOI: 10.1186/1471-2164-9-516

PMC ID: 2621212

PubMed ID: 18976480

SO-VID: a3b11db2-c304-41f7-91d0-3d8d1afa2a38

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
