Statistical significance for genomewide studies

Abstract

With the increase in genomewide experiments and the sequencing of multiple genomes, the analysis of large data sets has become commonplace in biology. It is often the case that thousands of features in a genomewide data set are tested against some null hypothesis, where a number of features are expected to be significant. Here we propose an approach to measuring statistical significance in these genomewide studies based on the concept of the false discovery rate. This approach offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted. In doing so, a measure of statistical significance called the q value is associated with each tested feature. The q value is similar to the well-known p value, except that it is a measure of significance in terms of the false discovery rate rather than the false positive rate. Our approach avoids a flood of false positive results, while offering a more liberal criterion than what has been used in genome scans for linkage.
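To make the idea of a q value concrete, the following is a minimal Python sketch of how q values might be computed from a list of p values. It is an illustration under stated assumptions, not the procedure as specified in the article: the function name `q_values` and the parameter `lam` are hypothetical, and the proportion of true null features (`pi0`) is estimated here from a single fixed cutoff, which is a simplification. With `pi0` equal to 1 the result reduces to Benjamini-Hochberg adjusted p values.

```python
import numpy as np

def q_values(p_values, lam=0.5):
    """Sketch of q-value estimation from p-values (illustrative only).

    `lam` is an assumed tuning cutoff: p-values above it are treated as
    coming mostly from true null features.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size

    # Assumed estimate of the proportion of true nulls, capped at 1.
    pi0 = min(1.0, np.sum(p > lam) / (m * (1.0 - lam)))
    if pi0 <= 0.0:
        # No p-values above the cutoff: fall back to the conservative value.
        pi0 = 1.0

    # Estimated false discovery rate when calling the k smallest p-values
    # significant: pi0 * m * p_(k) / k.
    order = np.argsort(p)
    ranked = p[order]
    fdr = pi0 * m * ranked / np.arange(1, m + 1)

    # The q-value is the smallest estimated FDR over all thresholds at or
    # above the feature's own p-value (running minimum from the largest p).
    q_sorted = np.minimum.accumulate(fdr[::-1])[::-1]

    # Return q-values in the original feature order, capped at 1.
    q = np.empty(m)
    q[order] = np.minimum(q_sorted, 1.0)
    return q

if __name__ == "__main__":
    example = [0.001, 0.8, 0.004, 0.9, 0.002, 0.003]
    print(q_values(example))
```

Calling features significant when their q value is at most 0.05 then targets a false discovery rate of roughly 5% among the called features, rather than a 5% false positive rate among all truly null features.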

Author and article information

Journal

Title: Proceedings of the National Academy of Sciences

Abbreviated Title: Proceedings of the National Academy of Sciences

Publisher: Proceedings of the National Academy of Sciences

ISSN (Print): 0027-8424

ISSN (Electronic): 1091-6490

Publication date Created: May 01 2011

Publication date Created: August 05 2003

Publication date (Electronic): July 25 2003

Publication date (Print): August 05 2003

Volume: 100

Issue: 16

Pages: 9440-9445

Article

DOI: 10.1073/pnas.1530509100

PMC ID: 170937

PubMed ID: 12883005

SO-VID: a94f6bd8-a8f1-46d1-96e9-5467796e76a7


Cited by 3,354