A weighted FDR procedure under discrete and heterogeneous null distributions

Abstract

For multiple testing (MT) with discrete and heterogeneous null distributions with finite supports, we propose a generalized estimator of the proportion of true nulls, a new divergence to measure how close two càdlàg (i.e., right-continuous with left limits) functions are, and a grouping mechanism that uses the divergence to group hypotheses according to the similarity between their associated null distributions. Further, we integrate these new proposals with a p-value weighting strategy to induce a weighted false discovery rate (FDR) procedure for MT with such distributions. Our weighted FDR procedure effectively adapts to the heterogeneity and discreteness of the null p-value distributions, and is also applicable to MT based on p-values that are uniformly distributed under the null. Theoretically, we provide conditions under which our procedure does not make fewer rejections than the Benjamini-Hochberg (BH) procedure at the same FDR level when applied to MT with p-values that have discrete and heterogeneous null distributions. Through simulation studies on MT based on the two-sided p-values of the binomial test, Fisher's exact test, and the exact negative binomial test, we demonstrate that our procedure is much more powerful than the BH procedure and Storey's procedure at the same FDR level in all three settings. We apply our FDR procedure to a differential gene expression study and a differential methylation study, both based on discrete next-generation sequencing (NGS) data with discrete and heterogeneous null p-value distributions, where it makes significantly more discoveries than other methods.
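To make the comparison with the BH procedure concrete, the following is a minimal sketch of the standard BH step-up rule and of a generic weighted variant that applies BH to the ratios p_i / w_i. It is not the procedure proposed in the article: in the article the weights are derived from the divergence-based grouping and the generalized estimator of the proportion of true nulls, neither of which is implemented here. The function names, the example p-values, and the example weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with alpha * i / m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting its threshold
        reject[order[: k + 1]] = True   # reject everything up to that rank
    return reject


def weighted_bh_reject(pvals, weights, alpha=0.05):
    """Generic weighted BH: run BH on p_i / w_i with positive weights of mean 1.

    Illustrative only; the weights here are supplied by the caller and are
    NOT computed as in the article.
    """
    p = np.asarray(pvals, dtype=float)
    w = np.asarray(weights, dtype=float)
    if np.any(w <= 0):
        raise ValueError("weights must be strictly positive")
    w = w * w.size / w.sum()  # normalise so the weights average to 1
    return bh_reject(np.minimum(p / w, 1.0), alpha)


if __name__ == "__main__":
    # Hypothetical p-values and weights, for illustration only.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    weights = [1, 1, 3, 1, 1, 0.4, 0.3, 0.3]
    print("BH rejections:         ", bh_reject(pvals).sum())
    print("Weighted BH rejections:", weighted_bh_reject(pvals, weights).sum())
```

With equal weights the weighted rule reduces to plain BH, so any gain in rejections comes entirely from how the weights are chosen, which is where the article's grouping and estimation steps enter.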

Author and article information

Publication date Created: 2015-02-03

Publication date Updated: 2015-10-18

Article

arXiv ID: 1502.00973

SO-VID: 15f50f3e-53cc-433f-b525-e59e9e6bdbce

License:

http://creativecommons.org/licenses/by-sa/4.0/

Custom metadata

ACM classes Primary 62C25, 62M99

Comments 39 pages, 7 figures; this version integrates results from the manuscript arXiv:1410.4274, with an extended Introduction and an explanation of why it is very hard to obtain better non-asymptotic theoretical results than those given in the paper in the general settings

Categories stat.ME

ScienceOpen disciplines: Methodology

Related collections

Open source discrete and agent-based modeling frameworks for biology
