
      A two-step hierarchical hypothesis set testing framework, with applications to gene expression data on ordered categories

          Abstract

          Background

          In complex large-scale experiments, in addition to simultaneously considering a large number of features, multiple hypotheses are often being tested for each feature. This leads to a problem of multi-dimensional multiple testing. For example, in gene expression studies over ordered categories (such as time-course or dose-response experiments), interest is often in testing differential expression across several categories for each gene. In this paper, we consider a framework for testing multiple sets of hypotheses, which can be applied to a wide range of problems.

          Results

          We adopt the concept of the overall false discovery rate (OFDR) for controlling false discoveries on the hypothesis set level. Based on an existing procedure for identifying differentially expressed gene sets, we discuss a general two-step hierarchical hypothesis set testing procedure, which controls the overall false discovery rate under independence across hypothesis sets. In addition, we discuss the concept of the mixed-directional false discovery rate (mdFDR), and extend the general procedure to enable directional decisions for two-sided alternatives. We apply the framework to the case of microarray time-course/dose-response experiments, and propose three procedures for testing differential expression and making multiple directional decisions for each gene. Simulation studies confirm the control of the OFDR and mdFDR by the proposed procedures under independence and positive correlations across genes. Simulation results also show that two of our new procedures achieve higher power than previous methods. Finally, the proposed methodology is applied to a microarray dose-response study to identify 17β-estradiol-sensitive genes in breast cancer cells that are induced at low concentrations.

          Conclusions

          The framework we discuss provides a platform for multiple testing procedures covering situations involving two (or potentially more) sources of multiplicity. The framework is easy to use and adaptable to various practical settings that frequently occur in large-scale experiments. Procedures generated from the framework are shown to maintain control of the OFDR and mdFDR, quantities that are especially relevant in the case of multiple hypothesis set testing. The procedures work well in both simulations and real datasets, and are shown to have better power than existing methods.
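
The two-step idea described in the Results can be illustrated with a minimal sketch. The abstract does not fix the screening statistic or the within-set test, so the code below assumes a Bonferroni-combined p-value for screening each hypothesis set and a Bonferroni test inside each selected set; the function names are hypothetical and are not taken from the article. Directional decisions (the mdFDR extension) are not shown.

```python
def bonferroni_set_pvalue(pvals):
    """Screening p-value for one hypothesis set (assumed choice: Bonferroni)."""
    return min(1.0, len(pvals) * min(pvals))


def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up procedure; returns indices of rejected p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    n_reject = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= alpha * rank / m:
            n_reject = rank  # keep the largest rank that passes its threshold
    return set(order[:n_reject])


def two_step_test(pval_sets, alpha=0.05):
    """Step 1: screen hypothesis sets with BH applied to the set-level p-values.
    Step 2: within each of the R selected sets, test the individual hypotheses
    at the reduced level R * alpha / m (here with a Bonferroni split)."""
    m = len(pval_sets)
    screening = [bonferroni_set_pvalue(p) for p in pval_sets]
    selected = bh_reject(screening, alpha)
    r = len(selected)
    decisions = {}
    for i in sorted(selected):
        threshold = (r * alpha / m) / len(pval_sets[i])
        decisions[i] = [p <= threshold for p in pval_sets[i]]
    return decisions
```

For example, calling `two_step_test` on three genes with three p-values each returns a dictionary keyed by the index of each selected gene, with one boolean decision per within-gene hypothesis. With this pairing of screening and within-set tests, every selected set has at least one rejected hypothesis.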

          Most cited references 6

          Global analysis of ligand sensitivity of estrogen inducible and suppressible genes in MCF7/BUS breast cancer cells by DNA microarray.

          To obtain comprehensive information on 17beta-estradiol (E2) sensitivity of genes that are inducible or suppressible by this hormone, we designed a method that determines ligand sensitivities of large numbers of genes by using DNA microarray and a set of simple Perl computer scripts implementing the standard metric statistics. We used it to characterize effects of low (0-100 pM) concentrations of E2 on the transcriptome profile of MCF7/BUS human breast cancer cells, whose E2 dose-dependent growth curve saturated with 100 pM E2. Evaluation of changes in mRNA expression for all genes covered by the DNA microarray indicated that, at a very low concentration (10 pM), E2 suppressed approximately 3-5 times larger numbers of genes than it induced, whereas at higher concentrations (30-100 pM) it induced approximately 1.5-2 times more genes than it suppressed. Using clearly defined statistical criteria, E2-inducible genes were categorized into several classes based on their E2 sensitivities. This approach of hormone sensitivity analysis revealed that expression of two previously reported E2-inducible autocrine growth factors, transforming growth factor alpha and stromal cell-derived factor 1, was not affected by 100 pM and lower concentrations of E2 but strongly enhanced by 10 nM E2, which was far higher than the concentration that saturated the E2 dose-dependent growth curve of MCF7/BUS cells. These observations suggested that biological actions of E2 are derived from expression of multiple genes whose E2 sensitivities differ significantly and, hence, depend on the E2 concentration, especially when it is lower than the saturating level, emphasizing the importance of characterizing the ligand dose-dependent aspects of E2 actions.

            The Simes Method for Multiple Hypothesis Testing with Positively Dependent Test Statistics

              Controlling false discoveries in multidimensional directional decisions, with applications to gene expression data on ordered categories.

              Microarray gene expression studies over ordered categories are routinely conducted to gain insights into biological functions of genes and the underlying biological processes. Some common experiments are time-course/dose-response experiments where a tissue or cell line is exposed to different doses and/or durations of time to a chemical. A goal of such studies is to identify gene expression patterns/profiles over the ordered categories. This problem can be formulated as a multiple testing problem where for each gene the null hypothesis of no difference between the successive mean gene expressions is tested and further directional decisions are made if it is rejected. Many of the existing multiple testing procedures are devised for controlling the usual false discovery rate (FDR) rather than the mixed directional FDR (mdFDR), the expected proportion of Type I and directional errors among all rejections. Benjamini and Yekutieli (2005, Journal of the American Statistical Association 100, 71-93) proved that an augmentation of the usual Benjamini-Hochberg (BH) procedure can control the mdFDR while testing simple null hypotheses against two-sided alternatives in terms of one-dimensional parameters. In this article, we consider the problem of controlling the mdFDR involving multidimensional parameters. To deal with this problem, we develop a procedure extending that of Benjamini and Yekutieli based on the Bonferroni test for each gene. A proof is given for its mdFDR control when the underlying test statistics are independent across the genes. The results of a simulation study evaluating its performance under independence as well as under dependence of the underlying test statistics across the genes relative to other relevant procedures are reported. Finally, the proposed methodology is applied to time-course microarray data obtained by Lobenhofer et al. (2002, Molecular Endocrinology 16, 1215-1229). We identified several important cell-cycle genes, such as DNA replication/repair gene MCM4 and replication factor subunit C2, which were not identified by the previous analyses of the same data by Lobenhofer et al. (2002) and Peddada et al. (2003, Bioinformatics 19, 834-841). Although some of our findings overlap with previous findings, we identify several other genes that complement the results of Lobenhofer et al. (2002).

                Author and article information

                Contributors
                Journal
                BMC Bioinformatics
                BioMed Central
                1471-2105
                2014
                14 April 2014
                Volume 15, article 108
                Affiliations
                [1] Department of Statistics, Pennsylvania State University, University Park, State College, Pennsylvania 16802, USA
                Article
                1471-2105-15-108
                10.1186/1471-2105-15-108
                4000433
                24731138
                Copyright © 2014 Li and Ghosh; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

                Categories
                Methodology Article
