GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

      Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. Assigning functional information to these complex patterns remains a challenging task, both in interpreting data effectively and in correlating results across experiments, projects and laboratories. Methods that allow the rapid and robust evaluation of multiple functional hypotheses increase the power of individual researchers to mine gene expression data more efficiently.

      Results: We have developed gene set matrix analysis (GSMA) as a method for rapidly testing group-wise up- or down-regulation of gene expression simultaneously for multiple lists of genes (gene sets) against entire distributions of gene expression changes (datasets) for single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets, whether related by known biological function or designated solely by the end-user, against large numbers of datasets simultaneously.

      Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.
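
The idea of testing several gene sets against several datasets at once can be illustrated with a minimal sketch. The abstract does not state which statistic GSMA uses, so the code below assumes a simple parametric z-score: the mean expression change of a gene set is compared with the mean and standard deviation of the whole dataset. The function names (`gene_set_z`, `gene_set_matrix`), the gene identifiers and the values are all hypothetical and are not taken from the article.

```python
import math
from statistics import mean, pstdev


def gene_set_z(dataset, gene_set):
    """Z-score of one gene set against one dataset (assumed statistic).

    dataset:  dict mapping gene id -> expression change (e.g. a log ratio)
    gene_set: iterable of gene ids
    Returns None when no gene of the set is present in the dataset or
    when the dataset has zero variance.
    """
    values = list(dataset.values())
    mu = mean(values)        # mean of the entire distribution
    sigma = pstdev(values)   # population standard deviation
    members = [dataset[g] for g in gene_set if g in dataset]
    if not members or sigma == 0:
        return None
    # Deviation of the set mean, scaled by the set size
    return (mean(members) - mu) * math.sqrt(len(members)) / sigma


def gene_set_matrix(datasets, gene_sets):
    """Build a gene-set x dataset matrix of z-scores as nested dicts."""
    return {
        set_name: {
            ds_name: gene_set_z(ds, genes) for ds_name, ds in datasets.items()
        }
        for set_name, genes in gene_sets.items()
    }


if __name__ == "__main__":
    # Hypothetical expression changes for two experiments
    datasets = {
        "experiment_1": {"g1": 1.8, "g2": 2.1, "g3": -0.2, "g4": 0.1, "g5": -1.9},
        "experiment_2": {"g1": 0.1, "g2": -0.3, "g3": 0.2, "g4": 1.5, "g5": 1.7},
    }
    # Hypothetical gene sets, e.g. user-defined or from a pathway database
    gene_sets = {"set_A": ["g1", "g2"], "set_B": ["g4", "g5"]}

    for set_name, row in gene_set_matrix(datasets, gene_sets).items():
        print(set_name, row)
```

Positive scores suggest group-wise up-regulation of a set in a dataset and negative scores suggest down-regulation. A real analysis would also need significance estimates and correction for multiple testing, which this sketch leaves out.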

      Most cited references 31

      Cluster analysis and display of genome-wide expression patterns.

      A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.
        PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes.

        DNA microarrays can be used to identify gene expression changes characteristic of human disease. This is challenging, however, when relevant differences are subtle at the level of individual genes. We introduce an analytical strategy, Gene Set Enrichment Analysis, designed to detect modest but coordinate changes in the expression of groups of functionally related genes. Using this approach, we identify a set of genes involved in oxidative phosphorylation whose expression is coordinately decreased in human diabetic muscle. Expression of these genes is high at sites of insulin-mediated glucose disposal, activated by PGC-1alpha and correlated with total-body aerobic capacity. Our results associate this gene set with clinically important variation in human metabolism and illustrate the value of pathway relationships in the analysis of genomic profiling experiments.
          DAVID: Database for Annotation, Visualization, and Integrated Discovery.

          Functional annotation of differentially expressed genes is a necessary and critical step in the analysis of microarray data. The distributed nature of biological knowledge frequently requires researchers to navigate through numerous web-accessible databases gathering information one gene at a time. A more judicious approach is to provide query-based access to an integrated database that disseminates biologically rich information across large datasets and displays graphic summaries of functional information. Database for Annotation, Visualization, and Integrated Discovery (DAVID) addresses this need via four web-based analysis modules: 1) Annotation Tool - rapidly appends descriptive data from several public databases to lists of genes; 2) GoCharts - assigns genes to Gene Ontology functional categories based on user-selected classifications and term specificity level; 3) KeggCharts - assigns genes to KEGG metabolic processes and enables users to view genes in the context of biochemical pathway maps; and 4) DomainCharts - groups genes according to PFAM conserved protein domains. Analysis results and graphical displays remain dynamically linked to primary data and external data repositories, thereby furnishing in-depth as well as broad-based data coverage. The functionality provided by DAVID accelerates the analysis of genome-scale datasets by facilitating the transition from data collection to biological meaning.

            Author and article information

            [1] Genomics Core, Division of Allergy and Clinical Immunology, School of Medicine, Johns Hopkins University, 5200 Eastern Avenue, Baltimore, MD 21224
            [2] University of Rochester School of Medicine and Dentistry, Division of Pulmonary and Critical Care Medicine, Rochester, New York, U.S.A.
            [3] Division of Rheumatology, School of Medicine, Johns Hopkins University, 5200 Eastern Avenue, Baltimore, MD 21224
            Author notes
            Correspondence: Chris Cheadle, Ph.D., CCR/NCI/NIH, Basic Research Laboratory-Bethesda, Cellular Biochemistry Section, Bldg. 10, Rm. 5B05, 9000 Rockville Pike, Bethesda MD 20892. Tel: 301-435-2004; Fax: 301-480-8587; Email: cheadlec@
            Bioinform Biol Insights
            Bioinformatics and Biology Insights
            Libertas Academica
            24 November 2009
            Volume: 1
            Pages: 49-62

            This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (

            Original Research

            Bioinformatics & Computational biology

