Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

, , *

Nucleic Acids Research

Oxford University Press


      Abstract

      Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view regarding the advantages, pitfalls and recent trends in a simpler tool-class level rather than by a tool-by-tool approach. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.
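      As an illustration only: one widely used form of gene-annotation enrichment analysis asks whether an annotation term appears in a user's gene list more often than chance would predict, given its frequency in a background population. The abstract does not specify the algorithms of the surveyed tools, so the sketch below is a generic upper-tail hypergeometric test, not the method of any particular tool. The function name, parameter names and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(list_hits, list_size, pop_hits, pop_size):
    """Upper-tail hypergeometric probability P(X >= list_hits).

    list_hits: genes in the user's list annotated with the term
    list_size: total genes in the user's list
    pop_hits:  genes in the background annotated with the term
    pop_size:  total genes in the background
    (All four names are illustrative assumptions.)
    """
    # Number of ways to draw a list of this size from the background.
    total = comb(pop_size, list_size)
    # The list cannot contain more annotated genes than exist or than fit.
    max_hits = min(list_size, pop_hits)
    # Sum the probability mass for every outcome at least as extreme.
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20 of 300 list genes carry a term that
    # annotates 400 of 20000 background genes.
    p = enrichment_p_value(20, 300, 400, 20000)
    print(f"enrichment p-value: {p:.3g}")
```

      A small p-value suggests the term is over-represented in the list. In practice many terms are tested at once, so some multiple-testing correction would normally follow.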

            Author and article information

            Affiliations
            Laboratory of Immunopathogenesis and Bioinformatics, Clinical Services Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, Frederick, MD 21702, USA
            Author notes
            *To whom correspondence should be addressed. Tel: +1 301 846 5093; Fax: +1 301 846 6762; Email: rlempicki@mail.nih.gov

            The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

            Journal
            Title: Nucleic Acids Research (Nucleic Acids Res)
            Publisher: Oxford University Press
            ISSN: 0305-1048 (print), 1362-4962 (electronic)
            Issue date: January 2009
            Published online: 25 November 2008
            Volume: 37
            Issue: 1
            Pages: 1-13
            PMID: 19033363
            PMC ID: 2615629
            DOI: 10.1093/nar/gkn923
            Publisher ID: gkn923
            © 2008 The Author(s)

            This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

            Categories
            Survey and Summary

            Genetics
