      clusterProfiler 4.0: A universal enrichment tool for interpreting omics data

      research-article

          Summary

          Functional enrichment analysis is pivotal for interpreting high-throughput omics data in life science. It is crucial for this type of tool to use the latest annotation databases for as many organisms as possible. To meet these requirements, we present here an updated version of our popular Bioconductor package, clusterProfiler 4.0. This package has been enhanced considerably compared with its original version published 9 years ago. The new version provides a universal interface for functional enrichment analysis in thousands of organisms based on internally supported ontologies and pathways as well as annotation data provided by users or derived from online databases. It also extends the dplyr and ggplot2 packages to offer tidy interfaces for data operation and visualization. Other new features include gene set enrichment analysis and comparison of enrichment results from multiple gene lists. We anticipate that clusterProfiler 4.0 will be applied to a wide range of scenarios across diverse organisms.
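The gene set enrichment analysis mentioned in the summary rests on a running-sum statistic over a ranked gene list. The following is a minimal Python sketch of that classic statistic, not the package's own implementation or API; the function name `enrichment_score`, the `weight` parameter, and the toy gene IDs and scores are all assumptions made for illustration.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set (illustrative sketch).

    ranked_genes: gene IDs sorted by decreasing ranking metric
    scores:       ranking metric for each gene, in the same order
    gene_set:     set of gene IDs to test
    weight:       exponent applied to |score| for genes in the set
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Genes in the set contribute in proportion to |score| ** weight.
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all genes in the set have a zero score")

    running = 0.0
    best = 0.0
    for w, hit in zip(hit_weights, in_set):
        if hit:
            running += w / total_hit_weight   # step up on a hit
        else:
            running -= 1.0 / n_miss           # step down on a miss
        if abs(running) > abs(best):          # keep the largest deviation from zero
            best = running
    return best


# Hypothetical toy data: six genes ranked by a made-up score.
ranked_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

top_es = enrichment_score(ranked_genes, scores, {"g1", "g2"})
bottom_es = enrichment_score(ranked_genes, scores, {"g5", "g6"})
print(top_es, bottom_es)
```

A set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom a negative score. Assessing significance would additionally require a permutation-based null distribution, which this sketch omits.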

          Public summary

          • clusterProfiler supports exploring functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation

          • It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios

          • It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation

          • Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions
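As a rough illustration of comparing several gene lists in one run, the sketch below applies a hypergeometric over-representation test to every pair of gene list and gene set. It is written in Python with the standard library only and is not the clusterProfiler interface; the function names `hypergeom_pvalue` and `compare_gene_lists`, the list and pathway names, and the gene IDs are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes,
    K of which belong to the gene set."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def compare_gene_lists(gene_lists, gene_sets, background):
    """Test every gene list against every gene set.

    Returns rows of (list_name, set_name, overlap_size, p_value).
    """
    rows = []
    for list_name, genes in gene_lists.items():
        genes = set(genes) & background            # ignore genes outside the background
        for set_name, members in gene_sets.items():
            members = set(members) & background
            overlap = len(genes & members)
            p_value = hypergeom_pvalue(overlap, len(genes), len(members), len(background))
            rows.append((list_name, set_name, overlap, p_value))
    return rows


# Hypothetical toy data: 20 background genes, two pathways, two treatments.
background = {f"g{i}" for i in range(1, 21)}
gene_sets = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g11", "g12", "g13", "g14"},
}
gene_lists = {
    "treatment_1": {"g1", "g2", "g3", "g15"},
    "treatment_2": {"g11", "g12", "g13", "g16", "g17"},
}

for row in compare_gene_lists(gene_lists, gene_sets, background):
    print(row)
```

In a real analysis the p-values would also be adjusted for multiple testing across gene sets; that step is left out here to keep the example short.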

                Author and article information

                Contributors
                Journal
                Innovation (N Y)
                The Innovation
                Elsevier
                2666-6758
                01 July 2021
                28 August 2021
                01 July 2021
                Volume: 2
                Issue: 3
                Article number: 100141
                Affiliations
                [1 ]Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
                [2 ]Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
                [3 ]Guangdong Provincial Key Laboratory of Proteomics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
                [4 ]Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou 510515, China
                Author notes
                [∗ ]Corresponding author: boxc@bmi.ac.cn
                [∗∗ ]Corresponding author: gcyu1@smu.edu.cn
                [5] These authors contributed equally

                Article
                PII: S2666-6758(21)00066-7
                Article number: 100141
                DOI: 10.1016/j.xinn.2021.100141
                PMCID: PMC8454663
                PMID: 34557778
                © 2021 The Author(s)

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                Received: 8 May 2021
                Accepted: 29 June 2021
                Categories
                Article

                Keywords: clusterProfiler, biological knowledge mining, functional analysis, enrichment analysis, visualization