      Nine quick tips for pathway enrichment analysis

      research-article
      1 , * , , 2
      PLoS Computational Biology
      Public Library of Science

          Abstract

          Pathway enrichment analysis (PEA) is a computational biology method that identifies biological functions that are overrepresented in a group of genes more than would be expected by chance and ranks these functions by relevance. The relative abundance of genes pertinent to specific pathways is measured through statistical methods, and associated functional pathways are retrieved from online bioinformatics databases. In the last decade, along with the spread of the internet, higher availability of computational resources made PEA software tools easy to access and to use for bioinformatics practitioners worldwide. Although it became easier to use these tools, it also became easier to make mistakes that could generate inflated or misleading results, especially for beginners and inexperienced computational biologists. With this article, we propose nine quick tips to avoid common mistakes and to carry out a complete, sound, thorough PEA, which can produce relevant and robust results. We describe our nine guidelines in a simple way, so that they can be understood and used by anyone, including students and beginners. Some tips explain what to do before starting a PEA, others are suggestions of how to correctly generate meaningful results, and some final guidelines indicate some useful steps to properly interpret PEA results. Our nine tips can help users perform better pathway enrichment analyses and eventually contribute to a better understanding of current biology.
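The abstract describes PEA as measuring whether pathway genes are overrepresented in a gene list beyond chance. One common way to do this is a one-sided hypergeometric (over-representation) test; the sketch below illustrates that idea and is not taken from the article. The gene identifiers, pathway names, and the helper functions `hypergeom_sf` and `enrich` are all hypothetical, chosen only for illustration.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich(study_genes, background, pathways):
    """Return (pathway, overlap, p_value) tuples, most significant first."""
    background = set(background)
    study = set(study_genes) & background  # ignore genes outside the background
    rows = []
    for name, members in pathways.items():
        members = set(members) & background
        overlap = len(study & members)
        p = hypergeom_sf(overlap, len(background), len(members), len(study))
        rows.append((name, overlap, p))
    return sorted(rows, key=lambda row: row[2])


# Toy data (hypothetical): 20 background genes, two pathways of 5 genes each.
background = [f"g{i}" for i in range(20)]
pathways = {
    "pathway_A": ["g0", "g1", "g2", "g3", "g4"],
    "pathway_B": ["g10", "g11", "g12", "g13", "g14"],
}
study_genes = ["g0", "g1", "g2", "g3", "g10"]

results = enrich(study_genes, background, pathways)
for name, overlap, p in results:
    print(f"{name}: overlap={overlap}, p={p:.4g}")
```

In this toy run, pathway_A shares four of the five study genes and ranks first, while pathway_B shares only one. A real analysis would also adjust the p-values for multiple testing and choose the background gene set with care, both of which this sketch omits.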

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X (print), 1553-7358 (electronic)
                Published: 11 August 2022
                Volume 18, Issue 8, e1010348
                Affiliations
                [1] Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
                [2] Dipartimento di Giurisprudenza Economia e Sociologia, Università Magna Graecia di Catanzaro, Catanzaro, Italy
                McGill University, Canada
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0001-9655-7142
                https://orcid.org/0000-0003-2868-7732
                Article
                Manuscript ID: PCOMPBIOL-D-22-00299
                DOI: 10.1371/journal.pcbi.1010348
                PMCID: PMC9371296
                PMID: 35951505
                f7bb23c2-8ad1-445f-8bda-f390f3765c87
                © 2022 Chicco, Agapito

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Page count
                Figures: 0, Tables: 0, Pages: 15
                Funding
                The authors received no specific funding for this work.
                Categories
                Education
                Biology and Life Sciences > Organisms > Eukaryota > Plants > Legumes > Peas
                Research and Analysis Methods > Database and Informatics Methods > Bioinformatics
                Research and Analysis Methods > Mathematical and Statistical Techniques > Statistical Methods
                Physical Sciences > Mathematics > Statistics > Statistical Methods
                Computer and Information Sciences > Software Engineering > Software Tools
                Engineering and Technology > Software Engineering > Software Tools
                Science Policy > Science and Technology Workforce > Careers in Research > Scientists > Biologists
                People and Places > Population Groupings > Professions > Scientists > Biologists
                Biology and Life Sciences > Biochemistry > Metabolism > Metabolic Pathways
                Biology and Life Sciences > Genetics > Genomics
                Computer and Information Sciences > Software Engineering > Computer Software
                Engineering and Technology > Software Engineering > Computer Software

                Quantitative & Systems biology
