      STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets

      research-article


          Abstract

          Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein–protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein–protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.
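          The abstract does not say which statistical test underlies the gene-set enrichment analysis. As a rough illustration of the general idea only, the sketch below applies a one-sided hypergeometric (over-representation) test to a toy gene list. The function names, gene identifiers and set sizes are hypothetical, and this is not STRING's own implementation.

```python
from math import comb


def hypergeom_enrichment_pvalue(background_size, term_size, query_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    background_size: number of genes in the background (N)
    term_size:       background genes annotated with the term (K)
    query_size:      genes in the user's query set (n)
    overlap:         query genes annotated with the term (k)
    """
    if not 0 <= term_size <= background_size:
        raise ValueError("term_size must lie between 0 and background_size")
    if not 0 <= query_size <= background_size:
        raise ValueError("query_size must lie between 0 and background_size")
    upper = min(term_size, query_size)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap is inconsistent with the set sizes")
    total = comb(background_size, query_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def term_enrichment(query_genes, term_genes, background_genes):
    """Restrict both sets to the background, then test for over-representation."""
    background = set(background_genes)
    query = set(query_genes) & background
    term = set(term_genes) & background
    return hypergeom_enrichment_pvalue(
        len(background), len(term), len(query), len(query & term)
    )


# Toy data (hypothetical identifiers): 20 background genes, 5 carry the term,
# and 3 of the 4 query genes are among those 5.
background = [f"g{i}" for i in range(20)]
term = ["g0", "g1", "g2", "g3", "g4"]
query = ["g0", "g1", "g2", "g10"]
p_value = term_enrichment(query, term, background)
print(f"p = {p_value:.4f}")
```

          In practice one such test is run per annotation term (for example per Gene Ontology term or KEGG pathway), so the resulting p-values would normally be corrected for multiple testing before interpretation.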


          Most cited references (64)


          WGCNA: an R package for weighted correlation network analysis

          Background: Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial.
          Results: The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings.
          Conclusion: The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at .

            NCBI GEO: archive for functional genomics data sets—update

            The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

              KEGG: new perspectives on genomes, pathways, diseases and drugs

              KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes. Assigning functional meanings to genes and genomes both at the molecular and higher levels is the primary objective of the KEGG database project. Molecular-level functions are stored in the KO (KEGG Orthology) database, where each KO is defined as a functional ortholog of genes and proteins. Higher-level functions are represented by networks of molecular interactions, reactions and relations in the forms of KEGG pathway maps, BRITE hierarchies and KEGG modules. In the past the KO database was developed for the purpose of defining nodes of molecular networks, but now the content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases. The newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined. Furthermore, the DISEASE and DRUG databases have been improved by systematic analysis of drug labels for better integration of diseases and drugs with the KEGG molecular networks. KEGG is moving towards becoming a comprehensive knowledge base for both functional interpretation and practical application of genomic information.

                Author and article information

                Journal
                Nucleic Acids Res
                Nucleic Acids Research
                Oxford University Press
                ISSN: 0305-1048 (print), 1362-4962 (online)
                08 January 2019
                22 November 2018
                Volume: 47
                Issue: Database issue
                Pages: D607-D613
                Affiliations
                [1 ]Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
                [2 ]Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
                [3 ]Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM)—Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), 28223 Madrid, Spain
                [4 ]Center for non-coding RNA in Technology and Health, University of Copenhagen, 2200 Copenhagen N, Denmark
                [5 ]Resource on Biocomputing, Visualization, and Informatics, University of California, San Francisco, CA 94158-2517, USA
                [6 ]Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
                [7 ]Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany
                [8 ]Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
                [9 ]Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
                Author notes
                To whom correspondence should be addressed. Tel: +41 44 6353147; Fax: +41 44 6356864; Email: mering@imls.uzh.ch. Correspondence may also be addressed to Peer Bork. Tel: +49 6221 3878526; Fax: +49 6221 387517; Email: peer.bork@embl.de. Correspondence may also be addressed to Lars J. Jensen. Tel: +45 353 25025; Fax: +45 353 25001; Email: lars.juhl.jensen@cpr.ku.dk
                Author information
                http://orcid.org/0000-0001-5794-0456
                http://orcid.org/0000-0002-2410-9671
                http://orcid.org/0000-0003-4195-5025
                http://orcid.org/0000-0002-8806-6850
                http://orcid.org/0000-0003-0290-7979
                http://orcid.org/0000-0002-2627-833X
                http://orcid.org/0000-0001-7885-715X
                http://orcid.org/0000-0001-7734-9102
                Article
                Publisher ID: gky1131
                DOI: 10.1093/nar/gky1131
                PMCID: PMC6323986
                PMID: 30476243
                © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Accepted: 16 November 2018
                Revised: 23 October 2018
                Received: 28 September 2018
                Page count
                Pages: 7
                Funding
                Funded by: Swiss Institute of Bioinformatics
                Award ID: NNF14CC0001
                Funded by: Danish Council for Independent Research 10.13039/501100004836
                Award ID: DFF-4005-00443
                Funded by: National Institutes of Health 10.13039/100000002
                Award ID: U54 CA189205
                Award ID: U24 224370
                Funded by: National Institute of General Medical Sciences 10.13039/100000057
                Award ID: P41 GM103504
                Funded by: Bundesministerium für Bildung und Forschung 10.13039/501100002347
                Award ID: #031A537B
                Categories
                Database Issue

                Genetics
