STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets

Abstract

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein–protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein–protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.
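The abstract mentions gene-set enrichment analysis against classification systems such as Gene Ontology and KEGG but does not spell out the statistics. As a rough illustration of the general idea only, the sketch below implements a plain over-representation test (a one-sided hypergeometric tail probability). This is an assumption chosen for illustration and may differ from the procedure STRING actually applies; the function names and the toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_tail(background_size, term_size, input_size, overlap):
    """Probability of seeing at least `overlap` annotated genes when
    `input_size` genes are drawn without replacement from a background of
    `background_size` genes, `term_size` of which carry the annotation."""
    max_overlap = min(term_size, input_size)
    total = comb(background_size, input_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, input_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def term_enrichment(input_genes, term_genes, background_genes):
    """Return (overlap, p_value) for one annotation term.

    Genes outside the background are ignored so that all counts refer to
    the same universe."""
    background = set(background_genes)
    hits = set(input_genes) & background
    term = set(term_genes) & background
    overlap = len(hits & term)
    p_value = hypergeom_tail(len(background), len(term), len(hits), overlap)
    return overlap, p_value


# Toy data (hypothetical identifiers): 20 background genes, a 5-gene term,
# and a 4-gene input list that shares 3 genes with the term.
background = [f"g{i}" for i in range(20)]
term = ["g0", "g1", "g2", "g3", "g4"]
query = ["g0", "g1", "g2", "g10"]

overlap, p_value = term_enrichment(query, term, background)
print(overlap, round(p_value, 4))
```

In practice such a test is repeated for every term in the classification system, so the resulting p-values would also need a multiple-testing correction, which this sketch omits.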

Most cited references 64

WGCNA: an R package for weighted correlation network analysis

Peter Langfelder, Steve Horvath (2008)

Background: Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. Results: The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. Conclusion: The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at .

NCBI GEO: archive for functional genomics data sets—update

Tanya Barrett, Stephen Wilhite, Pierre Ledoux … (2012)

The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

KEGG: new perspectives on genomes, pathways, diseases and drugs

Minoru Kanehisa, Miho Furumichi, Mao Tanabe … (2016)

KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes. Assigning functional meanings to genes and genomes both at the molecular and higher levels is the primary objective of the KEGG database project. Molecular-level functions are stored in the KO (KEGG Orthology) database, where each KO is defined as a functional ortholog of genes and proteins. Higher-level functions are represented by networks of molecular interactions, reactions and relations in the forms of KEGG pathway maps, BRITE hierarchies and KEGG modules. In the past the KO database was developed for the purpose of defining nodes of molecular networks, but now the content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases. The newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined. Furthermore, the DISEASE and DRUG databases have been improved by systematic analysis of drug labels for better integration of diseases and drugs with the KEGG molecular networks. KEGG is moving towards becoming a comprehensive knowledge base for both functional interpretation and practical application of genomic information.

Author and article information

Contributors

David Lyon

Alexander Junge

Jaime Huerta-Cepas

Nadezhda T Doncheva

John H Morris

Peer Bork

Lars J Jensen

Christian von Mering

Journal

Title: Nucleic Acids Research

Publisher: Oxford University Press (OUP)

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): January 08 2019

Publication date (Electronic): November 22 2018

Volume: 47

Issue: D1

Pages: D607-D613

Affiliations

[1] Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland

[2] Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark

[3] Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM)—Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), 28223 Madrid, Spain

[4] Center for non-coding RNA in Technology and Health, University of Copenhagen, 2200 Copenhagen N, Denmark

[5] Resource on Biocomputing, Visualization, and Informatics, University of California, San Francisco, CA 94158-2517, USA

[6] Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany

[7] Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany

[8] Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany

[9] Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany

Article

DOI: 10.1093/nar/gky1131

SO-VID: 5dd5a867-64f9-4803-9459-27ab16de59a3

License: http://creativecommons.org/licenses/by/4.0/
