WINNER: A network biology tool for biomolecular characterization and prioritization


Abstract

Background and contribution

In network biology, molecular functions can be characterized by network-based inference, or “guilt-by-association.” PageRank-like tools have been applied to biomolecular interaction networks to estimate the relative significance of every molecule in the network. However, widely accessible data sets for gene-to-gene associations or protein-protein interactions contain a great deal of inherent noise. Developing robust tests to expand, filter, and rank molecular entities in disease-specific networks therefore remains an ad hoc data analysis process.

Results

We describe a new biomolecular characterization and prioritization tool called Weighted In-Network Node Expansion and Ranking (WINNER). It takes any molecular interaction network data as input and generates an optionally expanded network in which all nodes are ranked according to their relevance to one another. To help users assess the robustness of results, WINNER provides two types of statistics. The first is a node-expansion p-value, which helps evaluate the statistical significance of adding “non-seed” molecules to the original biomolecular interaction network consisting of “seed” molecules and molecular interactions. The second is a node-ranking p-value, which helps evaluate the relative statistical significance of each node's contribution to the overall network architecture. We validated the robustness of WINNER in ranking top molecules by spiking in noise in several network permutation experiments. We found that degree-preserving randomization of the gene network produced normally distributed ranking scores and outperformed other gene network randomization techniques. Furthermore, we validated that a larger proportion of WINNER-ranked genes was associated with disease biology than was the case for existing methods such as PageRank. We demonstrated the performance of WINNER with a few case studies, including Alzheimer's disease, breast cancer, myocardial infarction, and triple-negative breast cancer (TNBC). In all these case studies, the expanded and top-ranked genes identified by WINNER reveal disease biology more significantly than those identified by other gene prioritization software tools, including Ingenuity Pathway Analysis (IPA) and DiAMOND.
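The idea of a node-ranking p-value derived from degree-preserving randomization can be illustrated with a minimal sketch. This is not WINNER's implementation: the abstract does not give its scoring or p-value formulas, so the sketch substitutes plain PageRank for the ranking score and a one-sided empirical p-value for the statistic. The function names, the toy node labels `g1` to `g6`, and the parameter values are all hypothetical.

```python
import random


def to_adj(nodes, edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def pagerank(adj, damping=0.85, iters=100):
    """Plain power-iteration PageRank (a stand-in for WINNER's score)."""
    nodes = sorted(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            if adj[v]:
                share = damping * rank[v] / len(adj[v])
                for u in adj[v]:
                    new[u] += share
            else:
                # Isolated node: spread its mass uniformly over all nodes.
                for u in nodes:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Rewire pairs (a-b, c-d) -> (a-d, c-b); every node keeps its degree."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        e1, e2 = frozenset((a, d)), frozenset((c, b))
        if e1 in present or e2 in present:
            continue  # swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {e1, e2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def ranking_p_values(nodes, edges, n_perm=200, seed=0):
    """Empirical p-value: how often a randomized network scores a node
    at least as high as the observed network does (assumed definition)."""
    rng = random.Random(seed)
    observed = pagerank(to_adj(nodes, edges))
    exceed = {v: 0 for v in nodes}
    for _ in range(n_perm):
        shuffled = degree_preserving_shuffle(edges, 10 * len(edges), rng)
        null = pagerank(to_adj(nodes, shuffled))
        for v in nodes:
            if null[v] >= observed[v]:
                exceed[v] += 1
    return observed, {v: (exceed[v] + 1) / (n_perm + 1) for v in nodes}


# Toy network with placeholder node labels (not real gene data).
nodes = ["g1", "g2", "g3", "g4", "g5", "g6"]
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
         ("g4", "g5"), ("g5", "g6"), ("g2", "g3")]
observed, p_values = ranking_p_values(nodes, edges, n_perm=50, seed=1)
for v in sorted(nodes, key=observed.get, reverse=True):
    print(v, round(observed[v], 4), round(p_values[v], 3))
```

Because the null networks keep each node's degree, a small p-value in this sketch points to a node whose score depends on where its edges go rather than on how many edges it has.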

Conclusion

WINNER rankings correlate strongly with those of other ranking methods when the network covers sufficient node and edge information, that is, when network quality is high. Users can apply WINNER to robustly evaluate a list of candidate genes, proteins, or metabolites produced by high-throughput biology experiments, as long as gene, protein, or metabolic network information is available.


Author and article information

Contributors

Thanh Nguyen: URI : http://loop.frontiersin.org/people/394411/overview

Zongliang Yue: URI : http://loop.frontiersin.org/people/1194964/overview

Radomir Slominski: URI : http://loop.frontiersin.org/people/1646044/overview

Robert Welner: URI : http://loop.frontiersin.org/people/382675/overview

Jianyi Zhang: URI : http://loop.frontiersin.org/people/880514/overview

Jake Y. Chen: URI : http://loop.frontiersin.org/people/531449/overview

Journal

Journal ID (nlm-ta): Front Big Data

Journal ID (iso-abbrev): Front Big Data

Journal ID (publisher-id): Front. Big Data

Title: Frontiers in Big Data

Publisher: Frontiers Media S.A.

ISSN (Electronic): 2624-909X

Publication date (Electronic): 04 November 2022

Publication date Collection: 2022

Volume: 5

Electronic Location Identifier: 1016606

Affiliations

[1] Informatics Institute in School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States

[2] Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, United States

[3] Comprehensive Arthritis, Musculoskeletal, Bone and Autoimmunity Center (CAMBAC), School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States

Author notes

Edited by: Prashanti Manda, University of North Carolina at Greensboro, United States

Reviewed by: Emre Sefer, Özyegin University, Turkey; Zhi-Ping Liu, Shandong University, China

*Correspondence: Jake Y. Chen, jakechen@uab.edu

This article was submitted to Medicine and Public Health, a section of the journal Frontiers in Big Data

Article

DOI: 10.3389/fdata.2022.1016606

PMC ID: 9672476

PubMed ID: 36407327

SO-VID: ae373679-feb0-46a7-ba06-c2aed1beac5c

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

History

Date received: 11 August 2022

Date accepted: 14 October 2022

Page count

Figures: 11, Tables: 0, Equations: 12, References: 152, Pages: 21, Words: 14996

Funding

Funded by: Foundation for the National Institutes of Health, DOI: 10.13039/100000009

Award ID: U54TR001005

Award ID: R01HL150078
