A robust gene expression signature for NASH in liver expression data

Abstract

Non-Alcoholic Fatty Liver Disease (NAFLD) is a progressive liver disease that affects up to 30% of the worldwide population, of whom up to 25% progress to Non-Alcoholic SteatoHepatitis (NASH), a severe form of the disease that involves inflammation and predisposes the patient to liver cirrhosis. Despite its epidemic proportions, there is no reliable diagnostic that generalizes to the global patient population for distinguishing NASH from NAFLD. We performed a comprehensive multicohort analysis of publicly available transcriptome data of liver biopsies from Healthy Controls (HC), NAFLD and NASH patients. Altogether we analyzed 812 samples from 12 different datasets across 7 countries, encompassing real-world patient heterogeneity. We used 7 datasets for discovery and held out 5 datasets for independent validation. We identified 130 genes significantly differentially expressed in NASH versus a mixed group of NAFLD and HC. We show that our signature is not driven by one particular group (NAFLD or HC) and reflects true biological signal. Using a forward search, we were able to downselect to a parsimonious 19-mRNA signature with a mean AUROC of 0.98 in discovery and 0.79 in independent validation. Methods for consistent diagnosis of NASH relative to NAFLD are urgently needed. We showed that gene expression data combined with advanced statistical methodology holds the potential to serve as a basis for the development of such diagnostic tests for this unmet clinical need.
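
The abstract does not spell out how the forward search works, so the following is only a minimal sketch of one common variant: greedy forward selection that adds, at each step, the gene giving the largest gain in AUROC and stops when no candidate improves it. The scoring rule (mean expression of the selected genes), the stopping criterion, the function names, and the toy data are all assumptions made for illustration; they are not taken from the article.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison (Mann-Whitney); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def signature_score(expr, genes, sample):
    """Assumed scoring rule: mean expression of the selected genes in one sample."""
    return sum(expr[g][sample] for g in genes) / len(genes)


def forward_search(expr, labels, max_genes):
    """Greedily add the gene that most improves AUROC; stop when none improves it."""
    selected, best = [], 0.0
    n_samples = len(labels)
    while len(selected) < max_genes:
        candidates = [g for g in expr if g not in selected]
        if not candidates:
            break
        # Evaluate each candidate gene added to the current signature.
        trial = {}
        for g in candidates:
            genes = selected + [g]
            scores = [signature_score(expr, genes, i) for i in range(n_samples)]
            trial[g] = auroc(scores, labels)
        gene = max(trial, key=trial.get)
        if trial[gene] <= best:
            break  # no candidate improves the current AUROC
        selected.append(gene)
        best = trial[gene]
    return selected, best


# Hypothetical toy data: 3 case samples (label 1) followed by 3 comparison samples (label 0).
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "gene_a": [5, 5, 1, 2, 0, 0],
    "gene_b": [0, 0, 5, 1, 1, 0],
    "gene_c": [1, 0, 1, 1, 0, 1],
}

selected, best = forward_search(expr, labels, max_genes=3)
print(selected, best)  # ['gene_a', 'gene_b'] 1.0
```

In this toy example neither gene separates the groups on its own, but the pair does, which is the situation a forward search is meant to exploit. In a real analysis the selection would be run on discovery data only and the resulting signature scored once on held-out data, since AUROC measured on the data used for selection is optimistic.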

Author and article information

Contributors

Yudong D. He: yhe@inflammatix.com

Timothy E. Sweeney: tsweeney@inflammatix.com

Journal

Journal ID (nlm-ta): Sci Rep

Journal ID (iso-abbrev): Sci Rep

Title: Scientific Reports

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2045-2322

Publication date (Electronic): 16 February 2022

Publication date PMC-release: 16 February 2022

Publication date Collection: 2022

Volume: 12

Electronic Location Identifier: 2571

Affiliations

[1] Inflammatix, Inc., 863 Mitten Rd, Suite 104, Burlingame, CA 94010 USA

[2] Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Palo Alto, CA 94305 USA (GRID grid.168010.e, ISNI 0000000419368956)

[3] Department of Medicine, Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305 USA (GRID grid.168010.e, ISNI 0000000419368956)

Article

Publisher ID: 6512

DOI: 10.1038/s41598-022-06512-0

PMC ID: 8850484

PubMed ID: 35173224

SO-VID: 0b6cdbe5-a767-4b90-905b-e6dc03883670

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 29 October 2021

Date accepted: 31 January 2022

Custom metadata

ScienceOpen disciplines: Uncategorized

Keywords: diagnostic markers, non-alcoholic fatty liver disease, non-alcoholic steatohepatitis
