Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis

Abstract

Background

Even in the post-genomic era, identifying candidate genes within loci associated with human genetic diseases is a demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share similar expression profiles, high-throughput gene expression data may be an important resource for identifying the best candidates for sequencing. However, gene coexpression has so far not been used very successfully to prioritize positional candidates.

Methodology/Principal Findings

We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
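The idea of keeping only coexpression relationships that hold in both species can be illustrated with a minimal sketch. It assumes two expression matrices whose rows are already aligned as one-to-one human-mouse orthologs, uses Pearson correlation, and links each gene to its `k` most correlated genes. The function names, the value of `k`, and the toy data are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def top_k_neighbors(expr, k):
    """Return directed pairs (i, j) where gene j is among the k genes
    most correlated (Pearson) with gene i. expr is genes x samples."""
    corr = np.corrcoef(expr)
    np.fill_diagonal(corr, -np.inf)  # a gene is never its own neighbor
    top = np.argsort(-corr, axis=1)[:, :k]
    return {(i, int(j)) for i in range(corr.shape[0]) for j in top[i]}


def conserved_coexpression(human_expr, mouse_expr, k):
    """Keep directed links found in both species, then report them as
    undirected pairs. Row i of both matrices is assumed to be the same
    ortholog pair (an assumption of this sketch)."""
    shared = top_k_neighbors(human_expr, k) & top_k_neighbors(mouse_expr, k)
    return {tuple(sorted(pair)) for pair in shared}


# Toy data (hypothetical): genes 0 and 1 are coexpressed in both species,
# genes 2 and 3 are coexpressed in human only.
human = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [4.0, 1.0, 5.0, 2.0, 3.0],
])
mouse = np.array([
    [2.0, 3.0, 5.0, 7.0, 8.0],
    [1.0, 2.0, 3.0, 5.0, 6.0],
    [1.0, 5.0, 2.0, 4.0, 3.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
])

links = conserved_coexpression(human, mouse, k=1)
print(links)
```

In this toy case the human-only link between genes 2 and 3 is discarded, which is the filtering effect the conservation criterion is meant to provide.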

Conclusion

Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.

Author Summary

One of the most limiting aspects of biological research in the post-genomic era is the ability to integrate massive datasets on gene structure and function to produce useful biological knowledge. In this report we have applied an integrative approach to the problem of identifying likely candidate genes within loci associated with human genetic diseases. Despite recent progress in sequencing technologies, approaching this problem experimentally remains a demanding task, because the critical region may typically contain hundreds of positional candidates. We found that by concentrating only on genes sharing similar expression profiles in both human and mouse, massive microarray datasets can be used to reliably identify disease-relevant relationships among genes. Moreover, we found that integrating the coexpression criterion with systematic phenome analysis allows efficient identification of disease genes in large genomic regions. Using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
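How coexpression might be combined with phenotype similarity to prioritize positional candidates can be sketched as follows. The sketch assumes that a phenotype similarity map has already yielded a set of reference genes known to cause phenotypically similar diseases, and it scores each gene in the locus by how many of those reference genes appear among its conserved coexpression neighbors. This simple counting score, the function name, and the gene labels are hypothetical; the scoring actually used in the article is not described in this summary.

```python
def rank_candidates(candidates, reference_genes, neighbors):
    """Rank positional candidates by the number of reference genes
    among their conserved coexpression neighbors.

    candidates: genes located in the disease locus
    reference_genes: genes known for phenotypically similar diseases
    neighbors: dict mapping each gene to its set of coexpression neighbors
    """
    scores = {
        gene: len(neighbors.get(gene, set()) & reference_genes)
        for gene in candidates
    }
    # Drop candidates with no supporting link; break ties by gene name.
    supported = [gene for gene in candidates if scores[gene] > 0]
    supported.sort(key=lambda gene: (-scores[gene], gene))
    return [(gene, scores[gene]) for gene in supported]


# Hypothetical labels, for illustration only.
neighbors = {
    "CAND1": {"REF1", "REF2", "OTHER"},
    "CAND2": {"REF2"},
    "CAND3": {"OTHER"},
}
ranking = rank_candidates(
    candidates=["CAND3", "CAND2", "CAND1"],
    reference_genes={"REF1", "REF2"},
    neighbors=neighbors,
)
print(ranking)
```

A real analysis would also need a significance estimate for each score, since large loci and large neighbor sets make chance overlaps more likely.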

Author and article information

Journal

Journal ID (nlm-ta): PLoS Comput Biol

Journal ID (publisher-id): plos

Journal ID (publisher-id): plcb

Journal ID (pmc): ploscomp

Title: PLoS Computational Biology

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Print): 1553-734X

ISSN (Electronic): 1553-7358

Publication date Collection: March 2008

Publication date (Print): March 2008

Publication date (Electronic): 28 March 2008

Volume: 4

Issue: 3

Electronic Location Identifier: e1000043

Affiliations

[1] Molecular Biotechnology Center, Department of Genetics, Biology and Biochemistry, University of Turin, Turin, Italy

[2] Department of Human Genetics and Centre for Molecular and Biomolecular Informatics, University Medical Centre Nijmegen, Nijmegen, The Netherlands

Lilly Singapore Centre for Drug Discovery, Singapore

Author notes

* E-mail: paolo.provero@unito.it (PP); ferdinando.dicunto@unito.it (FDC)

Conceived and designed the experiments: UA RMP PP FDC. Analyzed the data: UA RMP. Contributed reagents/materials/analysis tools: EG CD LS MO. Wrote the paper: PP FDC.

Article

Publisher ID: 07-PLCB-RA-0633R2

DOI: 10.1371/journal.pcbi.1000043

PMC ID: 2268251

PubMed ID: 18369433

SO-VID: 9eb546cd-f708-4536-9797-169f9c6b5419

Copyright © Ala et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 16 October 2007

Date accepted: 20 February 2008

Page count

Pages: 17
