
      Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis

      research-article


          Abstract

          Background

          Even in the post-genomic era, identifying candidate genes within loci associated with human genetic diseases is a demanding task, because the critical region may contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share similar expression profiles, high-throughput gene expression data may be an important resource for identifying the best candidates for sequencing. So far, however, gene coexpression has not been used very successfully to prioritize positional candidates.

          Methodology/Principal Findings

          We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.

          Conclusion

          Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.
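The conserved coexpression criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, whose details are not given here: it assumes Pearson correlation as the coexpression measure, a fixed cutoff (`threshold`, a hypothetical parameter), and expression matrices whose rows are already matched as human-mouse orthologs. The function name and toy values are illustrative only.

```python
import numpy as np

def conserved_coexpression(human, mouse, threshold=0.8):
    """Return gene index pairs (i, j), i < j, coexpressed in both species.

    human, mouse: arrays of shape (n_genes, n_samples). Row k of each
    matrix is ASSUMED to hold the same orthologous gene; the number of
    samples may differ between species.
    """
    r_human = np.corrcoef(human)   # gene-by-gene Pearson correlation
    r_mouse = np.corrcoef(mouse)
    # A link is kept only if it passes the cutoff in both species.
    both = (r_human >= threshold) & (r_mouse >= threshold)
    i, j = np.where(np.triu(both, k=1))  # upper triangle: each pair once
    return list(zip(i.tolist(), j.tolist()))

# Toy data (invented): genes 0, 1 and 3 rise together in human,
# but only genes 0 and 1 do so in mouse.
human = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 2.0, 3.0, 5.0],
])
mouse = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 5.0],
    [3.0, 2.0, 1.0],
    [3.0, 1.0, 2.0],
])

pairs = conserved_coexpression(human, mouse)
print(pairs)  # [(0, 1)]
```

In this toy case the links between gene 3 and genes 0 and 1 are discarded because they appear only in the human data, which is the filtering effect the abstract attributes to requiring conservation across species.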

          Author Summary

          One of the most limiting aspects of biological research in the post-genomic era is the ability to integrate massive datasets on gene structure and function to produce useful biological knowledge. In this report, we apply an integrative approach to the problem of identifying likely candidate genes within loci associated with human genetic diseases. Despite recent progress in sequencing technologies, approaching this problem experimentally remains very demanding, because the critical region may contain hundreds of positional candidates. We found that, by concentrating only on genes sharing similar expression profiles in both human and mouse, massive microarray datasets can be used to reliably identify disease-relevant relationships among genes. Moreover, we found that integrating the coexpression criterion with systematic phenome analysis allows efficient identification of disease genes in large genomic regions. Using this approach on 850 OMIM loci with an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
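How a coexpression network might be combined with phenotype similarity to prioritize positional candidates can be sketched as follows. This is a simplified illustration under stated assumptions, not the scoring used in the article: it assumes that a set of reference genes (genes already known to cause phenotypically similar diseases) is available, and it ranks each candidate in the locus by the number of conserved coexpression links it shares with that set. All gene names and the function `rank_candidates` are hypothetical.

```python
from collections import defaultdict

def rank_candidates(candidates, reference_genes, links):
    """Rank positional candidates by links to phenotype-related genes.

    candidates:      genes located in the disease locus
    reference_genes: genes ASSUMED known for phenotypically similar diseases
    links:           iterable of (gene_a, gene_b) conserved coexpression links
    Returns (gene, score) tuples, best first; zero-score genes are dropped.
    """
    neighbours = defaultdict(set)
    for a, b in links:             # build an undirected adjacency map
        neighbours[a].add(b)
        neighbours[b].add(a)
    refs = set(reference_genes)
    scores = {g: len(neighbours[g] & refs) for g in candidates}
    # Sort by descending score, then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(g, s) for g, s in ranked if s > 0]

# Toy network with invented gene names.
links = [
    ("CAND1", "REF1"),
    ("CAND1", "REF2"),
    ("CAND2", "REF1"),
    ("CAND3", "OTHER"),
]
ranked = rank_candidates(["CAND1", "CAND2", "CAND3"], ["REF1", "REF2"], links)
print(ranked)  # [('CAND1', 2), ('CAND2', 1)]
```

A raw link count is the simplest possible score; a realistic analysis would also need to account for network connectivity and locus size when judging whether a candidate's score is significant.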

          Most cited references (43)


          Cluster analysis and display of genome-wide expression patterns.

          A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

            A gene-coexpression network for global discovery of conserved genetic modules.

            To elucidate gene function on a global scale, we identified pairs of genes that are coexpressed over 3182 DNA microarrays from humans, flies, worms, and yeast. We found 22,163 such coexpression relationships, each of which has been conserved across evolution. This conservation implies that the coexpression of these gene pairs confers a selective advantage and therefore that these genes are functionally related. Many of these relationships provide strong evidence for the involvement of new genes in core biological functions such as the cell cycle, secretion, and protein expression. We experimentally confirmed the predictions implied by some of these links and identified cell proliferation functions for several genes. By assembling these links into a gene-coexpression network, we found several components that were animal-specific as well as interrelationships between newly evolved and ancient modules.

              Development of human protein reference database as an initial platform for approaching systems biology in humans.

              Human Protein Reference Database (HPRD) is an object database that integrates a wealth of information relevant to the function of human proteins in health and disease. Data pertaining to thousands of protein-protein interactions, posttranslational modifications, enzyme/substrate relationships, disease associations, tissue expression, and subcellular localization were extracted from the literature for a nonredundant set of 2750 human proteins. Almost all the information was obtained manually by biologists who read and interpreted >300,000 published articles during the annotation process. This database, which has an intuitive query interface allowing easy access to all the features of proteins, was built by using open source technologies and will be freely available at http://www.hprd.org to the academic community. This unified bioinformatics platform will be useful in cataloging and mining the large number of proteomic interactions and alterations that will be discovered in the postgenomic era.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 28 March 2008
                Volume 4, Issue 3, e1000043
                Affiliations
                [1] Molecular Biotechnology Center, Department of Genetics, Biology and Biochemistry, University of Turin, Turin, Italy
                [2] Department of Human Genetics and Centre for Molecular and Biomolecular Informatics, University Medical Centre Nijmegen, Nijmegen, The Netherlands
                Lilly Singapore Centre for Drug Discovery, Singapore
                Author notes

                Conceived and designed the experiments: UA RMP PP FDC. Analyzed the data: UA RMP. Contributed reagents/materials/analysis tools: EG CD LS MO. Wrote the paper: PP FDC.

                Article
                Manuscript number: 07-PLCB-RA-0633R2
                DOI: 10.1371/journal.pcbi.1000043
                PMCID: PMC2268251
                PMID: 18369433
                9eb546cd-f708-4536-9797-169f9c6b5419
                Ala et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 16 October 2007
                Accepted: 20 February 2008
                Page count
                Pages: 17
                Categories
                Research Article
                Computational Biology
                Computational Biology/Comparative Sequence Analysis
                Computational Biology/Metagenomics
                Computational Biology/Molecular Genetics

                Quantitative & Systems biology
