
      Identifying gene-disease associations using centrality on a literature mined gene-interaction network

      research-article


          Abstract

          Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network.
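The ranking step described above can be illustrated with a minimal sketch. It assumes an undirected, unweighted toy network whose node names are placeholders, not genes or interactions from the paper. It computes degree, closeness and eigenvector centrality using only the Python standard library. Betweenness centrality is omitted for brevity, and the function names are hypothetical, not the authors' implementation.

```python
from collections import deque

# Toy undirected interaction network. Node names are placeholders,
# not genes or interactions taken from the paper.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable node count divided by the summed shortest-path distances."""
    scores = {}
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I, scaled so the largest score is 1.

    Adding the identity keeps the same eigenvectors as A but avoids
    oscillation on bipartite graphs.
    """
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        new = {
            node: scores[node] + sum(scores[nbr] for nbr in adj[node])
            for node in adj
        }
        top = max(new.values())
        scores = {node: value / top for node, value in new.items()}
    return scores


def rank(scores):
    """Nodes sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda node: (-scores[node], node))


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for name, metric in [
        ("degree", degree_centrality),
        ("closeness", closeness_centrality),
        ("eigenvector", eigenvector_centrality),
    ]:
        print(name, rank(metric(adjacency)))
```

In this toy graph the hub node "A" ranks first under all three metrics, while the order of the remaining nodes differs between metrics, which is the kind of disagreement the abstract reports between centrality measures.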

          Results: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study.
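The 95% figure for the top 20 genes corresponds to 19 of 20 ranked genes being confirmed. A minimal sketch of that calculation follows, assuming a ranked list and a set of confirmed genes. The identifiers G1 to G20 and the function name `precision_at_k` are placeholders for illustration, not data or code from the paper.

```python
def precision_at_k(ranked_genes, confirmed_genes, k):
    """Fraction of the top-k ranked genes that appear in the confirmed set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_genes[:k]
    hits = sum(1 for gene in top_k if gene in confirmed_genes)
    return hits / k


# Placeholder ranking of 20 genes, with every gene except G7 confirmed.
ranked = [f"G{i}" for i in range(1, 21)]
confirmed = set(ranked) - {"G7"}

print(precision_at_k(ranked, confirmed, 20))
```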

          Availability: A web-based system for browsing the disease-specific gene-interaction networks is available at: http://gin.ncibi.org

          Contact: radev@umich.edu


                Author and article information

                Journal
                Bioinformatics
                Oxford University Press
                ISSN: 1367-4803, 1460-2059
                1 July 2008
                Volume: 24
                Issue: 13
                Pages: i277-i285
                Affiliations
                (1) Electrical Engineering and Computer Science and (2) School of Information, University of Michigan, Ann Arbor, MI 48109, USA
                Author notes
                *To whom correspondence should be addressed.
                Article
                Publisher ID: btn182
                DOI: 10.1093/bioinformatics/btn182
                PMC ID: 2718658
                PubMed ID: 18586725
                © 2008 The Author(s)

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Categories
                ISMB 2008 Conference Proceedings, 19–23 July 2008, Toronto
                Original Papers
                Text Mining

                Bioinformatics & Computational biology
