
      Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies

      research-article


          Abstract

          The dramatic increase in heterogeneous types of biological data—in particular, the abundance of new protein sequences—requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity—GPCRs and kinases from humans, and the crotonase superfamily of enzymes—we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
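The core idea in the abstract, a network in which sequences are nodes and sufficiently similar pairs are joined by edges, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the sequence names, the toy pairwise scores (treated as E-value-like numbers where lower means more similar), the two cutoffs, and the helper names `build_network` and `clusters`. It is not the authors' pipeline.

```python
from collections import defaultdict


def build_network(pairwise_scores, threshold):
    """Build an undirected network from pairwise similarity scores.

    pairwise_scores maps (seq_a, seq_b) to an E-value-like score
    (assumption: lower means more similar). An edge is drawn only
    when the score is at or below the threshold.
    """
    adjacency = defaultdict(set)
    for (a, b), score in pairwise_scores.items():
        # Touch both keys so isolated sequences still appear as nodes.
        adjacency[a]
        adjacency[b]
        if score <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def clusters(adjacency):
    """Return connected components as sorted lists of node names."""
    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack = [start]
        group = set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical toy scores for five made-up sequences.
toy_scores = {
    ("seqA", "seqB"): 1e-40,
    ("seqB", "seqC"): 1e-25,
    ("seqA", "seqC"): 1e-5,
    ("seqC", "seqD"): 1e-3,
    ("seqD", "seqE"): 1e-30,
    ("seqA", "seqE"): 1.0,
}

strict = clusters(build_network(toy_scores, threshold=1e-20))
lenient = clusters(build_network(toy_scores, threshold=1e-2))
print(strict)
print(lenient)
```

With the stricter cutoff the toy sequences separate into two groups, while the more lenient cutoff merges them into one. This shows why the choice of threshold shapes which groupings a network displays.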


                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 3 February 2009
                Volume: 4
                Issue: 2
                Article number: e4345
                Affiliations
                [1] Graduate Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California, United States of America
                [2] Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, California, United States of America
                [3] Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, United States of America
                [4] Department of Biopharmaceutical Sciences, University of California San Francisco, San Francisco, California, United States of America
                Georgia Institute of Technology, United States of America
                Author notes

                Conceived and designed the experiments: HJA PCB. Performed the experiments: HJA. Analyzed the data: HJA. Contributed reagents/materials/analysis tools: JHM TF. Wrote the paper: HJA PCB.

                Article
                Manuscript ID: 08-PONE-RA-06316R1
                DOI: 10.1371/journal.pone.0004345
                PMC ID: 2631154
                PubMed ID: 19190775
                46b94a38-6434-4216-87fa-1bf046dc6045
                This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
                History
                Received: 10 September 2008
                Accepted: 10 December 2008
                Page count
                Pages: 14
                Categories
                Research Article
                Computational Biology/Macromolecular Sequence Analysis
                Evolutionary Biology/Bioinformatics
                Genetics and Genomics/Bioinformatics
                Genetics and Genomics/Gene Function
