
      MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction

      BMC Bioinformatics
      BioMed Central


          Abstract

          Background

          Knowledge of the subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology, since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that are more robust and more accurate.

          Results

          We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations.

          Conclusion

          MultiLoc2 is an extensive, high-performance system for predicting subcellular protein localization. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 achieves higher accuracy than its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a free, user-friendly web service at: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.

          Most cited references (42)


          Gene Ontology: tool for the unification of biology

          Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

            Assigning protein functions by comparative genome analysis: protein phylogenetic profiles.

            Determining protein functions from genomic sequences is a central goal of bioinformatics. We present a method based on the assumption that proteins that function together in a pathway or structural complex are likely to evolve in a correlated fashion. During evolution, all such functionally linked proteins tend to be either preserved or eliminated in a new species. We describe this property of correlated evolution by characterizing each protein by its phylogenetic profile, a string that encodes the presence or absence of a protein in every known genome. We show that proteins having matching or similar profiles strongly tend to be functionally linked. This method of phylogenetic profiling allows us to predict the function of uncharacterized proteins.
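
            The profile idea described above can be illustrated with a minimal sketch. It assumes a profile is a bit string with one position per genome and uses Hamming distance as the similarity measure; the genome names, protein names, distance threshold and helper functions are all hypothetical and are not taken from the cited paper or from MultiLoc2.

```python
from typing import Dict, List, Set

# Hypothetical genome names, used only for illustration.
GENOMES: List[str] = ["genome_A", "genome_B", "genome_C", "genome_D"]


def build_profile(present_in: Set[str], genomes: List[str] = GENOMES) -> str:
    """Encode presence (1) or absence (0) of a protein in each genome."""
    return "".join("1" if g in present_in else "0" for g in genomes)


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    return sum(x != y for x, y in zip(a, b))


def linked_candidates(
    profiles: Dict[str, str], query: str, max_distance: int = 1
) -> List[str]:
    """Return proteins whose profile is within max_distance of the query's."""
    target = profiles[query]
    return sorted(
        name
        for name, profile in profiles.items()
        if name != query and hamming_distance(target, profile) <= max_distance
    )


# Hypothetical proteins and the genomes in which a homolog was found.
PROFILES: Dict[str, str] = {
    "P1": build_profile({"genome_A", "genome_C", "genome_D"}),
    "P2": build_profile({"genome_A", "genome_C", "genome_D"}),
    "P3": build_profile({"genome_A", "genome_C"}),
    "P4": build_profile({"genome_B"}),
}

if __name__ == "__main__":
    # P2 matches P1 exactly and P3 differs in one genome; P4 differs in all four.
    print(linked_candidates(PROFILES, "P1"))
```

            In this toy data, proteins with matching or near-matching profiles are returned as candidates for a functional link. Real analyses would likely use many more genomes and a statistically motivated similarity threshold.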

              Predotar: A tool for rapidly screening proteomes for N-terminal targeting sequences.

              Probably more than 25% of the proteins encoded by the nuclear genomes of multicellular eukaryotes are targeted to membrane-bound compartments by N-terminal targeting signals. The major signals are those for the endoplasmic reticulum, the mitochondria, and in plants, plastids. The most abundant of these targeted proteins are well-known and well-studied, but a large proportion remain unknown, including most of those involved in regulation of organellar gene expression or regulation of biochemical pathways. The discovery and characterization of these proteins by biochemical means will be long and difficult. An alternative method is to identify candidate organellar proteins via their characteristic N-terminal targeting sequences. We have developed a neural network-based approach (Predotar--Prediction of Organelle Targeting sequences) for identifying genes encoding these proteins amongst eukaryotic genome sequences. The power of this approach for identifying and annotating novel gene families has been illustrated by the discovery of the pentatricopeptide repeat family.

                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central
                1471-2105
                1 September 2009
                Volume 10, Article 274
                Affiliations
                [1] Division for Simulation of Biological Systems, ZBIT/WSI, Eberhard-Karls-Universität Tübingen, Germany
                Article
                Article ID: 1471-2105-10-274
                DOI: 10.1186/1471-2105-10-274
                PMCID: PMC2745392
                PMID: 19723330
                Copyright © 2009 Blum et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 19 April 2009
                Accepted: 1 September 2009
                Categories
                Software

                Bioinformatics & Computational biology
