Inviting an author to review:
Find an author and click ‘Invite to review selected article’ near their name.
Search for authorsSearch for similar articles
28
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          InterPro amalgamates predictive protein signatures from a number of well-known partner databases into a single resource. To aid with interpretation of results, InterPro entries are manually annotated with terms from the Gene Ontology (GO). The InterPro2GO mappings are comprised of the cross-references between these two resources and are the largest source of GO annotation predictions for proteins. Here, we describe the protocol by which InterPro curators integrate GO terms into the InterPro database. We discuss the unique challenges involved in integrating specific GO terms with entries that may describe a diverse set of proteins, and we illustrate, with examples, how InterPro hierarchies reflect GO terms of increasing specificity. We describe a revised protocol for GO mapping that enables us to assign GO terms to domains based on the function of the individual domain, rather than the function of the families in which the domain is found. We also discuss how taxonomic constraints are dealt with and those cases where we are unable to add any appropriate GO terms. Expert manual annotation of InterPro entries with GO terms enables users to infer function, process or subcellular information for uncharacterized sequences based on sequence matches to predictive models.

          Database URL: http://www.ebi.ac.uk/interpro. The complete InterPro2GO mappings are available at: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/external2go/interpro2go

          Related collections

          Most cited references9

          • Record: found
          • Abstract: found
          • Article: not found

          Gene Ontology: tool for the unification of biology

          Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium

            Protein Analysis THrough Evolutionary Relationships (PANTHER) is a comprehensive software system for inferring the functions of genes based on their evolutionary relationships. Phylogenetic trees of gene families form the basis for PANTHER and these trees are annotated with ontology terms describing the evolution of gene function from ancestral to modern day genes. One of the main applications of PANTHER is in accurate prediction of the functions of uncharacterized genes, based on their evolutionary relationships to genes with functions known from experiment. The PANTHER website, freely available at http://www.pantherdb.org, also includes software tools for analyzing genomic data relative to known and inferred gene functions. Since 2007, there have been several new developments to PANTHER: (i) improved phylogenetic trees, explicitly representing speciation and gene duplication events, (ii) identification of gene orthologs, including least diverged orthologs (best one-to-one pairs), (iii) coverage of more genomes (48 genomes, up to 87% of genes in each genome; see http://www.pantherdb.org/panther/summaryStats.jsp), (iv) improved support for alternative database identifiers for genes, proteins and microarray probes and (v) adoption of the SBGN standard for display of biological pathways. In addition, PANTHER trees are being annotated with gene function as part of the Gene Ontology Reference Genome project, resulting in an increasing number of curated functional annotations.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes

              TIGRFAMs is a collection of protein family definitions built to aid in high-throughput annotation of specific protein functions. Each family is based on a hidden Markov model (HMM), where both cutoff scores and membership in the seed alignment are chosen so that the HMMs can classify numerous proteins according to their specific molecular functions. Most TIGRFAMs models describe ‘equivalog’ families, where both orthology and lateral gene transfer may be part of the evolutionary history, but where a single molecular function has been conserved. The Genome Properties system contains a queriable set of metabolic reconstructions, genome metrics and extractions of information from the scientific literature. Its genome-by-genome assertions of whether or not specific structures, pathways or systems are present provide high-level conceptual descriptions of genomic content. These assertions enable comparative genomics, provide a meaningful biological context to aid in manual annotation, support assignments of Gene Ontology (GO) biological process terms and help validate HMM-based predictions of protein function. The Genome Properties system is particularly useful as a generator of phylogenetic profiles, through which new protein family functions may be discovered. The TIGRFAMs and Genome Properties systems can be accessed at and .
                Bookmark

                Author and article information

                Journal
                Database (Oxford)
                database
                databa
                Database: The Journal of Biological Databases and Curation
                Oxford University Press
                1758-0463
                2012
                1 February 2012
                1 February 2012
                : 2012
                : bar068
                Affiliations
                1EMBL-EBI, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK and 2Computational Biology Group, Institute of Infectious Disease and Molecular Medicine, Department of Clinical Laboratory Sciences, University of Cape Town Medical School, Anzio Road, Observatory 7925, Cape Town, South Africa
                Author notes
                * Corresponding author: Tel: +441223494481; Fax: +441223494468; Email: hunter@ 123456ebi.ac.uk
                Contributed equally to this work and should be considered joint first authors.
                Article
                bar068
                10.1093/database/bar068
                3270475
                22301074
                214f9866-4510-4eb5-9ae4-9846971fb0b7
                © The Author(s) 2012. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 15 April 2011
                : 16 December 2011
                : 23 December 2011
                Page count
                Pages: 6
                Categories
                Original Article

                Bioinformatics & Computational biology
                Bioinformatics & Computational biology

                Comments

                Comment on this article