Is Open Access

InterPro in 2017—beyond protein family and domain annotations


Nucleic Acids Research

Oxford University Press


      Abstract

      InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.


            Author and article information

            Affiliations
            [1 ]European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
            [2 ]School of Computer Science, University of Manchester, UK
            [3 ]Department of Bioengineering & Therapeutic Sciences, University of California, San Francisco, CA 94143, USA
            [4 ]European Molecular Biology Laboratory, Biocomputing, Meyerhofstraße 1, 69117 Heidelberg, Germany
            [5 ]Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland
            [6 ]MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Pázmány Péter sétány 1/c, Budapest, Hungary
            [7 ]Computer Science department, University of Bristol, Woodland Road, Bristol BS8 1UB, UK
            [8 ]Bioinformatics Department, J. Craig Venter Institute, 9714 Medical Center Drive, Rockville, MD 20850, USA
            [9 ]Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
            [10 ]Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA
            [11 ]Biobyte Solutions GmbH, Bothestr. 142, 69126 Heidelberg, Germany
            [12 ]National Center for Biotechnology Information, National Library of Medicine, NIH Bldg, 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
            [13 ]Georgetown University Medical Center, 3300 Whitehaven St, NW, Washington, DC 20007, USA
            [14 ]Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
            [15 ]Structural and Molecular Biology, University College London, Darwin Building, London WC1E 6BT, UK
            [16 ]CNR Institute of Neuroscience, via U. Bassi 58/b, 35131 Padua, Italy
            Author notes
            [* ]To whom correspondence should be addressed. Tel: +44 1223 492 679; Fax: +44 1223 494 46; Email: rdf@ebi.ac.uk
            Journal
            Title: Nucleic Acids Research
            Abbreviated title: Nucleic Acids Res
            Journal ID: nar
            Publisher: Oxford University Press
            ISSN: 0305-1048, 1362-4962
            Dates: 04 January 2017; 28 November 2016
            Volume: 45
            Issue: Database issue
            Pages: D190-D199
            PMID: 27899635
            PMCID: PMC5210578
            DOI: 10.1093/nar/gkw1107
            © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

            This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

            Counts
            Pages: 10
            Categories
            Database Issue

            Genetics
