InterPro in 2019: improving coverage, classification and access to protein sequence annotations

Abstract

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 08 January 2019

Publication date (Electronic): 06 November 2018

Publication date PMC-release: 06 November 2018

Volume: 47

Issue: Database issue

Pages: D351-D360

Affiliations

[1] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

[2] School of Computer Science, The University of Manchester, Manchester M13 9PL, UK

[3] Department of Bioengineering & Therapeutic Sciences, University of California, San Francisco, CA 94158, USA

[4] European Molecular Biology Laboratory, Structural and Computational Biology Unit, Meyerhofstr. 1, 69117 Heidelberg, Germany

[5] Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland

[6] Medical Research Council Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK

[7] J. Craig Venter Institute (JCVI), 9605 Medical Center Drive, Suite 150, Rockville, MD 20850, USA

[8] Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA

[9] Biobyte Solutions GmbH, Bothestr 142, 69126 Heidelberg, Germany

[10] National Center for Biotechnology Information, National Library of Medicine, NIH Bldg, 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

[11] Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA

[12] Protein Information Resource, Georgetown University Medical Center, Washington, DC, USA

[13] Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy

[14] Department of Agricultural Sciences, University of Udine, via Palladio 8, 33100 Udine, Italy

[15] Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all’Adige, Italy

[16] Structural and Molecular Biology, University College London, Darwin Building, London WC1E 6BT, UK

Author notes

To whom correspondence should be addressed. Tel: +44 1223 492679; Fax: +44 1223 494468; Email: rdf@ebi.ac.uk

Author information

Alex L Mitchell http://orcid.org/0000-0001-8655-7966

Teresa K Attwood http://orcid.org/0000-0003-2409-4235

Sara El-Gebali http://orcid.org/0000-0003-1378-5495

Simon C Potter http://orcid.org/0000-0003-4208-4102

Matloob A Qureshi http://orcid.org/0000-0003-2208-4236

Neil D Rawlings http://orcid.org/0000-0001-5557-7665

Lorna J Richardson http://orcid.org/0000-0002-3655-5660

Paul D Thomas http://orcid.org/0000-0002-9074-3507

Silvio C E Tosatto http://orcid.org/0000-0003-4525-7793

Robert D Finn http://orcid.org/0000-0001-8626-2148

Article

Publisher ID: gky1100

DOI: 10.1093/nar/gky1100

PMC ID: 6323941

PubMed ID: 30398656

SO-VID: b933de63-b27f-4bc4-b5a1-0c0bef0f62be

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date accepted: 22 October 2018

Date revision received: 19 October 2018

Date received: 27 September 2018

Page count

Pages: 10

Funding

Funded by: Wellcome Trust 10.13039/100004440

Award ID: 108433/Z/15/Z

Funded by: Biotechnology and Biological Sciences Research Council 10.13039/501100000268

Award ID: BB/N00521X/1

Award ID: BB/N019172/1

Award ID: BB/L024136/1

Funded by: National Science Foundation, Division of Biological Infrastructure 10.13039/100006445
