Activities at the Universal Protein Resource (UniProt)

Abstract

The mission of the Universal Protein Resource (UniProt) (http://www.uniprot.org) is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequences and functional annotation. It integrates, interprets and standardizes data from literature and numerous resources to achieve the most comprehensive catalog possible of protein information. The central activities are the biocuration of the UniProt Knowledgebase and the dissemination of these data through our Web site and web services. UniProt is produced by the UniProt Consortium, which consists of groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt is updated and distributed every 4 weeks and can be accessed online for searches or downloads.

Most cited references 18

UniRef: comprehensive and non-redundant UniProt reference clusters.

Baris Suzek, Hongzhan Huang, Peter McGarvey … (2007)

Redundant protein sequences in biological databases hinder sequence similarity searches and make interpretation of search results difficult. Clustering of protein sequence space based on sequence similarity helps organize all sequences into manageable datasets and reduces sampling bias and overrepresentation of sequences. The UniRef (UniProt Reference Clusters) provide clustered sets of sequences from the UniProt Knowledgebase (UniProtKB) and selected UniProt Archive records to obtain complete coverage of sequence space at several resolutions while hiding redundant sequences. Currently covering >4 million source sequences, the UniRef100 database combines identical sequences and subfragments from any source organism into a single UniRef entry. UniRef90 and UniRef50 are built by clustering UniRef100 sequences at the 90 or 50% sequence identity levels. UniRef100, UniRef90 and UniRef50 yield a database size reduction of approximately 10, 40 and 70%, respectively, from the source sequence set. The reduced redundancy increases the speed of similarity searches and improves detection of distant relationships. UniRef entries contain summary cluster and membership information, including the sequence of a representative protein, member count and common taxonomy of the cluster, the accession numbers of all the merged entries and links to rich functional annotation in UniProtKB to facilitate biological discovery. UniRef has already been applied to broad research areas ranging from genome annotation to proteomics data analysis. UniRef is updated biweekly and is available for online search and retrieval at http://www.uniprot.org, as well as for download at ftp://ftp.uniprot.org/pub/databases/uniprot/uniref. Supplementary data are available at Bioinformatics online.

Reorganizing the protein space at the Universal Protein Resource (UniProt)

Emmanuel Boutet, Claire O'Donovan (2011)

The mission of UniProt is to support biological research by providing a freely accessible, stable, comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and querying interfaces. UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. A key development at UniProt is the provision of complete, reference and representative proteomes. UniProt is updated and distributed every 4 weeks and can be accessed online for searches or download at http://www.uniprot.org.

Pseudomonas Genome Database: improved comparative analysis and population genomics capability for Pseudomonas genomes

Geoffrey L. Winsor, David K. W. Lam, Leanne Fleming … (2010)

Pseudomonas is a metabolically-diverse genus of bacteria known for its flexibility and leading free living to pathogenic lifestyles in a wide range of hosts. The Pseudomonas Genome Database (http://www.pseudomonas.com) integrates completely-sequenced Pseudomonas genome sequences and their annotations with genome-scale, high-precision computational predictions and manually curated annotation updates. The latest release implements an ability to view sequence polymorphisms in P. aeruginosa PAO1 versus other reference strains, incomplete genomes and single gene sequences. This aids analysis of phenotypic variation between closely related isolates and strains, as well as wider population genomics and evolutionary studies. The wide range of tools for comparing Pseudomonas annotations and sequences now includes a strain-specific access point for viewing high precision computational predictions including updated, more accurate, protein subcellular localization and genomic island predictions. Views link to genome-scale experimental data as well as comparative genomics analyses that incorporate robust genera-geared methods for predicting and clustering orthologs. These analyses can be exploited for identifying putative essential and core Pseudomonas genes or identifying large-scale evolutionary events. The Pseudomonas Genome Database aims to provide a continually updated, high quality source of genome annotations, specifically tailored for Pseudomonas researchers, but using an approach that may be implemented for other genera-level research communities.

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Journal ID (hwp): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): January 2014

Publication date (Electronic): 16 November 2013

Publication date PMC-release: 16 November 2013

Volume: 42

Issue: D1, Database issue

Pages: D191-D198

Affiliations

¹European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, ²SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland, ³Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven Street North West, Suite 1200, Washington, DC 20007, USA and ⁴Protein Information Resource, University of Delaware, 15 Innovation Way, Suite 205, Newark, DE 19711, USA

Author notes

*To whom correspondence should be addressed. Tel: +44 1223 494100; Fax: +44 1223 494468; Email: agb@ebi.ac.uk

Article

Publisher ID: gkt1140

DOI: 10.1093/nar/gkt1140

PMC ID: 3965022

PubMed ID: 24253303

SO-VID: a41d691d-d4d9-4a6e-ba95-e0388ed6697f

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 22 October 2013

Date accepted: 24 October 2013

Page count

Pages: 8

Custom metadata

cover-date: 1 January 2014

ScienceOpen disciplines: Genetics
