      Is Open Access

      UniProt: the universal protein knowledgebase in 2021

      research-article
      The UniProt Consortium
      Nucleic Acids Research
      Oxford University Press


          Abstract

          The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.

          Most cited references (54)

          Standards and Guidelines for the Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology

          The American College of Medical Genetics and Genomics (ACMG) previously developed guidance for the interpretation of sequence variants. In the past decade, sequencing technology has evolved rapidly with the advent of high-throughput next generation sequencing. By adopting and leveraging next generation sequencing, clinical laboratories are now performing an ever increasing catalogue of genetic testing spanning genotyping, single genes, gene panels, exomes, genomes, transcriptomes and epigenetic assays for genetic disorders. By virtue of increased complexity, this paradigm shift in genetic testing has been accompanied by new challenges in sequence interpretation. In this context, the ACMG convened a workgroup in 2013 comprised of representatives from the ACMG, the Association for Molecular Pathology (AMP) and the College of American Pathologists (CAP) to revisit and revise the standards and guidelines for the interpretation of sequence variants. The group consisted of clinical laboratory directors and clinicians. This report represents expert opinion of the workgroup with input from ACMG, AMP and CAP stakeholders. These recommendations primarily apply to the breadth of genetic tests used in clinical laboratories including genotyping, single genes, panels, exomes and genomes. This report recommends the use of specific standard terminology: ‘pathogenic’, ‘likely pathogenic’, ‘uncertain significance’, ‘likely benign’, and ‘benign’ to describe variants identified in Mendelian disorders. Moreover, this recommendation describes a process for classification of variants into these five categories based on criteria using typical types of variant evidence (e.g. population data, computational data, functional data, segregation data, etc.). Because of the increased complexity of analysis and interpretation of clinical genetic testing described in this report, the ACMG strongly recommends that clinical molecular genetic testing should be performed in a CLIA-approved laboratory with results interpreted by a board-certified clinical molecular geneticist or molecular genetic pathologist or equivalent.
            Is Open Access

            UniProt: a worldwide hub of protein knowledge

            (2018)
            The UniProt Knowledgebase is a collection of sequences and annotations for over 120 million proteins across all branches of life. Detailed annotations extracted from the literature by expert curators have been collected for over half a million of these proteins. These annotations are supplemented by annotations provided by rule based automated systems, and those imported from other resources. In this article we describe significant updates that we have made over the last 2 years to the resource. We have greatly expanded the number of Reference Proteomes that we provide and in particular we have focussed on improving the number of viral Reference Proteomes. The UniProt website has been augmented with new data visualizations for the subcellular localization of proteins as well as their structure and interactions. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.
              Is Open Access

              The Gene Ontology Resource: 20 years and still GOing strong

              The Gene Ontology resource (GO; http://geneontology.org) provides structured, computable knowledge regarding the functions of genes and gene products. Founded in 1998, GO has become widely adopted in the life sciences, and its contents are under continual improvement, both in quantity and in quality. Here, we report the major developments of the GO resource during the past two years. Each monthly release of the GO resource is now packaged and given a unique identifier (DOI), enabling GO-based analyses on a specific release to be reproduced in the future. The molecular function ontology has been refactored to better represent the overall activities of gene products, with a focus on transcription regulator activities. Quality assurance efforts have been ramped up to address potentially out-of-date or inaccurate annotations. New evidence codes for high-throughput experiments now enable users to filter out annotations obtained from these sources. GO-CAM, a new framework for representing gene function that is more expressive than standard GO annotations, has been released, and users can now explore the growing repository of these models. We also provide the ‘GO ribbon’ widget for visualizing GO annotations to a gene; the widget can be easily embedded in any web page.

                Author and article information

                Journal
                Nucleic Acids Res
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                08 January 2021
                25 November 2020
                25 November 2020
                Volume: 49
                Issue: D1
                Pages: D480-D489
                Affiliations
                European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus , Hinxton CB10 1SD, UK
                Protein Information Resource, Georgetown University Medical Center , 3300 Whitehaven Street NW, Suite 1200, Washington, DC 20007, USA
                Protein Information Resource, University of Delaware, Ammon-Pinizzotto Biopharmaceutical Innovation Building , Suite 147, 590 Avenue 1743, Newark, DE 19713, USA
                SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire , 1 rue Michel Servet, CH-1211 Geneva 4, Switzerland
                Author notes
                To whom correspondence should be addressed. Tel: +44 223 494 100; Email: agb@ebi.ac.uk
                Article
                gkaa1100
                DOI: 10.1093/nar/gkaa1100
                PMC ID: 7778908
                PubMed ID: 33237286
                091d7de4-1e67-46ab-99e1-3f70155b4c9b
                © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                Page count
                Pages: 10
                Funding
                Funded by: National Eye Institute, DOI 10.13039/100000053;
                Funded by: National Heart, Lung, and Blood Institute, DOI 10.13039/100000050;
                Funded by: National Institute of Allergy and Infectious Diseases, DOI 10.13039/100000060;
                Funded by: National Institute of Diabetes and Digestive and Kidney Diseases, DOI 10.13039/100000062;
                Funded by: National Cancer Institute, DOI 10.13039/100000054;
                Funded by: National Institutes of Health, DOI 10.13039/100000002;
                Award ID: U24HG007822
                Funded by: National Human Genome Research Institute, DOI 10.13039/100000051;
                Award ID: U41HG002273
                Funded by: National Institute of General Medical Sciences, DOI 10.13039/100000057;
                Award ID: R01GM080646
                Award ID: P20GM103446
                Funded by: Biotechnology and Biological Sciences Research Council, DOI 10.13039/501100000268;
                Award ID: BB/T010541/1
                Funded by: British Heart Foundation, DOI 10.13039/501100000274;
                Award ID: RG/13/5/30112
                Funded by: Open Targets;
                Funded by: Swiss Federal Government;
                Funded by: European Molecular Biology Laboratory, DOI 10.13039/100013060;
                Categories
                AcademicSubjects/SCI00010
                Database Issue

                Genetics
