      Is Open Access

      PDBe-KB: collaboratively defining the biological context of structural data

      research-article
      PDBe-KB consortium
      Nucleic Acids Research
      Oxford University Press


          Abstract

          The Protein Data Bank in Europe – Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.

          Most cited references (52)

          Is Open Access

          Highly accurate protein structure prediction with AlphaFold

          Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1–4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6, 7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the ‘protein folding problem’ [8], has been an important open research problem for more than 50 years [9]. Despite recent progress [10–14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm. AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture.
            Is Open Access

            SWISS-MODEL: homology modelling of protein structures and complexes

            Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, allowing users without specific computational expertise to generate reliable protein models and to access, visualise and interpret the modelling results easily. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.
              Is Open Access

              UniProt: the universal protein knowledgebase in 2021

              (2020)
              The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.

                Author and article information

                Journal
                Nucleic Acids Research (Nucleic Acids Res; nar)
                Oxford University Press
                ISSN: 0305-1048 (print), 1362-4962 (online)
                07 January 2022
                10 November 2021
                Volume: 50
                Issue: D1
                Pages: D534-D542
                Author notes
                To whom correspondence should be addressed. Email: mvaradi@ebi.ac.uk

                Protein Data Bank in Europe – Knowledge Base, European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, Cambridgeshire, CB10 1SA, UK.

                Article
                gkab988
                DOI: 10.1093/nar/gkab988
                PMCID: PMC8728252
                PMID: 34755867
                991bdb8c-8a35-48de-a3dd-43871d2b26ac
                © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 14 October 2021
                : 01 October 2021
                : 14 September 2021
                Page count
                Pages: 9
                Funding
                Funded by: ELIXIR;
                Funded by: Biotechnology and Biological Sciences Research Council, DOI 10.13039/501100000268;
                Award ID: BB/T01959X/1
                Funded by: FunPDBe;
                Award ID: BB/P024351/1
                Funded by: European Molecular Biology Laboratory, DOI 10.13039/100013060;
                Funded by: European Bioinformatics Institute, DOI 10.13039/100012116;
                Funded by: The Ministry of Education, Youth and Sports, DOI 10.13039/501100001823;
                Award ID: INBIO CZ.02.1.01/0.0/0.0/16_026/0008451
                Award ID: ELIXIR-CZ LM2018131
                Funded by: European Union's Horizon 2020 Programme;
                Award ID: 823839
                Funded by: Research Foundation Flanders, DOI 10.13039/501100003130;
                Award ID: G032816N
                Award ID: G042518N
                Award ID: G028821N
                Funded by: Fondazione Cassa di Risparmio di Firenze, DOI 10.13039/501100015694;
                Award ID: 24316
                Funded by: European Commission, DOI 10.13039/501100000780;
                Award ID: 101017567
                Funded by: AIRC, DOI 10.13039/501100005010;
                Award ID: IG 23539
                Funded by: Spanish Ministry of Science and Innovation, DOI 10.13039/501100004837;
                Award ID: PID2019-110167RB-I00
                Funded by: Norwegian Research Council, DOI 10.13039/501100005416;
                Award ID: 288008
                Funded by: Horizon 2020, DOI 10.13039/100010661;
                Award ID: 819318
                Funded by: Wellcome Trust, DOI 10.13039/100010269;
                Award ID: 104955/Z/14/Z
                Award ID: 218242/Z/19/Z
                Categories
                AcademicSubjects/SCI00010
                Database Issue

                Genetics
