      RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education

      research-article

          Abstract

          The Protein Data Bank (PDB) is one of two archival resources for experimental data central to biomedical research and education worldwide (the other key Primary Data Archive in biology being the International Nucleotide Sequence Database Collaboration). The PDB currently houses >134,000 atomic level biomolecular structures determined by crystallography, NMR spectroscopy, and 3D electron microscopy. It was established in 1971 as the first open‐access, digital‐data resource in biology, and is managed by the Worldwide Protein Data Bank partnership (wwPDB; wwpdb.org). US PDB operations are conducted by the RCSB Protein Data Bank (RCSB PDB; RCSB.org; Rutgers University and UC San Diego) and funded by NSF, NIH, and DoE. The RCSB PDB serves as the global Archive Keeper for the wwPDB. During calendar 2016, >591 million structure data files were downloaded from the PDB by Data Consumers working in every sovereign nation recognized by the United Nations. During this same period, the RCSB PDB processed >5300 new atomic level biomolecular structures plus experimental data and metadata coming into the archive from Data Depositors working in the Americas and Oceania. In addition, RCSB PDB served >1 million RCSB.org users worldwide with PDB data integrated with ∼40 external data resources providing rich structural views of fundamental biology, biomedicine, and energy sciences, and >600,000 PDB101.rcsb.org educational website users around the globe. RCSB PDB resources are described in detail together with metrics documenting the impact of access to PDB data on basic and applied research, clinical medicine, education, and the economy.

          Most cited references (46)

          Gene Ontology: tool for the unification of biology

          Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

            BioMagResBank

            The BioMagResBank (BMRB: www.bmrb.wisc.edu) is a repository for experimental and derived data gathered from nuclear magnetic resonance (NMR) spectroscopic studies of biological molecules. BMRB is a partner in the Worldwide Protein Data Bank (wwPDB). The BMRB archive consists of four main data depositories: (i) quantitative NMR spectral parameters for proteins, peptides, nucleic acids, carbohydrates and ligands or cofactors (assigned chemical shifts, coupling constants and peak lists) and derived data (relaxation parameters, residual dipolar couplings, hydrogen exchange rates, pKa values, etc.); (ii) databases for NMR restraints processed from original author depositions available from the Protein Data Bank; (iii) time-domain (raw) spectral data from NMR experiments used to assign spectral resonances and determine the structures of biological macromolecules; and (iv) a database of one- and two-dimensional 1H and 13C NMR spectra for over 250 metabolites. The BMRB website provides free access to all of these data. BMRB has tools for querying the archive and retrieving information, and an ftp site (ftp.bmrb.wisc.edu) where data in the archive can be downloaded in bulk. Two BMRB mirror sites exist: one at the PDBj, Protein Research Institute, Osaka University, Osaka, Japan (bmrb.protein.osaka-u.ac.jp) and the other at CERM, University of Florence, Florence, Italy (bmrb.postgenomicnmr.net/). The site at Osaka also accepts and processes data depositions.

              CATH--a hierarchic classification of protein domain structures.

              Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures, and in optimal cases function can also be assigned. The ever-increasing number of known protein structures is too large for all proteins to be classified manually; automatic methods are therefore needed for fast evaluation of protein structures. We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and put into the same homologous superfamily. Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure-function/evolution relationships to both known and newly determined protein structures.

                Author and article information

                Contributors
                stephen.burley@rcsb.org
                Journal
                Protein Science : A Publication of the Protein Society (Protein Sci; PRO)
                10.1002/(ISSN)1469-896X
                Publisher: John Wiley and Sons Inc. (Hoboken)
                ISSN: 0961-8368, 1469-896X
                11 November 2017
                January 2018
                Volume: 27
                Issue: 1, Special Issue on Tools for Protein Science (doiID: 10.1002/pro.v27.1)
                Pages: 316-330
                Affiliations
                [1] Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854
                [2] Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey 08903
                [3] Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California 92093
                Author notes
                [*] Correspondence to: Stephen K. Burley, Director, Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854. E-mail: stephen.burley@rcsb.org
                Author information
                http://orcid.org/0000-0002-2487-9713
                http://orcid.org/0000-0001-8896-6878
                http://orcid.org/0000-0002-4149-1745
                Article
                Publisher ID: PRO3331
                DOI: 10.1002/pro.3331
                PMC ID: 5734314
                PubMed ID: 29067736
                dd30df77-952e-4782-b67d-47194c080ff4
                © 2017 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society

                This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 31 July 2017
                Revised: 20 October 2017
                Accepted: 23 October 2017
                Page count
                Figures: 8, Tables: 2, Pages: 15, Words: 9754
                Funding
                Funded by: National Science Foundation, the National Institutes of Health, and the Department of Energy
                Award ID: NSF‐DBI 1338415
                Categories
                Tools for Protein Science

                Biochemistry
                research collaboratory for structural bioinformatics, rcsb, protein data bank, pdb, worldwide protein data bank, wwpdb, pdbx/mmcif, chemical component dictionary, crystallography, nmr spectroscopy, 3d electron microscopy, integrative/hybrid methods, data archive, macromolecular structure, biocuration, validation, metadata, fair principles, open access, data deposition
