      Is Open Access

      An estimated 5% of new protein structures solved today represent a new Pfam family



          This study uses the Pfam database to show that the sequence redundancy of protein structures deposited in the PDB is increasing. The possible reasons behind this trend are discussed.


          High-resolution structural knowledge is key to understanding how proteins function at the molecular level. The number of entries in the Protein Data Bank (PDB), the repository of all publicly available protein structures, continues to increase, with more than 8000 structures released in 2012 alone. The authors of this article have studied how structural coverage of the protein-sequence space has changed over time by monitoring the number of Pfam families that acquired their first representative structure each year from 1976 to 2012. Twenty years ago, for every 100 new PDB entries released, an estimated 20 Pfam families acquired their first structure. By 2012, this decreased to only about five families per 100 structures. The reasons behind the slower pace at which previously uncharacterized families are being structurally covered were investigated. It was found that although more than 50% of current Pfam families are still without a structural representative, this set is enriched in families that are small, functionally uncharacterized or rich in problem features such as intrinsically disordered and transmembrane regions. While these are important constraints, the reasons why it may not yet be time to give up the pursuit of a targeted but more comprehensive structural coverage of the protein-sequence space are discussed.
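          The per-year measure described in the abstract (families acquiring their first structure per 100 new PDB entries) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input layout and the family labels below are hypothetical, and the toy data are invented solely to show the arithmetic.

```python
from collections import Counter

def first_structure_rate(entries):
    """Return {year: families gaining their first structure per 100 entries}.

    `entries` is an iterable of (release_year, set_of_family_ids) pairs,
    one pair per released structure. This layout is an assumption made
    for illustration, not a real PDB or Pfam file format.
    """
    entries_per_year = Counter()
    first_year = {}  # family id -> earliest release year seen
    for year, families in entries:
        entries_per_year[year] += 1
        for fam in families:
            if fam not in first_year or year < first_year[fam]:
                first_year[fam] = year
    # Count how many families had their first structure in each year.
    new_per_year = Counter(first_year.values())
    return {
        year: 100.0 * new_per_year[year] / count
        for year, count in sorted(entries_per_year.items())
    }

# Invented toy data; FAM_A..FAM_D are placeholder labels, not Pfam accessions.
toy_entries = [
    (1992, {"FAM_A"}), (1992, {"FAM_B"}), (1992, {"FAM_A"}),
    (1992, set()), (1992, {"FAM_C"}),
    (2012, {"FAM_A"}), (2012, {"FAM_B"}), (2012, {"FAM_D"}), (2012, {"FAM_C"}),
]
rates = first_structure_rate(toy_entries)
```

          In the toy data, three families first appear among five entries in 1992 and one family first appears among four entries in 2012, so the rate falls even though structures keep being released. A real analysis would additionally need a mapping from PDB entries to Pfam families, which is outside this sketch.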


          Author and article information

          Acta Crystallographica Section D: Biological Crystallography (Acta Cryst. D)
          International Union of Crystallography
          01 November 2013
          12 October 2013
          Volume: 69
          Issue: Pt 11 (publisher ID: d131100)
          Pages: 2186-2193
          [a] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, England
          [b] Department of Bioinformatics and Computational Biology, Fakultät für Informatik, Technical University Munich, Garching, Germany
          [c] New York Consortium on Membrane Protein Structure, New York Structural Biology Center, 89 Convent Avenue, New York, NY 10027, USA
          [d] Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, England
          Author notes
          Correspondence e-mail: mpunta@
          ba5211 ABCRE6 S0907444913027157
          © Mistry et al. 2013

          This is an open-access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.

          Molecular replacements
          Research Papers

          Microscopy & Imaging

          protein-sequence space, structural coverage, Pfam families

