      Is Open Access

      An Atlas of the Thioredoxin Fold Class Reveals the Complexity of Function-Enabling Adaptations

      PLoS Computational Biology
      Public Library of Science



          The group of proteins that contain a thioredoxin (Trx) fold is huge and diverse. Assessment of the variation in catalytic machinery of Trx fold proteins is essential in providing a foundation for understanding their functional diversity and predicting the function of the many uncharacterized members of the class. The proteins of the Trx fold class retain common features—including variations on a dithiol CxxC active site motif—that lead to delivery of function. We use protein similarity networks to guide an analysis of how structural and sequence motifs track with catalytic function and taxonomic categories for 4,082 representative sequences spanning the known superfamilies of the Trx fold. Domain structure in the fold class is varied and modular, with 2.8% of sequences containing more than one Trx fold domain. Most member proteins are bacterial. The fold class exhibits many modifications to the CxxC active site motif—only 56.8% of proteins have both cysteines, and no functional groupings have absolute conservation of the expected catalytic motif. Only a small fraction of Trx fold sequences have been functionally characterized. This work provides a global view of the complex distribution of domains and catalytic machinery throughout the fold class, showing that each superfamily contains remnants of the CxxC active site. The unifying context provided by this work can guide the comparison of members of different Trx fold superfamilies to gain insight about their structure-function relationships, illustrated here with the thioredoxins and peroxiredoxins.
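
The motif bookkeeping described above (how many proteins keep both cysteines of the CxxC motif, and how many keep only one) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function names `find_cxxc`, `classify_motif` and `fraction_with_both` are hypothetical, the example motifs are invented, and the four-residue motif strings are assumed to have already been extracted from the aligned active-site positions by some upstream step.

```python
import re


def find_cxxc(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, window) for every C-x-x-C match, overlaps included."""
    # A lookahead lets overlapping windows such as CCACC be reported twice.
    pattern = re.compile(r"(?=(C[A-Z]{2}C))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


def classify_motif(motif: str) -> str:
    """Classify a four-residue active-site window by which cysteines it retains.

    Assumption: `motif` holds the residues found at the aligned CxxC positions.
    """
    if len(motif) != 4:
        raise ValueError("expected a four-residue motif")
    first_is_cys = motif[0].upper() == "C"
    last_is_cys = motif[3].upper() == "C"
    if first_is_cys and last_is_cys:
        return "both cysteines"
    if first_is_cys:
        return "N-terminal cysteine only"
    if last_is_cys:
        return "C-terminal cysteine only"
    return "no cysteine"


def fraction_with_both(motifs: list[str]) -> float:
    """Fraction of motifs that retain both cysteines (0.0 for an empty list)."""
    if not motifs:
        return 0.0
    both = sum(1 for m in motifs if classify_motif(m) == "both cysteines")
    return both / len(motifs)


if __name__ == "__main__":
    # Invented example motifs, not data from the article.
    toy_motifs = ["CGPC", "CPYC", "CGPS", "TPVC", "SGPS"]
    for motif in toy_motifs:
        print(motif, "->", classify_motif(motif))
    print("fraction with both cysteines:", fraction_with_both(toy_motifs))
```

Classifying pre-aligned windows rather than scanning raw sequences is a deliberate choice here: almost any sequence contains a stray cysteine, so a raw scan for single-cysteine variants would be meaningless without an alignment to fix the active-site position.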

          Author Summary

          For any large class of proteins, far more protein sequences are known than can be examined experimentally. This is the case with the thioredoxin fold class, a large and diverse collection of proteins, some of which are known to catalyze important steps in metabolism. Some others participate in key processes like protein folding and detoxification of foreign compounds. Many of the unstudied proteins likely participate in other important biological processes and have useful applications in medicine and industry. We used a new network-based computational approach to create similarity-based maps of the thioredoxin fold class. These maps juxtapose unstudied proteins with similar well-characterized proteins, helping to show where existing knowledge can help predict properties of uncharacterized sequences. This information can be used to identify which of these sequences are interesting and deserve experimental characterization. We also used the maps to gain insight about how shared structural features are used and modified to affect catalysis in the different subclasses, leading to a better understanding of the interplay between structure and function in the thioredoxin fold class.
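
The idea of a similarity-based map can be illustrated with a small Python sketch: sequences become nodes, pairs whose similarity passes a cutoff become edges, and an unstudied protein can then be looked up alongside its characterized neighbors. The summary does not say how similarity was scored or which cutoff was used, so the scores, the threshold, the protein labels and the function names below (`build_network`, `clusters`, `annotated_neighbors`) are all assumptions made for illustration.

```python
def build_network(scores: dict[tuple[str, str], float], threshold: float) -> dict[str, set[str]]:
    """Build an undirected graph with an edge wherever similarity >= threshold.

    Assumption: higher scores mean more similar sequences.
    """
    graph: dict[str, set[str]] = {}
    for (a, b), score in scores.items():
        # Register both nodes even if the pair falls below the threshold.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph: dict[str, set[str]]) -> list[list[str]]:
    """Return connected components, each sorted, ordered by their smallest member."""
    seen: set[str] = set()
    components: list[list[str]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        component: set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


def annotated_neighbors(graph: dict[str, set[str]], node: str, annotations: dict[str, str]) -> dict[str, str]:
    """Map each directly connected, characterized neighbor of `node` to its annotation."""
    return {n: annotations[n] for n in sorted(graph[node]) if n in annotations}


if __name__ == "__main__":
    # Invented pairwise scores; labels are placeholders, not real database entries.
    toy_scores = {
        ("trxA", "trxB"): 0.9,
        ("trxB", "unk1"): 0.7,
        ("prxA", "unk2"): 0.8,
        ("trxA", "prxA"): 0.2,
    }
    toy_annotations = {"trxA": "thioredoxin", "trxB": "thioredoxin", "prxA": "peroxiredoxin"}
    network = build_network(toy_scores, threshold=0.5)
    print(clusters(network))
    print(annotated_neighbors(network, "unk1", toy_annotations))
```

In this toy network the uncharacterized node `unk1` sits next to a characterized thioredoxin, which is the kind of juxtaposition the maps are meant to expose. Raising or lowering the threshold changes how finely the groups separate.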


                Author and article information

                Journal: PLoS Computational Biology (PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, USA)
                Published: 23 October 2009
                Volume: 5
                Issue: 10
                Article number: e1000541
                Author affiliations:
                [1] Graduate Program in Biological and Medical Informatics, University of California, San Francisco, California, United States of America
                [2] Institute for Quantitative Biosciences, University of California, San Francisco, California, United States of America
                [3] Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California, United States of America
                [4] Department of Pharmaceutical Chemistry, University of California, San Francisco, California, United States of America
                Editor affiliation: Fox Chase Cancer Center, United States of America
                Author notes

                Conceived and designed the experiments: HJA PCB. Performed the experiments: HJA. Analyzed the data: HJA. Wrote the paper: HJA PCB.

                This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
                Received: 15 June 2009
                Accepted: 21 September 2009
                Page count
                Pages: 17
                Research Article
                Biochemistry/Molecular Evolution

                Quantitative & Systems biology