      Is Open Access

      Evolution of Protein Domain Repeats in Metazoa

      research-article
      Molecular Biology and Evolution
      Oxford University Press


          Abstract

          Repeats are ubiquitous elements of proteins, and they play important roles in cellular function and during evolution. Repeats are, however, also notoriously difficult to capture computationally, and large-scale studies have so far had difficulty linking the genetic causes, structural properties, and evolutionary trajectories of protein repeats. Here we apply recently developed methods for repeat detection and analysis to a large dataset comprising over a hundred metazoan genomes. We find that repeats in larger protein families generally experience very few insertions or deletions (indels) of repeat units, but there is also a significant fraction of noteworthy volatile outliers with very high indel rates. Analysis of structural data indicates that repeats with an open structure and independently folding units are more volatile and more likely to be intrinsically disordered. Such disordered repeats are also significantly enriched in sites with a high functional potential, such as linear motifs. Furthermore, the most volatile repeats have a high sequence similarity between their units. Since many volatile repeats also show signs of recombination, we conclude that they are often shaped by concerted evolution. Intriguingly, many of these conserved yet volatile repeats are involved in host-pathogen interactions, where they might foster fast but subtle adaptation in biological arms races.

          Key Words: protein evolution, domain rearrangements, protein repeats, concerted evolution.


          Most cited references (64)


          Protein homology detection by HMM-HMM comparison.

          Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution. We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER and the profile-profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%. Sensitivity: When the predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile-profile comparison methods is attributable to the use of profile HMMs in place of simple profiles. Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7 and 3.3 times more good alignments ('balanced' score >0.3) than the next best method (COMPASS), and 1.6, 2.9 and 9.4 times more than PSI-BLAST, at the family, superfamily and fold level, respectively. Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 2 GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than COMPASS.

            The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins.

            The structural stability of a protein requires a large number of interresidue interactions. The energetic contribution of these can be approximated by low-resolution force fields extracted from known structures, based on observed amino acid pairing frequencies. The summation of such energies, however, cannot be carried out for proteins whose structure is not known or for intrinsically unstructured proteins. To overcome these limitations, we present a novel method for estimating the total pairwise interaction energy, based on a quadratic form in the amino acid composition of the protein. This approach is validated by the good correlation of the estimated and actual energies of proteins of known structure and by a clear separation of folded and disordered proteins in the energy space it defines. As the novel algorithm has not been trained on unstructured proteins, it substantiates the concept of protein disorder, i.e. that the inability to form a well-defined 3D structure is an intrinsic property of many proteins and protein domains. This property is encoded in their sequence, because their biased amino acid composition does not allow sufficient stabilizing interactions to form. By limiting the calculation to a predefined sequential neighborhood, the algorithm was turned into a position-specific scoring scheme that characterizes the tendency of a given amino acid to fall into an ordered or disordered region. This application we term IUPred and compare its performance with three generally accepted predictors, PONDR VL3H, DISOPRED2 and GlobPlot on a database of disordered proteins.

              WD40 proteins propel cellular networks.

              Recent findings indicate that WD40 domains play central roles in biological processes by acting as hubs in cellular networks; however, they have been studied less intensely than other common domains, such as the kinase, PDZ or SH3 domains. As suggested by various interactome studies, they are among the most promiscuous interactors. Structural studies suggest that this property stems from their ability, as scaffolds, to interact with diverse proteins, peptides or nucleic acids using multiple surfaces or modes of interaction. A general scaffolding role is supported by the fact that no WD40 domain has been found with intrinsic enzymatic activity despite often being part of large molecular machines. We discuss the WD40 domain distributions in protein networks and structures of WD40-containing assemblies to demonstrate their versatility in mediating critical cellular functions. Copyright © 2010 Elsevier Ltd. All rights reserved.

                Author and article information

                Journal
                Molecular Biology and Evolution (Mol. Biol. Evol.)
                Publisher: Oxford University Press
                ISSN: 0737-4038, 1537-1719
                Dates: December 2016; 26 September 2016
                Volume: 33
                Issue: 12
                Pages: 3170-3182
                Affiliations
                Institute for Evolution and Biodiversity, Westfalian Wilhelms University, Huefferstrasse 1, Muenster, Germany
                Author notes
                * Corresponding author: E-mail: ebb@wwu.de.

                Associate editor: Jeffery Thorne

                Article
                Publisher ID: msw194
                DOI: 10.1093/molbev/msw194
                PMCID: PMC5100051
                PMID: 27671125
                31851db8-3291-4ae5-bf0a-873c0afbda84
                © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                Page count
                Pages: 13
                Categories
                Discoveries

                Molecular biology
