
      Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach

      research-article


          Abstract

          We present a new phylogenetic approach, selection on amino acids and codons (SelAC), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models that assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost–benefit approach. This cost–benefit approach allows us to generate a set of 20 optimal amino acid-specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast data set of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models by 10^4–10^5 Akaike information criterion units adjusted for small-sample bias. Our results also indicate that nested, mechanistic models better predict observed data patterns, highlighting the improvement in biological realism in amino acid sequence evolution that our model provides. Additional parameters estimated by SelAC indicate that a large amount of nonphylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC predictions of gene-specific protein synthesis rates correlate well with both empirical (r = 0.33–0.48) and other theoretical predictions (r = 0.45–0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
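
          The abstract compares models in units of the Akaike information criterion adjusted for small sample bias. As a minimal sketch of how such a comparison is usually computed, the Python below applies the standard small-sample correction, AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1). The function names and the numbers in the usage lines are hypothetical illustrations. They are not values from the article, and this is not the authors' code.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Akaike information criterion with the small-sample correction.

    log_likelihood: maximized log-likelihood of the model (ln L)
    k: number of free parameters
    n: number of observations (how n is counted for sequence data is a
       modeling choice; it is left to the caller here)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def delta_aicc(candidate: float, reference: float) -> float:
    """Difference in AICc; negative values favor the candidate model."""
    return candidate - reference


# Hypothetical numbers for illustration only (not taken from the article).
simple_model = aicc(log_likelihood=-52000.0, k=10, n=5000)
richer_model = aicc(log_likelihood=-46000.0, k=60, n=5000)
print(delta_aicc(richer_model, simple_model))
```

          A lower AICc indicates a better trade-off between fit and parameter count, so a richer model is preferred only when its gain in log-likelihood outweighs the penalty for its additional parameters.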


          Most cited references (63)


          A codon-based model of nucleotide substitution for protein-coding DNA sequences.

          (1994)
          A codon-based model for the evolution of protein-coding DNA sequences is presented for use in phylogenetic estimation. A Markov process is used to describe substitutions between codons. Transition/transversion rate bias and codon usage bias are allowed in the model, and selective restraints at the protein level are accommodated using physicochemical distances between the amino acids coded for by the codons. Analyses of two data sets suggest that the new codon-based model can provide a better fit to data than can nucleotide-based models and can produce more reliable estimates of certain biologically important measures such as the transition/transversion rate ratio and the synonymous/nonsynonymous substitution rate ratio.

            Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene.

            Several codon-based models for the evolution of protein-coding DNA sequences are developed that account for varying selection intensity among amino acid sites. The "neutral model" assumes two categories of sites at which amino acid replacements are either neutral or deleterious. The "positive-selection model" assumes an additional category of positively selected sites at which nonsynonymous substitutions occur at a higher rate than synonymous ones. This model is also used to identify target sites for positive selection. The models are applied to a data set of the V3 region of the HIV-1 envelope gene, sequenced at different years after the infection of one patient. The results provide strong support for variable selection intensity among amino acid sites. The neutral model is rejected in favor of the positive-selection model, indicating the operation of positive selection in the region. Positively selected sites are found in both the V3 region and the flanking regions.

              Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution.

              Strikingly consistent correlations between rates of coding-sequence evolution and gene expression levels are apparent across taxa, but the biological causes behind the selective pressures on coding-sequence evolution remain controversial. Here, we demonstrate conserved patterns of simple covariation between sequence evolution, codon usage, and mRNA level in E. coli, yeast, worm, fly, mouse, and human that suggest that all observed trends stem largely from a unified underlying selective pressure. In metazoans, these trends are strongest in tissues composed of neurons, whose structure and lifetime confer extreme sensitivity to protein misfolding. We propose, and demonstrate using a molecular-level evolutionary simulation, that selection against toxicity of misfolded proteins generated by ribosome errors suffices to create all of the observed covariation. The mechanistic model of molecular evolution that emerges yields testable biochemical predictions, calls into question the use of nonsynonymous-to-synonymous substitution ratios (Ka/Ks) to detect functional selection, and suggests how mistranslation may contribute to neurodegenerative disease.

                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Molecular Biology and Evolution (Mol. Biol. Evol.)
                Oxford University Press
                0737-4038
                1537-1719
                April 2019
                05 December 2018
                05 December 2018
                Volume: 36
                Issue: 4
                Pages: 834-851
                Affiliations
                [1] Department of Biological Sciences, University of Arkansas, Fayetteville, AR
                [2] Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville, TN
                [3] National Institute for Mathematical and Biological Synthesis, Knoxville, TN
                [4] Department of Business Analytics & Statistics, Knoxville, TN
                [5] Suite 1039, White Plains, NY
                Author notes
                Corresponding author: E-mail: mikeg@utk.edu.
                Article
                msy222
                10.1093/molbev/msy222
                6445302
                30521036
                56fbdeef-5f13-4e3d-a8ba-42092e406b67
                © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                Page count
                Pages: 18
                Funding
                Funded by: NSF 10.13039/100000001
                Award ID: MCB-1120370
                Award ID: MCB-1546402
                Award ID: DEB-1355033
                Funded by: The University of Tennessee Knoxville and University of Arkansas
                Funded by: National Institute for Mathematical and Biological Synthesis 10.13039/100008947
                Funded by: National Science Foundation 10.13039/100000001
                Award ID: DBI-1300426
                Categories
                Methods

                Molecular biology
                Wright–Fisher, stabilizing selection, allele substitution, protein function, gene expression
