Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach


Abstract

We present a new phylogenetic approach, selection on amino acids and codons (SelAC), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models that assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost–benefit approach. This cost–benefit approach allows us to generate a set of 20 optimal amino acid-specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast data set of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models, by 10⁴–10⁵ Akaike information criterion units adjusted for small-sample bias. Our results also indicate that nested, mechanistic models better predict observed data patterns, highlighting the improvement in biological realism in amino acid sequence evolution that our model provides. Additional parameters estimated by SelAC indicate that a large amount of nonphylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC predictions of gene-specific protein synthesis rates correlate well with both empirical (r = 0.33–0.48) and other theoretical predictions (r = 0.45–0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
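The model comparison in the abstract is stated in units of the Akaike information criterion adjusted for small-sample bias (AICc). As a minimal sketch of how such a difference is computed, assuming the standard AICc formula and purely hypothetical log-likelihoods, parameter counts, and sample sizes (none of these numbers come from the article, and the function names are illustrative):

```python
def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Standard AICc: AIC plus the small-sample correction term."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    correction = (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction


def delta_aicc(candidate: float, best: float) -> float:
    """Difference in AICc units; positive means the candidate fits worse."""
    return candidate - best


# Hypothetical values for illustration only (not results from the article).
model_a = aicc(log_likelihood=-1000.0, n_params=10, n_obs=500)
model_b = aicc(log_likelihood=-1200.0, n_params=5, n_obs=500)
print(delta_aicc(model_b, model_a))
```

A lower AICc indicates a better trade-off between fit and parameter count, so a gap of 10⁴–10⁵ units of the kind reported in the abstract would correspond to a very large difference in support between models.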


Most cited references 63


A codon-based model of nucleotide substitution for protein-coding DNA sequences.

(1994)

A codon-based model for the evolution of protein-coding DNA sequences is presented for use in phylogenetic estimation. A Markov process is used to describe substitutions between codons. Transition/transversion rate bias and codon usage bias are allowed in the model, and selective restraints at the protein level are accommodated using physicochemical distances between the amino acids coded for by the codons. Analyses of two data sets suggest that the new codon-based model can provide a better fit to data than can nucleotide-based models and can produce more reliable estimates of certain biologically important measures such as the transition/transversion rate ratio and the synonymous/nonsynonymous substitution rate ratio.


Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene.

R Nielsen, Z. Yang (1998)

Several codon-based models for the evolution of protein-coding DNA sequences are developed that account for varying selection intensity among amino acid sites. The "neutral model" assumes two categories of sites at which amino acid replacements are either neutral or deleterious. The "positive-selection model" assumes an additional category of positively selected sites at which nonsynonymous substitutions occur at a higher rate than synonymous ones. This model is also used to identify target sites for positive selection. The models are applied to a data set of the V3 region of the HIV-1 envelope gene, sequenced at different years after the infection of one patient. The results provide strong support for variable selection intensity among amino acid sites. The neutral model is rejected in favor of the positive-selection model, indicating the operation of positive selection in the region. Positively selected sites are found in both the V3 region and the flanking regions.


Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution.

D. Drummond, Claus Wilke (2008)

Strikingly consistent correlations between rates of coding-sequence evolution and gene expression levels are apparent across taxa, but the biological causes behind the selective pressures on coding-sequence evolution remain controversial. Here, we demonstrate conserved patterns of simple covariation between sequence evolution, codon usage, and mRNA level in E. coli, yeast, worm, fly, mouse, and human that suggest that all observed trends stem largely from a unified underlying selective pressure. In metazoans, these trends are strongest in tissues composed of neurons, whose structure and lifetime confer extreme sensitivity to protein misfolding. We propose, and demonstrate using a molecular-level evolutionary simulation, that selection against toxicity of misfolded proteins generated by ribosome errors suffices to create all of the observed covariation. The mechanistic model of molecular evolution that emerges yields testable biochemical predictions, calls into question the use of nonsynonymous-to-synonymous substitution ratios (Ka/Ks) to detect functional selection, and suggests how mistranslation may contribute to neurodegenerative disease.


Author and article information

Contributors

Tal Pupko: Role: Associate Editor

Journal

Journal ID (nlm-ta): Mol Biol Evol

Journal ID (iso-abbrev): Mol. Biol. Evol

Journal ID (publisher-id): molbev

Title: Molecular Biology and Evolution

Publisher: Oxford University Press

ISSN (Print): 0737-4038

ISSN (Electronic): 1537-1719

Publication date (Print): April 2019

Publication date (Electronic): 05 December 2018

Publication date PMC-release: 05 December 2018

Volume: 36

Issue: 4

Pages: 834-851

Affiliations

[1] Department of Biological Sciences, University of Arkansas, Fayetteville, AR

[2] Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville, TN

[3] National Institute for Mathematical and Biological Synthesis, Knoxville, TN

[4] Department of Business Analytics & Statistics, Knoxville, TN

[5] Suite 1039, White Plains, NY

Author notes

Corresponding author: E-mail: mikeg@utk.edu.

Article

Publisher ID: msy222

DOI: 10.1093/molbev/msy222

PMC ID: 6445302

PubMed ID: 30521036

SO-VID: 56fbdeef-5f13-4e3d-a8ba-42092e406b67

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.


Page count

Pages: 18

Funding

Funded by: NSF 10.13039/100000001

Award ID: MCB-1120370

Award ID: MCB-1546402

Award ID: DEB-1355033

Funded by: The University of Tennessee Knoxville and University of Arkansas

Funded by: National Institute for Mathematical and Biological Synthesis 10.13039/100008947

Funded by: National Science Foundation 10.13039/100000001

Award ID: DBI-1300426
