      mRNA/protein sequence complementarity and its determinants: The impact of affinity scales

      Research article
      PLoS Computational Biology
      Public Library of Science


          Abstract

          It has recently been demonstrated that the nucleobase-density profiles of mRNA coding sequences are related in a complementary manner to the nucleobase-affinity profiles of their cognate protein sequences. Based on this, it has been proposed that cognate mRNA/protein pairs may bind in a co-aligned manner, especially if unstructured. Here, we study the dependence of mRNA/protein sequence complementarity on the properties of the nucleobase/amino-acid affinity scales used. Specifically, we sample the space of randomly generated scales by employing a Monte Carlo strategy with a fitness function that depends directly on the level of complementarity. For model organisms representing all three domains of life, we show that even short searches reproducibly converge upon highly optimized scales, implying that the topology of the underlying fitness landscape is decidedly funnel-like. Furthermore, the optimized scales, generated without any consideration of the physicochemical attributes of nucleobases or amino acids, resemble closely the nucleobase/amino-acid binding affinity scales obtained from experimental structures of RNA-protein complexes. This provides support for the claim that mRNA/protein sequence complementarity may indeed be related to binding between the two. Finally, we characterize suboptimal scales and show that intermediate-to-high complementarity can be reached by substantially diverse scales, but with select amino acids contributing disproportionally. Our results expose the dependence of cognate mRNA/protein sequence complementarity on the properties of the underlying nucleobase/amino-acid affinity scales and provide quantitative constraints that any physical scales need to satisfy for the complementarity to hold.
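As a rough illustration of the search strategy described above, the following Python sketch runs a Monte Carlo search over randomly generated amino-acid scales, with a fitness equal to the mean correlation between mRNA nucleobase-density profiles and protein affinity profiles. It is not the authors' implementation. The choice of pyrimidine (C/U) density, the window size, Pearson correlation as the fitness, the single-value move set, the Metropolis acceptance rule, and the toy back-translated sequences (including the `CODON` table and all function names) are assumptions made purely for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy back-translation table: one standard codon per amino acid.
CODON = {"A": "GCU", "C": "UGC", "D": "GAC", "E": "GAG", "F": "UUC",
         "G": "GGC", "H": "CAC", "I": "AUC", "K": "AAG", "L": "CUG",
         "M": "AUG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
         "S": "UCC", "T": "ACC", "V": "GUG", "W": "UGG", "Y": "UAC"}


def smooth(values, window):
    """Sliding-window mean; returns len(values) - window + 1 points."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pyrimidine_density(mrna):
    """Fraction of C/U among the three nucleotides of each codon."""
    return [sum(nt in "CU" for nt in mrna[i:i + 3]) / 3
            for i in range(0, len(mrna) - 2, 3)]


def complementarity(scale, pairs, window=5):
    """Mean correlation between smoothed mRNA density and protein affinity profiles."""
    correlations = []
    for mrna, protein in pairs:
        if len(mrna) != 3 * len(protein) or len(protein) < window:
            raise ValueError("expected a coding sequence without stop codon, "
                             "at least `window` codons long")
        density = smooth(pyrimidine_density(mrna), window)
        affinity = smooth([scale[aa] for aa in protein], window)
        correlations.append(pearson(density, affinity))
    return sum(correlations) / len(correlations)


def monte_carlo_search(pairs, steps=500, temperature=0.01, seed=0):
    """Metropolis search over scales, starting from a random scale."""
    rng = random.Random(seed)
    scale = {aa: rng.random() for aa in AMINO_ACIDS}
    fitness = complementarity(scale, pairs)
    best_scale, best_fitness = dict(scale), fitness
    for _ in range(steps):
        aa = rng.choice(AMINO_ACIDS)          # perturb one amino acid's value
        old_value = scale[aa]
        scale[aa] = rng.random()
        new_fitness = complementarity(scale, pairs)
        delta = new_fitness - fitness
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            fitness = new_fitness             # accept the move
            if fitness > best_fitness:
                best_scale, best_fitness = dict(scale), fitness
        else:
            scale[aa] = old_value             # reject and restore
    return best_scale, best_fitness


def make_toy_pairs(n_pairs=5, length=60, seed=1):
    """Random proteins with mRNAs back-translated via the toy CODON table."""
    rng = random.Random(seed)
    proteins = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
                for _ in range(n_pairs)]
    return [("".join(CODON[aa] for aa in p), p) for p in proteins]


if __name__ == "__main__":
    toy_pairs = make_toy_pairs()
    _, best = monte_carlo_search(toy_pairs)
    print(f"best mean correlation found: {best:.3f}")
```

Because the toy mRNAs use a single codon per amino acid, a perfectly complementary scale exists by construction (each amino acid's value equal to the pyrimidine fraction of its codon), which makes the sketch easy to sanity-check. Real coding sequences have no such guarantee.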

          Author summary

          Messenger RNAs and proteins, two essential types of biopolymers, have recently been shown to exhibit closely related, complementary physicochemical properties. Specifically, density profiles of certain groups in messenger RNA sequences directly match the affinity profiles for precisely those groups in protein sequences they encode. Based on this, it has been suggested that these molecules may interact with each other specifically and in a co-aligned fashion, especially when unstructured. Here, we explore different amino-acid scales used in the above analysis to assess which of their properties dictate the observed matching. Specifically, we define the constraints that need to be satisfied by physical scales for the complementarity to hold and show that the previously derived nucleobase/amino-acid affinity scales indeed satisfy these constraints. As a whole, our work provides a quantitative foundation for understanding the putative messenger RNA/protein complementarity with implications in different areas of RNA/protein biology including transcription, translation, splicing and viral assembly.


                Author and article information

                Contributors
                Roles (first contributor): Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles (second contributor): Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing
                Role (third contributor): Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 27 July 2017
                Volume 13, issue 7, e1005648
                Affiliations
                Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Campus Vienna Biocenter 5, Vienna, Austria
                University of North Carolina at Chapel Hill, UNITED STATES
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0003-3814-3675
                Article
                Manuscript ID: PCOMPBIOL-D-17-00466
                DOI: 10.1371/journal.pcbi.1005648
                PMCID: PMC5549747
                PMID: 28750009
                84545ff2-41af-49d2-a30a-30abe163cc75
                © 2017 Bartonek, Zagrovic

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 24 March 2017
                Accepted: 26 June 2017
                Page count
                Figures: 5, Tables: 1, Pages: 16
                Funding
                Funded by: European Research Council (funder ID http://dx.doi.org/10.13039/501100000781)
                Award ID: 279408
                This work was supported by the European Research Council to BZ (Starting Independent grant PROTINT 279408), https://erc.europa.eu/. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and life sciences > Biochemistry > Nucleotides > Nucleobases
                Biology and life sciences > Biochemistry > Nucleic acids > RNA > Messenger RNA
                Research and analysis methods > Database and informatics methods > Bioinformatics > Sequence analysis > Amino acid sequence analysis
                Research and analysis methods > Mathematical and statistical techniques > Statistical methods > Monte Carlo method
                Physical sciences > Mathematics > Statistics (mathematics) > Statistical methods > Monte Carlo method
                Research and analysis methods > Database and informatics methods > Biological databases > Sequence databases
                Research and analysis methods > Database and informatics methods > Bioinformatics > Sequence analysis > Sequence databases
                Biology and life sciences > Biochemistry > Biochemical simulations
                Biology and life sciences > Computational biology > Biochemical simulations
                Biology and life sciences > Molecular biology > Macromolecular structure analysis > RNA structure
                Biology and life sciences > Biochemistry > Nucleic acids > RNA > RNA structure
                Biology and life sciences > Biochemistry > Proteins > Protein interactions
                Version of record update to uncorrected proof: 2017-08-08
                Data availability: All relevant data are within the paper and its Supporting Information files.

                Quantitative & Systems biology