      Identification of the sequence determinants of protein N-terminal acetylation through a decision tree approach


          Abstract

          Background

          N-terminal acetylation is one of the most common protein modifications in eukaryotes and occurs co-translationally when the N-terminus of the nascent polypeptide is still attached to the ribosome. This modification has been shown to be involved in a wide range of biological phenomena such as protein half-life regulation, protein-protein and protein-membrane interactions, and protein subcellular localization. Thus, accurately predicting which proteins receive an acetyl group based on their protein sequence is expected to facilitate the functional study of this modification. As the occurrence of N-terminal acetylation strongly depends on the context of protein sequences, attempts to understand the sequence determinants of N-terminal acetylation were conducted initially by simply examining the N-terminal sequences of many acetylated and unacetylated proteins and more recently by machine learning approaches. However, a complete understanding of the sequence determinants of this modification remains to be elucidated.

          Results

          We obtained curated N-terminally acetylated and unacetylated sequences from the UniProt database and employed a decision tree algorithm to identify the sequence determinants of N-terminal acetylation for proteins whose initiator methionine (iMet) residues have been removed. The results suggested that the main determinants of N-terminal acetylation are contained within the first five residues following iMet and that the first and second positions are the most important discriminators for the occurrence of this phenomenon. The results also indicated the existence of position-specific preferred and inhibitory residues that determine the occurrence of N-terminal acetylation. The developed predictor software, termed NT-AcPredictor, accurately predicted N-terminal acetylation, with an overall performance comparable or superior to that of preceding predictors incorporating machine learning algorithms.
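To illustrate how a decision tree can rank sequence positions by their discriminating power, the following minimal sketch scores each of the first five residues after iMet by information gain, a split criterion commonly used in decision tree learning. It is not the NT-AcPredictor implementation: the abstract does not state which criterion or software was used, the function names are hypothetical, and the eight toy sequences and labels are invented for illustration and carry no biological meaning.

```python
import math
from collections import Counter, defaultdict

# Invented toy data (NOT from UniProt): five residues following iMet,
# paired with a made-up label (True = acetylated, False = unacetylated).
TOY_SEQS = ["SEAKL", "SDGKL", "AEGRL", "AKAKL",
            "PEAKL", "PDGRL", "KEGKL", "KKARL"]
TOY_LABELS = [True, True, True, True, False, False, False, False]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(sequences, labels, position):
    """Entropy reduction obtained by splitting on the residue at `position`."""
    groups = defaultdict(list)
    for seq, label in zip(sequences, labels):
        groups[seq[position]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_positions(sequences, labels, window=5):
    """Return 0-based positions sorted from most to least informative."""
    gains = [(information_gain(sequences, labels, p), p) for p in range(window)]
    return [p for _, p in sorted(gains, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    for pos in rank_positions(TOY_SEQS, TOY_LABELS):
        gain = information_gain(TOY_SEQS, TOY_LABELS, pos)
        print(f"position {pos + 1}: gain = {gain:.3f}")
```

A decision tree learner would split on the top-ranked position first and then repeat the procedure within each resulting subset, which is how position-specific preferred and inhibitory residues can emerge as rules.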

          Conclusion

          Our machine learning approach based on a decision tree algorithm successfully provided several sequence determinants of N-terminal acetylation for proteins lacking iMet, some of which have not previously been described. Although these sequence determinants remain insufficient to comprehensively predict the occurrence of this modification, indicating that further work on this topic is still required, the developed predictor, NT-AcPredictor, can be used to predict N-terminal acetylation with an accuracy of more than 80%.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12859-017-1699-4) contains supplementary material, which is available to authorized users.


                Author and article information

                Contributors
                kyamada@ecei.tohoku.ac.jp
                masaru.miyagi@case.edu
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                2 June 2017
                Volume: 18
                Article number: 289
                Affiliations
                [1] ISNI 0000 0001 2248 6943, GRID grid.69566.3a, Graduate School of Information Sciences, Tohoku University, Sendai, 980-8579 Japan
                [2] ISNI 0000 0001 2230 7538, GRID grid.208504.b, Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, 135-0064 Japan
                [3] ISNI 0000 0001 2164 3847, GRID grid.67105.35, Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106 USA
                [4] ISNI 0000 0001 2164 3847, GRID grid.67105.35, Department of Nutrition, Case Western Reserve University, Cleveland, OH 44106 USA
                Article
                Publisher ID: 1699
                DOI: 10.1186/s12859-017-1699-4
                PMCID: 5457594
                PMID: 28578658
                949b26f7-adfa-4d97-84ec-96af57639b08
                © The Author(s) 2017

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 27 January 2017
                Accepted: 18 May 2017
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100001700, Ministry of Education, Culture, Sports, Science and Technology
                Award ID: Top Global University Project
                Categories
                Research Article

                Bioinformatics & Computational biology
                N-terminal acetylation, N-terminal acetyltransferase, decision tree, sequence analysis, sequence context
