Identification of the sequence determinants of protein N-terminal acetylation through a decision tree approach

Abstract

Background

N-terminal acetylation is one of the most common protein modifications in eukaryotes and occurs co-translationally when the N-terminus of the nascent polypeptide is still attached to the ribosome. This modification has been shown to be involved in a wide range of biological phenomena such as protein half-life regulation, protein-protein and protein-membrane interactions, and protein subcellular localization. Thus, accurately predicting which proteins receive an acetyl group based on their protein sequence is expected to facilitate the functional study of this modification. As the occurrence of N-terminal acetylation strongly depends on the context of protein sequences, attempts to understand the sequence determinants of N-terminal acetylation were made initially by simply examining the N-terminal sequences of many acetylated and unacetylated proteins and more recently by applying machine learning approaches. However, the sequence determinants of this modification are not yet completely understood.

Results

We obtained curated N-terminally acetylated and unacetylated sequences from the UniProt database and employed a decision tree algorithm to identify the sequence determinants of N-terminal acetylation for proteins whose initiator methionine (ⁱMet) residues have been removed. The results suggested that the main determinants of N-terminal acetylation are contained within the first five residues following ⁱMet and that the first and second positions are the most important discriminators of whether this modification occurs. The results also indicated the existence of position-specific preferred and inhibitory residues that determine the occurrence of N-terminal acetylation. The developed predictor software, termed NT-AcPredictor, accurately predicted N-terminal acetylation, with an overall performance comparable or superior to that of preceding predictors incorporating machine learning algorithms.
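
To make the idea of position-specific determinants concrete, the following is a minimal Python sketch of a single decision-tree split over the five residues that follow ⁱMet. It is not the NT-AcPredictor implementation: the function names, the information-gain criterion, and the six toy sequences with their labels are all assumptions chosen for illustration, and the toy data carry no biological meaning.

```python
from collections import Counter
from math import log2

WINDOW = 5  # residues following the initiator Met, as described in the Results


def n_terminal_window(sequence, window=WINDOW):
    """Return the residues that follow the initiator Met (positions 1..window)."""
    if not sequence.startswith("M"):
        raise ValueError("expected a sequence that starts with the initiator Met")
    return sequence[1:1 + window]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_split(windows, labels):
    """Find the test 'residue at position p == r' with the highest information gain.

    Returns (gain, position, residue); position is 1-based, counted after the Met.
    """
    base = entropy(labels)
    best = (0.0, None, None)
    for pos in range(min(len(w) for w in windows)):
        for residue in sorted({w[pos] for w in windows}):
            yes = [lab for w, lab in zip(windows, labels) if w[pos] == residue]
            no = [lab for w, lab in zip(windows, labels) if w[pos] != residue]
            remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
            gain = base - remainder
            if gain > best[0]:
                best = (gain, pos + 1, residue)
    return best


# Hypothetical toy examples (synthetic placeholders, NOT data from the article):
# each pair is (sequence, acetylated?).
TOY_EXAMPLES = [
    ("MSAEKL", True),
    ("MSDQRT", True),
    ("MSTKLV", True),
    ("MPAEKL", False),
    ("MPDQRT", False),
    ("MGTKLV", False),
]

toy_windows = [n_terminal_window(seq) for seq, _ in TOY_EXAMPLES]
toy_labels = [label for _, label in TOY_EXAMPLES]
toy_gain, toy_position, toy_residue = best_split(toy_windows, toy_labels)
```

A full decision tree would apply `best_split` recursively to the two subsets produced by each test. On the toy data the single best test falls on the first position after ⁱMet, which is only an artifact of how the placeholder sequences were constructed.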

Conclusion

Our machine learning approach based on a decision tree algorithm successfully provided several sequence determinants of N-terminal acetylation for proteins lacking ⁱMet, some of which have not previously been described. Although these sequence determinants remain insufficient to comprehensively predict the occurrence of this modification, indicating that further work on this topic is still required, the developed predictor, NT-AcPredictor, can be used to predict N-terminal acetylation with an accuracy of more than 80%.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-017-1699-4) contains supplementary material, which is available to authorized users.

Author and article information

Contributors

Kazunori D. Yamada: kyamada@ecei.tohoku.ac.jp

Masaru Miyagi: masaru.miyagi@case.edu

Journal

Journal ID (nlm-ta): BMC Bioinformatics

Journal ID (iso-abbrev): BMC Bioinformatics

Title: BMC Bioinformatics

Publisher: BioMed Central (London)

ISSN (Electronic): 1471-2105

Publication date (Electronic): 2 June 2017

Publication date PMC-release: 2 June 2017

Publication date Collection: 2017

Volume: 18

Electronic Location Identifier: 289

Affiliations

[1] ISNI 0000 0001 2248 6943, GRID grid.69566.3a, Graduate School of Information Sciences, Tohoku University, Sendai, 980-8579 Japan

[2] ISNI 0000 0001 2230 7538, GRID grid.208504.b, Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, 135-0064 Japan

[3] ISNI 0000 0001 2164 3847, GRID grid.67105.35, Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106 USA

[4] ISNI 0000 0001 2164 3847, GRID grid.67105.35, Department of Nutrition, Case Western Reserve University, Cleveland, OH 44106 USA

Article

Publisher ID: 1699

DOI: 10.1186/s12859-017-1699-4

PMC ID: 5457594

PubMed ID: 28578658

SO-VID: 949b26f7-adfa-4d97-84ec-96af57639b08

License:

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received : 27 January 2017

Date accepted : 18 May 2017

Funding

Funded by: FundRef http://dx.doi.org/10.13039/501100001700, Ministry of Education, Culture, Sports, Science and Technology

Award ID: Top Global University Project

Custom metadata

ScienceOpen disciplines: Bioinformatics & Computational biology

Keywords: N-terminal acetylation, N-terminal acetyltransferase, decision tree, sequence analysis, sequence context
