Abstract
<p class="first" id="d3680807e69">We present a generative long short-term memory (LSTM)
recurrent neural network (RNN)
for combinatorial de novo peptide design. RNN models capture patterns in sequential
data and generate new data instances from the learned context. Amino acid sequences
represent a suitable input for these machine-learning models. Generative models trained
on peptide sequences could therefore facilitate the design of bespoke peptide libraries.
We trained RNNs with LSTM units to recognize patterns in helical antimicrobial peptides
and used the resulting model for de novo sequence generation. Of the generated sequences,
82% were predicted to be active antimicrobial peptides, compared with 65% of randomly
sampled sequences with the same amino acid distribution as the training set. The generated
sequences also lie closer to the training data than manually designed amphipathic
helices. The results of this study showcase the ability of LSTM RNNs to construct
new amino acid sequences within the applicability domain of the model and motivate
their prospective application to peptide and protein design without the need for the
exhaustive enumeration of sequence libraries.
</p>
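The random baseline mentioned in the abstract (sequences sampled with the same amino acid distribution as the training set) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy sequences, and the choice to draw each sequence length from the training-set lengths are assumptions made only for illustration.

```python
import random
from collections import Counter


def amino_acid_distribution(sequences):
    """Return the relative frequency of each residue across all sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())}


def sample_random_peptides(sequences, n_samples, seed=0):
    """Sample peptides residue by residue from the training-set composition.

    Assumption: each sampled length is drawn from the lengths observed in
    the training set; the abstract does not specify how lengths were chosen.
    """
    rng = random.Random(seed)
    dist = amino_acid_distribution(sequences)
    residues = list(dist)
    weights = [dist[aa] for aa in residues]
    lengths = [len(s) for s in sequences]
    samples = []
    for _ in range(n_samples):
        length = rng.choice(lengths)
        # Residues are drawn independently, so no positional pattern is kept.
        samples.append("".join(rng.choices(residues, weights=weights, k=length)))
    return samples


# Toy strings for illustration only, not taken from the study.
toy_training_set = ["GLFDIVKKVVGALG", "KWKLFKKIEKVGQ", "GIGKFLHSAKKFGKA"]
baseline = sample_random_peptides(toy_training_set, n_samples=5)
for peptide in baseline:
    print(peptide)
```

Because every residue is drawn independently, such a baseline matches the overall composition of the training set but ignores sequence order, which is the information a generative sequence model is meant to capture.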