Abstract
Computational methods for automated genome annotation are critical to understanding
and interpreting the bewildering mass of genomic sequence data presently being generated
and released. A neural network model of the structural and compositional properties
of a eukaryotic core promoter region has been developed, and its application to the analysis
of the Drosophila melanogaster genome is presented. The model uses a time-delay architecture,
a special case of a feed-forward neural network. The structure of this model allows
for variable spacing between functional binding sites, which is known to play a key
role in the transcription initiation process. Application of this model to a test
set of core promoters not only gave better discrimination of potential promoter sites
than previous statistical or neural network models, but also indirectly revealed subtle
properties of the transcription initiation signal. When tested on the 2.9-Mbase Adh
region of the Drosophila genome, the neural network for promoter prediction (NNPP)
program, which incorporates the time-delay neural network model, gives a recognition
rate of 75% (69/92) with a false positive rate of one per 547 bases. The present work can
be regarded as one of the first intensive studies that apply novel gene regulation
technologies to the identification of the complex gene regulation sites in the genome
of Drosophila melanogaster.
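
The time-delay idea described above can be illustrated with a minimal sketch. It is not the NNPP implementation: the function names, the hand-set weights, the bias, and the two toy sequences are all hypothetical, chosen only to show how a single unit whose weights are shared across every window position responds to the same motif regardless of where it occurs. That position independence is what lets a network of such units tolerate variable spacing between binding sites.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element 0/1 vectors (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def time_delay_unit(encoded, weights, bias):
    """Slide one shared weight window along the sequence.

    The same weights are reused at every offset, so the unit responds to
    a motif wherever it occurs. Returns one sigmoid activation per
    window start position.
    """
    width = len(weights)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel in range(4):
                total += weights[offset][channel] * encoded[start + offset][channel]
        activations.append(1.0 / (1.0 + math.exp(-total)))
    return activations

def motif_weights(motif, match=2.0, mismatch=-2.0):
    """Hypothetical hand-set weights that reward an exact motif match.

    A trained network would learn these values; they are fixed here
    only for illustration.
    """
    return [[match if b == base else mismatch for b in BASES] for base in motif]

# Assumed toy detector for a TATA-like motif (not trained weights).
tata = motif_weights("TATAAA")
bias = -9.0  # the unit only fires strongly when all six positions match

seq_a = "GGCTATAAAGGCCGCAGT"
seq_b = "GGCGGCTATAAAGCAGTC"  # same motif, shifted three bases downstream

for seq in (seq_a, seq_b):
    acts = time_delay_unit(one_hot(seq), tata, bias)
    best = max(range(len(acts)), key=acts.__getitem__)
    print(seq, "best window start:", best, "activation: %.3f" % acts[best])
```

In this toy setting the unit peaks at window start 3 in the first sequence and at 6 in the second, with identical activation in both cases. A full model would stack several such units and combine their outputs over a range of positions, which is omitted here.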