Timothy T. Perkins 1, Robert A. Kingsley 1, Maria C. Fookes 1, Paul P. Gardner 1, Keith D. James 1, Lu Yu 1, Samuel A. Assefa 1, Miao He 1, Nicholas J. Croucher 1, Derek J. Pickard 1, Duncan J. Maskell 2, Julian Parkhill 1, Jyoti Choudhary 1, Nicholas R. Thomson 1, Gordon Dougan 1,*
17 July 2009
High-density, strand-specific cDNA sequencing (ssRNA–seq) was used to analyze the transcriptome of Salmonella enterica serovar Typhi (S. Typhi). By mapping sequence data to the entire S. Typhi genome, we analyzed the transcriptome in a strand-specific manner and further defined transcribed regions encoded within prophages, pseudogenes, previously unannotated regions, and 3′- or 5′-untranslated regions (UTRs). An additional 40 novel candidate non-coding RNAs were identified beyond those previously annotated. Proteomic analysis was combined with transcriptome data to confirm and refine the annotation of a number of hypothetical genes. ssRNA–seq was also combined with microarray and proteome analysis to further define the S. Typhi OmpR regulon and identify novel OmpR-regulated transcripts. Thus, ssRNA–seq provides a novel and powerful approach to the characterization of the bacterial transcriptome.
We have applied a novel, strand-specific variation of RNA–seq (ssRNA–seq) to the analysis of the prokaryotic enteric pathogen Salmonella enterica serovar Typhi (S. Typhi), the causative agent of typhoid fever. Strand-specific data facilitated an analysis of RNA transcription across the whole genome at base-pair resolution. Using this technique, we were able to resolve overlapping transcripts of many genes, identify novel small RNAs, improve the accuracy of annotation, verify operon structure, and identify both transcriptionally active and inactive regions. We validated the ssRNA–seq data by comparing them with standard RT–PCR and microarray results. ssRNA–seq was used to redefine the OmpR regulon that contributes to the pathogenicity of S. Typhi, identifying several novel OmpR-regulated genes and operons. Finally, we have linked the ssRNA–seq data to the proteome and have provided simple open-access informatics tools to ease interrogation of the data.