Replication and pathogenesis of the human immunodeficiency virus (HIV) are tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA.
High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further application of this technology will make possible newly informative analysis of any RNA in a cellular transcriptome.
The function of the RNA genome of the human immunodeficiency virus (HIV) is determined both by its sequence and by its ability to fold back on itself to form specific higher-order structures. In order to describe physical structures in a region of the HIV RNA genome known to play multiple, critical roles in viral replication and pathogenesis, we invent a high-throughput, quantitative, and comprehensive structure-mapping approach that locates flexible (unpaired) nucleotides within a folded RNA, assaying hundreds of nucleotides at a time. We find that the first 10% of the HIV-1 genome has a single predominant structure and that regulatory motifs have significantly greater structure than do protein-coding segments. The HIV genome interacts with numerous proteins, including multiple copies of the nucleocapsid protein. We directly map RNA–protein interactions inside virions and discover that the nucleocapsid protein interacts with viral RNA in at least three distinct ways, depending on the context within the overall genome structure. Further application of the high-throughput RNA-structure analysis tools described here will make it possible to address diverse structure–function relationships in intact cellular and viral RNAs.
Development of novel, quantitative, high-throughput RNA structure analysis tools allows the outline of structure-function relationships for the first 10% of an HIV genome, discovery of structural differences between regulatory and coding regions, and analysis of protein-RNA interactions inside authentic virions.