Genome-wide association studies (GWAS) often identify disease-associated mutations in intergenic and non-coding regions of the genome. Given the high percentage of the human genome that is transcribed, we postulate that for some observed associations the disease phenotype is caused by a structural rearrangement in a regulatory region of the RNA transcript. To identify such mutations, we have performed a genome-wide analysis of all known disease-associated single nucleotide polymorphisms (SNPs) from the Human Gene Mutation Database (HGMD) that map to the untranslated regions (UTRs) of a gene. Rather than using minimum free energy approaches (e.g., mFold), we use a partition function calculation that takes into consideration the ensemble of possible RNA conformations for a given sequence. We identified disease-associated SNPs in the human genome that significantly alter the global conformation of the UTR to which they map. For six disease states (Hyperferritinemia Cataract Syndrome, β-Thalassemia, Cartilage-Hair Hypoplasia, Retinoblastoma, Chronic Obstructive Pulmonary Disease (COPD), and Hypertension), we identified multiple SNPs in UTRs that alter the mRNA structural ensemble of the associated genes. Using a Boltzmann sampling procedure for sub-optimal RNA structures, we are able to characterize and visualize the nature of the conformational changes that the disease-associated mutations induce in the structural ensemble. In several cases (specifically the 5′ UTRs of FTL and RB1), we observe SNP-induced conformational changes analogous to those observed in bacterial regulatory riboswitches when specific ligands bind. We propose that the UTR and SNP combinations we identify constitute a “RiboSNitch,” that is, a regulatory RNA in which a specific SNP has a structural consequence that results in a disease phenotype. Our SNPfold algorithm can help identify RiboSNitches by leveraging GWAS data and an analysis of the mRNA structural ensemble.
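The idea of comparing structural ensembles, rather than single minimum free energy structures, can be illustrated with a minimal sketch. The code below is not SNPfold and does not use a nearest-neighbor thermodynamic model: it assumes a toy energy of -1 kT per base pair, a minimum hairpin loop of three bases, and hypothetical example sequences. It computes a partition function over all nested structures, draws structures by Boltzmann sampling, and compares the per-nucleotide pairing frequencies of a wild-type sequence and a single-nucleotide variant. The summary statistic (largest change in pairing frequency) is likewise an illustrative choice.

```python
import math
import random

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3                 # minimum number of unpaired bases in a hairpin loop
PAIR_ENERGY = -1.0           # toy energy per base pair in units of kT (assumption)
W = math.exp(-PAIR_ENERGY)   # Boltzmann weight contributed by one base pair


def partition_table(seq):
    """Q[i][j] = sum of Boltzmann weights over all nested structures on seq[i:j]."""
    n = len(seq)
    Q = [[1.0] * (n + 1) for _ in range(n + 1)]  # empty intervals have weight 1
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            total = Q[i][j - 1]  # case 1: last base (index j-1) is unpaired
            for k in range(i, j - 1 - MIN_LOOP):  # case 2: j-1 pairs with k
                if (seq[k], seq[j - 1]) in PAIRS:
                    total += Q[i][k] * W * Q[k + 1][j - 1]
            Q[i][j] = total
    return Q


def sample_structure(seq, Q, rng):
    """Draw one structure with probability proportional to its Boltzmann weight."""
    pairs = []
    stack = [(0, len(seq))]
    while stack:
        i, j = stack.pop()
        if j - i < MIN_LOOP + 2:
            continue  # interval too short to contain a base pair
        r = rng.random() * Q[i][j] - Q[i][j - 1]
        if r < 0:
            stack.append((i, j - 1))  # last base left unpaired
            continue
        for k in range(i, j - 1 - MIN_LOOP):
            if (seq[k], seq[j - 1]) in PAIRS:
                r -= Q[i][k] * W * Q[k + 1][j - 1]
                if r < 0:
                    pairs.append((k, j - 1))
                    stack.append((i, k))          # region left of the pair
                    stack.append((k + 1, j - 1))  # region enclosed by the pair
                    break
        else:
            stack.append((i, j - 1))  # rounding fallback: treat as unpaired
    return pairs


def pairing_profile(seq, n_samples=1000, seed=0):
    """Estimate, for each nucleotide, how often it is paired in the ensemble."""
    rng = random.Random(seed)
    Q = partition_table(seq)
    counts = [0] * len(seq)
    for _ in range(n_samples):
        for a, b in sample_structure(seq, Q, rng):
            counts[a] += 1
            counts[b] += 1
    return [c / n_samples for c in counts]


# Hypothetical sequences, chosen only for illustration.
wild_type = "GGGAAAUCCCAAAGGGAUUUCCC"
mutant = wild_type[:1] + "A" + wild_type[2:]  # single-nucleotide change at index 1

wt_profile = pairing_profile(wild_type)
mt_profile = pairing_profile(mutant)
max_shift = max(abs(a - b) for a, b in zip(wt_profile, mt_profile))
print(f"largest change in pairing frequency: {max_shift:.3f}")
```

In practice, an established package with experimentally derived energy parameters would replace the toy model; the sketch only shows how an ensemble-level comparison differs from comparing two single predicted structures.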
Genome-wide association studies identify mutations in the human genome that correlate with a particular disease. It is common to find disease-associated mutations in non-coding regions of the genome. These non-coding mutations are more difficult to interpret at a molecular level, because they do not affect the protein sequence. In this study, we analyze disease-associated mutations in non-coding regions of our genome in terms of their effect on the structure of ribonucleic acid (RNA), the molecule that carries the message of genetic information in our cells. We focus in particular on the regulatory parts of our genes known as untranslated regions. We find that certain disease-associated mutations in these regulatory untranslated regions have a significant effect on the structure of the RNA message. We call these elements “RiboSNitches,” because they act like switches that turn genes on and off, but are caused by Single Nucleotide Polymorphisms (SNPs), which are single point mutations in our genome. The RiboSNitches we identify are potentially a new class of pharmaceutical targets, as it is possible to change the structure of RNA with small drug-like molecules.