Increasing numbers of human diseases are being linked to genetic variants, but our understanding of the mechanistic links leading from DNA sequence to disease phenotype is limited. The majority of disease-causing nucleotide variants fall within the non-protein-coding portion of the genome, making it likely that they act by altering gene regulatory sequences. We hypothesised that SNPs within the binding sites of the transcriptional repressor REST alter the degree of repression of target genes. Given that changes in the effective concentration of REST contribute to several pathologies (various cancers, Huntington's disease, cardiac hypertrophy, vascular smooth muscle proliferation), these SNPs should alter disease susceptibility in carriers. We devised a strategy to identify SNPs that affect the recruitment of REST to target genes through the alteration of its DNA recognition element, the RE1. A multi-step screen combining genetic, genomic, and experimental filters yielded 56 polymorphic RE1 sequences with robust and statistically significant differences in affinity between alleles. These SNPs have a considerable effect on the functional recruitment of REST to DNA in a range of in vitro, reporter gene, and in vivo analyses. Furthermore, we observe allele-specific biases in deeply sequenced chromatin immunoprecipitation data, consistent with predicted differences in RE1 affinity. Amongst the targets of polymorphic RE1 elements are important disease genes including NPPA, PTPRT, and CDH4. Thus, considerable genetic variation exists in the DNA motifs that connect gene regulatory networks. Recently available ChIP–seq data allow the annotation of human genetic polymorphisms with regulatory information to generate prior hypotheses about their disease-causing mechanism.
Common human diseases such as cancer, heart disease, or epilepsy have a genetic component that predisposes particular individuals to suffer from them. Huge sums have been invested to map the regions of the human genome where small DNA variations, or SNPs (“single-nucleotide polymorphisms”), determine the probability of developing these diseases. A major problem with this approach, however, is that, once the culprit SNPs are discovered, we know very little about how they cause disease, and this knowledge is critical if we are to use the information to develop drugs and therapies. In this study, we demonstrate a new approach, employing functional maps of the human genome that have recently been published. We begin with regions of the genome recognised by a gene repressor protein, REST, that is involved in a number of important human diseases. Using information on where REST binds in the human genome, we predict and validate common DNA variations that increase or decrease this binding. By affecting how much REST is recruited to important genes, these variations may predispose individuals to, or protect them from, a number of diseases. Studies like this show how we can use genomic information to gain a deeper understanding of the genetics behind human disease.
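One way to picture the allele-specific bias in sequenced chromatin immunoprecipitation reads is as a departure from an even split of reads between the two alleles at a heterozygous site. The sketch below is illustrative only and is not the statistical procedure used in this study: it assumes an exact two-sided binomial test against a 50:50 expectation, and the function name `binomial_two_sided_p` and the read counts in the usage lines are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for an imbalance between allele read counts.

    Assumption (not from the study): under no allelic bias, each read
    overlapping a heterozygous site carries the reference allele with
    probability p, independently of the other reads.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of bias
    # Probability of every possible reference-allele count from 0 to n.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum all outcomes that are no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


# Hypothetical read counts, for illustration only.
balanced = binomial_two_sided_p(5, 5)
skewed = binomial_two_sided_p(10, 0)
```

A small p-value for a site would be consistent with one allele recruiting more REST than the other, although a real analysis would also need to account for mapping bias towards the reference allele and for testing many sites at once.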