miRNAs (microRNAs) are a class of endogenous small RNAs that are thought to negatively regulate protein production. Aberrant expression of many miRNAs is linked to cancer and other diseases. Little is known about the factors that regulate the expression of miRNAs. We have identified numerous regulatory elements upstream of miRNA genes that are likely to be essential to the transcriptional and posttranscriptional regulation of miRNAs. Newly identified regulatory motifs occur frequently and in multiple copies upstream of miRNAs. The motifs are highly enriched in G and C nucleotides, in comparison with the nucleotide composition of miRNA upstream sequences. Although the motifs were predicted using sequences that are upstream of miRNAs, we find that 99% of the top-predicted motifs preferentially occur within the first 500 nucleotides upstream of the transcription start sites of protein-coding genes; the observed preference in location underscores the validity and importance of the motifs identified in this study. Our study also raises the possibility that a considerable number of well-characterized, disease-associated transcription factors (TFs) of protein-coding genes contribute to the abnormal miRNA expression in diseases such as cancer. Further analysis of predicted miRNA–protein interactions led us to hypothesize that TFs that include c-Myb, NF-Y, Sp-1, MTF-1, and AP-2α are master regulators of miRNA expression. Our predictions are a solid starting point for the systematic elucidation of the causative basis for aberrant expression patterns of disease-related (e.g., cancer) miRNAs. Thus, we point out that focused studies of the TFs that regulate miRNAs will be paramount in developing cures for miRNA-related diseases. The identification of the miRNA regulatory motifs was facilitated by a new computational method, K-Factor. K-Factor predicts regulatory motifs in a set of functionally related sequences, without relying on evolutionary conservation.
microRNAs (miRNAs) are unusually small RNAs that are thought to control the production of proteins in the cell. Recent studies have linked miRNAs to several types of cancers. Several studies strongly suggest that miRNAs could be useful as diagnostic and prognostic markers of various cancers. Thus, although miRNAs appear to have opened up a new chapter in cancer biology, the fundamental question of why miRNAs are strongly associated with diseases such as cancer remains unanswered. Here, we endeavored to systematically identify the factors that regulate miRNA biogenesis. We first identified a large number of DNA sequence elements that are characteristic of miRNA genes, using a new computational method named K-Factor. The sequence elements were then matched against known protein binding sites to identify specific proteins (transcription factors, TFs) that regulate miRNA biogenesis. Based on our observations, we put forward the hypothesis that a number of known TFs are primarily responsible for the aberrant regulation of miRNAs in cancer and other diseases.