
      Learning a Prior on Regulatory Potential from eQTL Data


          Abstract

          Genome-wide RNA expression data provide a detailed view of an organism's biological state; hence, a dataset measuring expression variation between genetically diverse individuals (eQTL data) may provide important insights into the genetics of complex traits. However, with data from a relatively small number of individuals, it is difficult to distinguish true causal polymorphisms from the large number of possibilities. The problem is particularly challenging in populations with significant linkage disequilibrium, where traits are often linked to large chromosomal regions containing many genes. Here, we present a novel method, Lirnet, that automatically learns a regulatory potential for each sequence polymorphism, estimating how likely it is to have a significant effect on gene expression. This regulatory potential is defined in terms of “regulatory features”—including the function of the gene and the conservation, type, and position of genetic polymorphisms—that are available for any organism. The extent to which the different features influence the regulatory potential is learned automatically, making Lirnet readily applicable to different datasets, organisms, and feature sets. We apply Lirnet both to the human HapMap eQTL dataset and to a yeast eQTL dataset and provide statistical and biological results demonstrating that Lirnet produces significantly better regulatory programs than other recent approaches. We demonstrate in the yeast data that Lirnet can correctly suggest a specific causal sequence variation within a large, linked chromosomal region. In one example, Lirnet uncovered a novel, experimentally validated connection between Puf3—a sequence-specific RNA binding protein—and P-bodies—cytoplasmic structures that regulate translation and RNA stability—as well as the particular causative polymorphism, a SNP in Mkt1, that induces the variation in the pathway.

          Author Summary

          Gene expression data from genetically diverse individuals (eQTL data) provide a unique perspective on the effect of genetic variation on cellular pathways. However, the burden of multiple hypotheses, combined with the challenges of linkage disequilibrium, makes it difficult to correctly identify causal polymorphisms. Researchers traditionally apply heuristics for selecting among plausible hypotheses, favoring polymorphisms that are more conserved, that lead to a significant amino acid change, or that reside in genes whose function is related to that of the targets. But how much weight should each of these regulatory features receive? We describe Lirnet, which learns from eQTL data how to weight regulatory features and induce a regulatory potential for sequence variations. Lirnet learns these weights at the same time as it learns a regulatory network, choosing weights that lead to a more predictive network. We show that Lirnet constructs high-accuracy regulatory programs and demonstrate its ability to correctly identify causative polymorphisms. Lirnet can flexibly use any regulatory features, including sequence features that are available for any sequenced organism, and automatically learns their weights in a dataset-specific way. This flexibility makes it especially advantageous for mammalian systems, where many forms of prior knowledge used in simple model organisms are incomplete or unavailable.
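
          As a rough illustration of the idea of a regulatory potential, the sketch below scores each polymorphism in a linked region from a weighted sum of its regulatory features and ranks the candidates. The feature names, weight values, bias, and logistic form are all assumptions made for this example; the summary above states only that the weights are learned from eQTL data alongside the regulatory network, and does not give the functional form.

```python
import math

# Hypothetical feature names and weights, chosen for illustration only.
# In Lirnet the weights are learned from eQTL data, not set by hand.
FEATURE_WEIGHTS = {
    "conservation": 1.5,
    "nonsynonymous": 1.0,
    "in_regulatory_gene": 0.5,
}
BIAS = -2.0  # assumed offset, also illustrative


def regulatory_potential(features, weights=FEATURE_WEIGHTS, bias=BIAS):
    """Map a polymorphism's feature values to a score in (0, 1).

    Uses a logistic function of the weighted feature sum. This is an
    assumed functional form, not necessarily the one used by Lirnet.
    """
    z = bias + sum(weights[name] * features.get(name, 0.0) for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


def rank_candidates(candidates):
    """Sort (name, features) pairs by decreasing regulatory potential."""
    scored = [(name, regulatory_potential(f)) for name, f in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Three made-up polymorphisms that fall in the same linked region.
region = [
    ("snp_a", {"conservation": 0.1, "nonsynonymous": 0.0, "in_regulatory_gene": 0.0}),
    ("snp_b", {"conservation": 0.9, "nonsynonymous": 1.0, "in_regulatory_gene": 1.0}),
    ("snp_c", {"conservation": 0.5, "nonsynonymous": 0.0, "in_regulatory_gene": 1.0}),
]
ranking = rank_candidates(region)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

          In this toy setting the conserved, nonsynonymous polymorphism in a regulatory gene ranks first. A full method would fit the weights rather than fix them, for example by alternating between fitting the expression model and updating the weights, which this sketch does not attempt.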


                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS Genetics (PLoS Genet)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390 (print), 1553-7404 (electronic)
                Published: 30 January 2009
                Volume: 5
                Issue: 1
                Article number: e1000358
                Affiliations
                [1 ]Computer Science Department, Stanford University, Stanford, California, United States of America
                [2 ]Institute for Systems Biology, Seattle, Washington, United States of America
                [3 ]Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
                [4 ]Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California, United States of America
                [5 ]Department of Biological Sciences, Columbia University, New York, New York, United States of America
                University of Toronto, Canada
                Author notes

                Conceived and designed the experiments: SIL AMD DK. Performed the experiments: SIL AMD. Analyzed the data: SIL AMD DP DK. Contributed reagents/materials/analysis tools: SIL. Wrote the paper: SIL DK. Assisted in the performance of the microscopy experiments: DD PAS. Generated and analyzed the EMAP data: NJK.

                Article
                Manuscript ID: 08-PLGE-RA-0766R3
                DOI: 10.1371/journal.pgen.1000358
                PMCID: PMC2627940
                PMID: 19180192
                Record ID: edd1f25b-93ae-4787-9df8-a166d05c9f41
                Lee et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 27 June 2008
                Accepted: 29 December 2008
                Page count
                Pages: 24
                Categories
                Research Article
                Computational Biology
                Computational Biology/Genomics

                Genetics
