      Combining Genome-Wide Association Mapping and Transcriptional Networks to Identify Novel Genes Controlling Glucosinolates in Arabidopsis thaliana

      research-article


          Abstract

          Genome-wide association mapping is highly sensitive to environmental changes, but network analysis allows rapid causal gene identification.

          Background

          Genome-wide association (GWA) is gaining popularity as a means to study the architecture of complex quantitative traits, partly because of improvements in high-throughput, low-cost genotyping and phenotyping technologies. Glucosinolate (GSL) secondary metabolites within Arabidopsis spp. can serve as a model system to understand the genomic architecture of adaptive quantitative traits. GSL are key anti-herbivory defenses that impart adaptive advantages within field trials. While little is known about how variation in the external or internal environment of an organism may influence the efficiency of GWA, GSL variation is known to depend strongly on the external stresses and developmental processes of the plant, making it an excellent model for studying conditional GWA.

          Methodology/Principal Findings

          To understand how development and environment can influence GWA, we conducted a study using 96 Arabidopsis thaliana accessions, >40 GSL phenotypes across three conditions (one developmental comparison and one environmental comparison) and ∼230,000 SNPs. Developmental stage had dramatic effects on the outcome of GWA, with each stage identifying different loci associated with GSL traits. Further, while the molecular bases of numerous quantitative trait loci (QTL) controlling GSL traits have been identified, there is currently no estimate of how many additional genes may control natural variation in these traits. We developed a novel co-expression network approach to prioritize the thousands of GWA candidates and successfully validated a large number of these genes as influencing GSL accumulation within A. thaliana using single gene isogenic lines.

          Conclusions/Significance

          Together, these results suggest that complex traits imparting environmentally contingent adaptive advantages are likely influenced by up to thousands of loci that are sensitive to fluctuations in the environment or developmental state of the organism. Additionally, while GWA is highly conditional upon genetics, the use of additional genomic information can rapidly identify causal loci en masse.

          Author Summary

          Understanding how genetic variation can control phenotypic variation is a fundamental goal of modern biology. A major push has been made to use genome-wide association mapping in all organisms to rapidly identify the genes contributing to phenotypes such as disease and nutritional disorders. But a number of fundamental questions about the use of genome-wide association have not been answered: for example, how does the internal or external environment influence the genes found? Furthermore, the simple question of how many genes may influence a trait remains unanswered. Finally, a number of studies have identified significant false-positive and false-negative issues within genome-wide association studies that are not solvable by direct statistical approaches. We have used genome-wide association mapping in the plant Arabidopsis thaliana to begin exploring these questions. We show that both external and internal environments significantly alter the identified genes, such that using different tissues can lead to the identification of nearly completely different gene sets. Given the large number of potential false-positives, we developed an orthogonal approach to filtering the possible genes, by identifying co-functioning networks using the nominal candidate gene list derived from genome-wide association studies. This allowed us to rapidly identify and validate a large number of novel and unexpected genes that affect Arabidopsis thaliana defense metabolism within phenotypic ranges that have been shown to be selectable within the field. These genes and the associated networks suggest that Arabidopsis thaliana defense metabolism more closely fits the infinite gene hypothesis, according to which a vast number of causative genes control natural variation in this phenotype. It remains to be seen how frequently this is true for other organisms and other phenotypes.
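The idea of filtering GWA candidates through a co-expression network can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the function names, the use of Pearson correlation, the 0.8 threshold, the one-neighbor rule, and the toy gene names and expression values are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def prioritize_candidates(expression, candidates, min_abs_r=0.8, min_neighbors=1):
    """Keep candidates that are co-expressed with at least `min_neighbors`
    other candidates (hypothetical rule, chosen for illustration).

    expression: dict mapping gene name -> list of expression values
    candidates: gene names nominated by the association scan
    Returns a dict mapping each retained gene to its sorted co-expressed partners.
    """
    # Candidates without expression data cannot be placed in the network.
    genes = [g for g in candidates if g in expression]
    neighbors = {g: set() for g in genes}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            if abs(pearson(expression[g1], expression[g2])) >= min_abs_r:
                neighbors[g1].add(g2)
                neighbors[g2].add(g1)
    return {g: sorted(n) for g, n in neighbors.items() if len(n) >= min_neighbors}


# Toy data: invented values, not taken from the study.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [5.0, 1.0, 4.0, 2.0],
    "geneD": [4.0, 3.0, 2.0, 1.0],  # has expression data but is not a candidate
}
candidates = ["geneA", "geneB", "geneC", "geneX"]  # geneX has no expression data

result = prioritize_candidates(expression, candidates)
print(result)
```

In this toy case only the two tightly correlated candidates are retained; the uncorrelated candidate and the one lacking expression data drop out. A real analysis would need a principled threshold and a correction for the number of gene pairs tested.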

                Author and article information

                Contributors
                Role: Academic Editor
                Journal
                PLoS Biology (PLoS Biol)
                Public Library of Science (San Francisco, USA)
                ISSN: 1544-9173
                eISSN: 1545-7885
                Published: 16 August 2011
                Volume: 9
                Issue: 8
                Article: e1001125
                Affiliations
                [1] Department of Plant Sciences, University of California–Davis, Davis, California, United States of America
                [2] Monsanto Company, Vegetable Seeds Division, Woodland, California, United States of America
                Georgia Institute of Technology, United States of America
                Author notes

                The author(s) have made the following declarations about their contributions: Conceived and designed the experiments: EKFC HCR JAC BJ DJK. Performed the experiments: EKFC HCR JAC BJ. Analyzed the data: EKFC HCR JAC BJ DJK. Contributed reagents/materials/analysis tools: EKFC HCR JAC. Wrote the paper: EKFC HCR JAC BJ DJK.

                ¤: Current address: Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada

                Article
                Manuscript ID: PBIOLOGY-D-10-01411
                DOI: 10.1371/journal.pbio.1001125
                PMCID: PMC3156686
                PMID: 21857804
                Chan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 22 December 2010
                Accepted: 7 July 2011
                Page count
                Pages: 19
                Categories
                Research Article
                Biology
                Computational Biology
                Metabolic Networks
                Evolutionary Biology
                Population Genetics
                Genetic Polymorphism
                Genetics
                Gene Networks
                Genome-Wide Association Studies
                Plant Genetics
                Genomics
                Functional Genomics
                Genome Complexity
                Plant Science
                Plant Biochemistry
                Plant Genetics
                Systems Biology

                Life sciences