      Genome Scans for Detecting Footprints of Local Adaptation Using a Bayesian Factor Model

      research-article

          Abstract

          There is a considerable impetus in population genomics to pinpoint loci involved in local adaptation. A powerful approach to find genomic regions subject to local adaptation is to genotype numerous molecular markers and look for outlier loci. One of the most common approaches for selection scans is based on statistics that measure population differentiation such as F_ST. However, there are important caveats with approaches related to F_ST because they require grouping individuals into populations and they additionally assume a particular model of population structure. Here, we implement a more flexible individual-based approach based on Bayesian factor models. Factor models capture population structure with latent variables called factors, which can describe clustering of individuals into populations or isolation-by-distance patterns. Using hierarchical Bayesian modeling, we both infer population structure and identify outlier loci that are candidates for local adaptation. In order to identify outlier loci, the hierarchical factor model searches for loci that are atypically related to population structure as measured by the latent factors. In a model of population divergence, we show that it can achieve a 2-fold or more reduction of false discovery rate compared with the software BayeScan or with an F_ST approach. We show that our software can handle large data sets by analyzing the single nucleotide polymorphisms of the Human Genome Diversity Project. The Bayesian factor model is implemented in the open-source PCAdapt software.
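
The idea of flagging loci that are atypically related to latent factors can be illustrated with a much simpler, non-Bayesian stand-in. The sketch below is not the PCAdapt implementation and does not perform hierarchical Bayesian inference: it assumes a truncated SVD (ordinary PCA) in place of the factor model, and it ranks loci by the share of their variance explained by the retained factors. The function names `simulate_genotypes` and `factor_scan`, the two-population toy simulation, and all numerical settings are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_genotypes(n_per_pop=100, n_loci=1000, n_adaptive=10, seed=0):
    """Toy data (an assumption, not from the article): two populations,
    0/1/2 genotypes, with the first n_adaptive loci strongly diverged."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.3, 0.7, size=n_loci)      # shared allele frequencies
    diff = rng.normal(0.0, 0.05, size=n_loci)      # mild neutral divergence
    diff[:n_adaptive] = 0.5                        # hypothetical adaptive loci
    p1 = np.clip(base + diff / 2, 0.01, 0.99)
    p2 = np.clip(base - diff / 2, 0.01, 0.99)
    pop1 = rng.binomial(2, p1, size=(n_per_pop, n_loci))
    pop2 = rng.binomial(2, p2, size=(n_per_pop, n_loci))
    return np.vstack([pop1, pop2]).astype(float)


def factor_scan(genotypes, n_factors=1):
    """PCA-based stand-in for a factor-model scan.

    Returns the individual-level factors and, for each locus, the fraction
    of its standardized variance explained by those factors."""
    n_ind = genotypes.shape[0]
    centered = genotypes - genotypes.mean(axis=0)
    sd = centered.std(axis=0)
    # Monomorphic loci have zero variance; dividing by 1 leaves them at zero.
    scaled = centered / np.where(sd > 0, sd, 1.0)
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    u_k = u[:, :n_factors]                         # orthonormal factor axes
    projections = u_k.T @ scaled                   # shape (n_factors, n_loci)
    explained = (projections ** 2).sum(axis=0) / n_ind
    return u_k * s[:n_factors], explained


genotypes = simulate_genotypes()
factors, explained = factor_scan(genotypes, n_factors=1)
top_loci = np.argsort(explained)[::-1][:10]        # strongest candidates first
print(sorted(top_loci.tolist()))
```

No population labels are passed to `factor_scan`; the structure is recovered from the genotypes alone, which is the individual-based property the abstract emphasizes. A ranking of this kind carries no calibrated error rate, so it should not be read as reproducing the false discovery rate comparisons reported above.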

          Most cited references (44)

          Interpreting principal component analyses of spatial population genetic variation.

          Nearly 30 years ago, Cavalli-Sforza et al. pioneered the use of principal component analysis (PCA) in population genetics and used PCA to produce maps summarizing human genetic variation across continental regions. They interpreted gradient and wave patterns in these maps as signatures of specific migration events. These interpretations have been controversial, but influential, and the use of PCA has become widespread in analysis of population genetics data. However, the behavior of PCA for genetic data showing continuous spatial variation, such as might exist within human continental groups, has been less well characterized. Here, we find that gradients and waves observed in Cavalli-Sforza et al.'s maps resemble sinusoidal mathematical artifacts that arise generally when PCA is applied to spatial data, implying that the patterns do not necessarily reflect specific migration events. Our findings aid interpretation of PCA results and suggest how PCA can help correct for continuous population structure in association studies.

            Heterogeneous genomic differentiation between walking-stick ecotypes: "isolation by adaptation" and multiple roles for divergent selection.

            Genetic differentiation can be highly variable across the genome. For example, loci under divergent selection and those tightly linked to them may exhibit elevated differentiation compared to neutral regions. These represent "outlier loci" whose differentiation exceeds neutral expectations. Adaptive divergence can also increase genome-wide differentiation by promoting general barriers to neutral gene flow, thereby facilitating genomic divergence via genetic drift. This latter process can yield a positive correlation between adaptive phenotypic divergence and neutral genetic differentiation (described here as "isolation-by-adaptation"). Here, we examine both these processes by combining an AFLP genome scan of two host plant ecotypes of Timema cristinae walking-sticks with existing data on adaptive phenotypic divergence and ecological speciation in these insects. We found that about 8% of loci are outliers in multiple population comparisons. Replicated comparisons between population-pairs using the same versus different host species revealed that 1-2% of loci are subject to host-related selection specifically. Locus-specific analyses revealed that up to 10% of putatively neutral (nonoutlier) AFLP loci exhibit significant isolation-by-adaptation. Our results suggest that selection may affect differentiation directly, via linkage, or by facilitating genetic drift. They thus illustrate the varied and sometimes nonintuitive contributions of selection to heterogeneous genomic differentiation.

              A Bayesian missing value estimation method for gene expression profile data.

              Gene expression profile analyses have been used in numerous studies covering a broad range of areas in biology. When unreliable measurements are excluded, missing values are introduced in gene expression profiles. Although existing multivariate analysis methods have difficulty with the treatment of missing values, this problem has received little attention. There are many options for dealing with missing values, each of which reaches drastically different results. Ignoring missing values is the simplest method and is frequently applied. This approach, however, has its flaws. In this article, we propose an estimation method for missing values, which is based on Bayesian principal component analysis (BPCA). Although the methodology that a probabilistic model and latent variables are estimated simultaneously within the framework of Bayes inference is not new in principle, actual BPCA implementation that makes it possible to estimate arbitrary missing variables is new in terms of statistical methodology. When applied to DNA microarray data from various experimental conditions, the BPCA method exhibited markedly better estimation ability than other recently proposed methods, such as singular value decomposition and K-nearest neighbors. While the estimation performance of existing methods depends on model parameters whose determination is difficult, our BPCA method is free from this difficulty. Accordingly, the BPCA method provides accurate and convenient estimation for missing values. The software is available at http://hawaii.aist-nara.ac.jp/~shige-o/tools/.

                Author and article information

                Journal
                Mol Biol Evol
                Molecular Biology and Evolution
                Oxford University Press
                0737-4038
                1537-1719
                September 2014
                03 June 2014
                Volume: 31
                Issue: 9
                Pages: 2483-2495
                Affiliations
                1Laboratoire TIMC-IMAG, UMR 5525, Centre National de la Recherche Scientifique, Université Joseph Fourier, Grenoble, France
                2Laboratoire d’Ecologie Alpine, UMR 5553, Centre National de la Recherche Scientifique, Université Joseph Fourier, Grenoble, France
                Author notes
                *Corresponding author: E-mail: michael.blum@imag.fr.

                Associate editor: John Novembre

                Article
                msu182
                DOI: 10.1093/molbev/msu182
                PMCID: PMC4137708
                PMID: 24899666
                © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                Page count
                Pages: 13
                Categories
                Methods

                Molecular biology
                fst, population structure, landscape genetics, population genomics, selection scans
