      CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure

      Bioinformatics

      Oxford University Press (OUP)

          Abstract

          Clustering of individuals into populations on the basis of multilocus genotypes is informative in a variety of settings. In population-genetic clustering algorithms, such as BAPS, STRUCTURE and TESS, individual multilocus genotypes are partitioned over a set of clusters, often using unsupervised approaches that involve stochastic simulation. As a result, replicate cluster analyses of the same data may produce several distinct solutions for estimated cluster membership coefficients, even though the same initial conditions were used. Major differences among clustering solutions have two main sources: (1) 'label switching' of clusters across replicates, caused by the arbitrary way in which clusters in an unsupervised analysis are labeled, and (2) 'genuine multimodality,' truly distinct solutions across replicates. To facilitate the interpretation of population-genetic clustering results, we describe three algorithms for aligning multiple replicate analyses of the same data set. We have implemented these algorithms in the computer program CLUMPP (CLUster Matching and Permutation Program). We illustrate the use of CLUMPP by aligning the cluster membership coefficients from 100 replicate cluster analyses of 600 chickens from 20 different breeds. CLUMPP is freely available at http://rosenberglab.bioinformatics.med.umich.edu/clumpp.html.
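          The label-switching problem described in the abstract can be illustrated with a small sketch. The code below is not CLUMPP's implementation and does not reproduce any of its three algorithms; it is a minimal example that assumes each replicate is a membership matrix (one row per individual, one column per cluster), that similarity is measured by a plain sum of squared differences, and that the number of clusters K is small enough to try all K! column permutations. The function names (`align_to_reference`, `average_replicates`) are hypothetical.

```python
from itertools import permutations


def squared_distance(q_a, q_b):
    """Sum of squared differences between two membership matrices."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(q_a, q_b)
        for a, b in zip(row_a, row_b)
    )


def permute_columns(q, perm):
    """Reorder the cluster columns of q so new column j is old column perm[j]."""
    return [[row[k] for k in perm] for row in q]


def align_to_reference(q_ref, q_rep):
    """Exhaustively find the column permutation of q_rep closest to q_ref.

    Assumption: K is small, because all K! permutations are evaluated.
    """
    k = len(q_ref[0])
    best = min(
        permutations(range(k)),
        key=lambda p: squared_distance(q_ref, permute_columns(q_rep, p)),
    )
    return list(best), permute_columns(q_rep, best)


def average_replicates(aligned):
    """Element-wise mean of membership matrices that share one labeling."""
    n = len(aligned)
    rows, cols = len(aligned[0]), len(aligned[0][0])
    return [
        [sum(q[i][j] for q in aligned) / n for j in range(cols)]
        for i in range(rows)
    ]


# Toy data: the second replicate is the first with its cluster labels shuffled.
q_ref = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
q_rep = [[0.1, 0.0, 0.9], [0.8, 0.1, 0.1], [0.2, 0.8, 0.0]]

perm, q_aligned = align_to_reference(q_ref, q_rep)
q_mean = average_replicates([q_ref, q_aligned])
```

          In this toy case the two replicates differ only by label switching, so the aligned matrix matches the reference exactly. Replicates that still differ substantially after alignment would point to the second source named in the abstract, genuine multimodality. Exhaustive search stops being practical as K grows, which is one reason a tool might offer more than one alignment algorithm.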

          Most cited references 8

          Bayesian identification of admixture events using multilocus molecular markers.

          Bayesian statistical methods for the estimation of hidden genetic structure of populations have gained considerable popularity in recent years. Utilizing molecular marker data, Bayesian mixture models attempt to identify a hidden population structure by clustering individuals into genetically divergent groups, whereas admixture models aim to separate the ancestral sources of the alleles observed in different individuals. We discuss the difficulties involved in the simultaneous estimation of the number of ancestral populations and the levels of admixture in studied individuals' genomes. To resolve this issue, we introduce a computationally efficient method for the identification of admixture events in the population history. Our approach is illustrated by analyses of several challenging real and simulated data sets. The software (baps), implementing the methods introduced here, is freely available at http://www.rni.helsinki.fi/~jic/bapspage.html.

            Dealing with label switching in mixture models

              BAPS 2: enhanced possibilities for the analysis of genetic population structure.

              Bayesian statistical methods based on simulation techniques have recently been shown to provide powerful tools for the analysis of genetic population structure. We have previously developed a Markov chain Monte Carlo (MCMC) algorithm for characterizing genetically divergent groups based on molecular markers and geographical sampling design of the dataset. However, for large-scale datasets such algorithms may get stuck in local maxima in the parameter space. Therefore, we have modified our earlier algorithm to support multiple parallel MCMC chains, with enhanced features that enable considerably faster and more reliable estimation compared to the earlier version of the algorithm. We also consider a hierarchical tree representation, from which a Bayesian model-averaged structure estimate can be extracted. The algorithm is implemented in a computer program that features a user-friendly interface and built-in graphics. The enhanced features are illustrated by analyses of simulated data and an extensive human molecular dataset. The program is freely available at http://www.rni.helsinki.fi/~jic/bapspage.html.

                Author and article information

                Journal: Bioinformatics
                Publisher: Oxford University Press (OUP)
                ISSN: 1367-4803, 1460-2059
                Dates (as listed): August 08 2007; July 15 2007; May 07 2007; July 15 2007
                Volume: 23
                Issue: 14
                Pages: 1801-1806
                Type: Article
                DOI: 10.1093/bioinformatics/btm233
                PMID: 17485429
                © 2007
