      PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses in R

      research-article

          Abstract

          Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a population genomics package for the R software environment (a de facto standard for statistical analyses). PopGenome can efficiently process genome-scale data as well as large sets of individual loci. It reads DNA alignments and single-nucleotide polymorphism (SNP) data sets in most common formats, including those used by the HapMap, 1000 human genomes, and 1001 Arabidopsis genomes projects. PopGenome also reads associated annotation files in GFF format, enabling users to easily define regions or classify SNPs based on their annotation; all analyses can also be applied to sliding windows. PopGenome offers a wide range of population genetics analyses, including neutrality tests as well as statistics for population differentiation, linkage disequilibrium, and recombination. PopGenome is linked to Hudson’s MS and Ewing’s MSMS programs to assess statistical significance based on coalescent simulations. PopGenome’s integration in R facilitates effortless and reproducible downstream analyses as well as the production of publication-quality graphics. Developers can easily incorporate new analysis methods into the PopGenome framework. PopGenome and R are freely available from CRAN (http://cran.r-project.org/) for all major operating systems under the GNU General Public License.
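To make the kind of analysis described in the abstract concrete, the following is a minimal sketch of a neutrality test (Tajima's D) applied to sliding windows over a small DNA alignment. It is written in Python only to keep the example self-contained; it is not PopGenome's R interface, and the function names (segregating_sites, pairwise_diversity, tajimas_d, sliding_windows) and the toy alignment are hypothetical. It assumes an ungapped alignment of equal-length sequences.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def pairwise_diversity(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D (Tajima 1989); None when the statistic is undefined."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        # The variance term is zero for n < 4, and D is undefined for S = 0.
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pairwise_diversity(seqs) - s / a1) / sqrt(variance)


def sliding_windows(seqs, width, jump):
    """Yield (start, sub-alignment) for windows of `width` columns."""
    length = len(seqs[0])
    for start in range(0, length - width + 1, jump):
        yield start, [s[start:start + width] for s in seqs]


# Hypothetical toy alignment: 4 sequences, 8 sites.
ALIGNMENT = [
    "AAAAAAAA",
    "AAAATAAA",
    "AAAATAAA",
    "ACAATAAA",
]

if __name__ == "__main__":
    print("whole alignment D:", tajimas_d(ALIGNMENT))
    for start, window in sliding_windows(ALIGNMENT, width=4, jump=4):
        print(start, segregating_sites(window), pairwise_diversity(window))
```

In practice, the significance of such a statistic would be judged against a null distribution, for example from coalescent simulations as the abstract describes; this sketch computes the statistic only.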

          Most cited references (17)

          Arlequin (version 3.0): An integrated software package for population genetics data analysis

          Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework. Arlequin 3 introduces a completely new graphical interface written in C++, a more robust semantic analysis of input files, and two new methods: a Bayesian estimation of gametic phase from multi-locus genotypes, and an estimation of the parameters of an instantaneous spatial expansion from DNA sequence polymorphism. Arlequin can handle several data types like DNA sequences, microsatellite data, or standard multi-locus genotypes. A Windows version of the software is freely available on http://cmpg.unibe.ch/software/arlequin3.
            DnaSP, DNA polymorphism analyses by the coalescent and other methods.

            DnaSP is a software package for the analysis of DNA polymorphism data. The present version introduces several new modules and features which, among other options, allow: (1) handling big data sets (approximately 5 Mb per sequence); (2) conducting a large number of coalescent-based tests by Monte Carlo computer simulations; (3) extensive analyses of the genetic differentiation and gene flow among populations; (4) analysing the evolutionary pattern of preferred and unpreferred codons; (5) generating graphical outputs for an easy visualization of results. The software package, including complete documentation and examples, is freely available to academic users from: http://www.ub.es/dnasp
              Genomic scans for selective sweeps using SNP data.

              Detecting selective sweeps from genomic SNP data is complicated by the intricate ascertainment schemes used to discover SNPs, and by the confounding influence of the underlying complex demographics and varying mutation and recombination rates. Current methods for detecting selective sweeps have little or no robustness to the demographic assumptions and varying recombination rates, and provide no method for correcting for ascertainment biases. Here, we present several new tests aimed at detecting selective sweeps from genomic SNP data. Using extensive simulations, we show that a new parametric test, based on composite likelihood, has a high power to detect selective sweeps and is surprisingly robust to assumptions regarding recombination rates and demography (i.e., has low Type I error). Our new test also provides estimates of the location of the selective sweep(s) and the magnitude of the selection coefficient. To illustrate the method, we apply our approach to data from the Seattle SNP project and to Chromosome 2 data from the HapMap project. In Chromosome 2, the most extreme signal is found in the lactase gene, which previously has been shown to be undergoing positive selection. Evidence for selective sweeps is also found in many other regions, including genes known to be associated with disease risk such as DPP10 and COL4A3.
                Author and article information

                Journal: Molecular Biology and Evolution (Mol. Biol. Evol.)
                Publisher: Oxford University Press
                ISSN: 0737-4038, 1537-1719
                Issue date: July 2014
                Published online: 16 April 2014
                Volume: 31
                Issue: 7
                Pages: 1929-1936
                Affiliations
                1Institute for Computer Science, Heinrich Heine University, Düsseldorf, Germany
                2Centre for Research in Agricultural Genomics, Bellaterra, Spain
                3Cluster of Excellence on Plant Sciences, Düsseldorf, Germany
                Author notes
                *Corresponding author: E-mail: lercher@cs.uni-duesseldorf.de.

                Associate editor: Juliette de Meaux

                Article
                Publisher ID: msu136
                DOI: 10.1093/molbev/msu136
                PMCID: PMC4069620
                PMID: 24739305
                © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                Subject: Molecular biology
                Keywords: population genomics, software, single-nucleotide polymorphisms