      ANGSD: Analysis of Next Generation Sequencing Data

          Abstract

          Background

          High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory efficient implementations are needed in order to facilitate analyses of thousands of samples simultaneously.

          Results

          We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics, and perform association mapping and population genetic analyses utilizing the full information in next generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods.
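To make "genotype likelihoods" concrete, here is a minimal sketch of how the likelihood of each diploid genotype at a single site could be computed from the observed read bases and their Phred quality scores. It assumes a simple independent-error model in which a sequencing error is equally likely to produce any of the three other bases. This is an illustration only, not ANGSD's implementation; the function names `phred_to_error` and `genotype_log_likelihoods` and the example reads are hypothetical.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def genotype_log_likelihoods(bases, quals):
    """Return log P(reads | genotype) for all 10 unordered diploid genotypes.

    Assumed model (an illustration, not taken from the article): each read
    samples one of the two alleles with probability 0.5, and a sequencing
    error turns the true base into any of the other three with equal chance.
    """
    gls = {}
    for a1, a2 in combinations_with_replacement(BASES, 2):
        log_lik = 0.0
        for base, qual in zip(bases, quals):
            err = phred_to_error(qual)
            p1 = 1 - err if base == a1 else err / 3
            p2 = 1 - err if base == a2 else err / 3
            log_lik += math.log(0.5 * p1 + 0.5 * p2)
        gls[a1 + a2] = log_lik
    return gls


# Hypothetical pileup at one site: three reads show A, one shows G.
gls = genotype_log_likelihoods("AAAG", [30, 30, 20, 30])
best = max(gls, key=gls.get)
print(best, round(gls[best], 3))
```

With only four reads, the heterozygote AG scores highest but the homozygote AA remains plausible. Carrying all ten likelihoods into downstream analyses, rather than committing to a single called genotype, is what allows low-coverage data to be used without discarding that uncertainty.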

          Conclusions

          The open-source C/C++ program ANGSD is available at http://www.popgen.dk/angsd. The program has been tested and validated on GNU/Linux systems. It supports multiple input formats, including BAM and imputed Beagle genotype probability files. It allows the user to choose between combinations of existing methods and can perform analyses that are not implemented elsewhere.

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12859-014-0356-4) contains supplementary material, which is available to authorized users.

                Author and article information

                Contributors
                thorfinn@binf.ku.dk
                albrecht@binf.ku.dk
                rasmus_nielsen@berkeley.edu
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                Published: 25 November 2014
                Volume 15, Issue 1, Article 356
                Affiliations
                Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen, Denmark
                Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen, DK-2200 Denmark
                Department of Integrative Biology and Statistics, UC-Berkeley, 4098 VLSB, Berkeley, California, 94720 USA
                Article
                Article number: 356
                DOI: 10.1186/s12859-014-0356-4
                PMCID: PMC4248462
                PMID: 25420514
                © Korneliussen et al.; licensee BioMed Central Ltd. 2014

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 28 June 2014
                Accepted: 22 October 2014
                Categories
                Software

                Subject: Bioinformatics & Computational biology
                Keywords: next-generation sequencing, bioinformatics, population genetics, association studies
