
      Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data.


          Abstract

          DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%-20%), contamination-adjusted calls eliminate 48%-77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%.
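The abstract does not spell out the likelihood model, so the following is only a minimal sketch of one way contamination could be folded into genotype calling at a single biallelic site, not the authors' published method. It assumes a mixture model in which each read comes from the contaminating sample with probability `alpha`, the contaminant's genotype follows Hardy-Weinberg proportions at population alternate-allele frequency `f`, and base errors simply swap reference and alternate alleles at rate `err`. All function and parameter names are hypothetical.

```python
def hwe_prior(f):
    """Hardy-Weinberg probabilities for 0, 1, 2 copies of the alt allele."""
    return [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]


def genotype_likelihoods(n_ref, n_alt, alpha, f, err=0.01):
    """Likelihood of the read counts for each sample genotype g in {0, 1, 2}.

    Assumed model (illustrative only): a fraction `alpha` of reads comes from
    a contaminant whose unknown genotype gc is summed out under an HWE prior.
    """
    prior_c = hwe_prior(f)
    likes = []
    for g in range(3):
        total = 0.0
        for gc in range(3):
            # Expected alt-allele fraction among error-free reads.
            p_true = (1 - alpha) * g / 2 + alpha * gc / 2
            # Probability of observing alt after symmetric base error.
            p_obs = p_true * (1 - err) + (1 - p_true) * err
            total += prior_c[gc] * p_obs ** n_alt * (1 - p_obs) ** n_ref
        likes.append(total)
    return likes


def call_genotype(n_ref, n_alt, alpha, f, err=0.01):
    """Return (most probable genotype, posterior list) under an HWE prior."""
    prior = hwe_prior(f)
    likes = genotype_likelihoods(n_ref, n_alt, alpha, f, err)
    joint = [p * l for p, l in zip(prior, likes)]
    z = sum(joint)
    post = [x / z for x in joint]
    return max(range(3), key=post.__getitem__), post


# 25 reference reads and 5 alternate reads at a site with f = 0.3.
naive_call, _ = call_genotype(25, 5, alpha=0.0, f=0.3)
adjusted_call, _ = call_genotype(25, 5, alpha=0.1, f=0.3)
```

Under these assumptions, setting `alpha=0.0` recovers an ordinary contamination-unaware caller. In the toy example, that caller reads the five alternate reads as evidence for a heterozygote, while the adjusted call with `alpha=0.1` attributes them to the contaminant and returns homozygous reference.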


          Author and article information

          Journal
          Am. J. Hum. Genet.
          American journal of human genetics
          Elsevier BV
          1537-6605
          0002-9297
          Aug 06 2015
          Volume: 97
          Issue: 2
          Affiliations
          [1 ] Department of Biostatistics and Center for Statistical Genetics, University of Michigan, School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
          [2 ] Department of Biostatistics and Center for Statistical Genetics, University of Michigan, School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Houston, TX 77030, USA.
          [3 ] Department of Biostatistics and Center for Statistical Genetics, University of Michigan, School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA. Electronic address: boehnke@umich.edu.
          [4 ] Department of Biostatistics and Center for Statistical Genetics, University of Michigan, School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA. Electronic address: hmkang@umich.edu.
          Article
          PII: S0002-9297(15)00278-5
          DOI: 10.1016/j.ajhg.2015.07.002
          PMC ID: 4573246
          PubMed ID: 26235984
          834c6d48-4928-4a82-b7b2-caf2f9ff156a
