QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data

Abstract

Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNV) and some are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray™ SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors using a novel re-sampling framework to calibrate the model to a fixed Type I (false positive) error rate. Other parameters are set via maximum marginal likelihood to prior training data of known structure. QuantiSNP provides probabilistic quantification of state classifications and significantly improves the accuracy of segmental aneuploidy identification and mapping, relative to existing analytical tools (Beadstudio, Illumina), as demonstrated by validation of breakpoint boundaries. QuantiSNP identified both novel and validated CNVs. QuantiSNP was developed using BeadArray™ SNP data but it can be adapted to other platforms and we believe that the OB-HMM framework has widespread applicability in genomic research. In conclusion, QuantiSNP is a novel algorithm for high-resolution CNV/aneuploidy detection with application to clinical genetics, cancer and disease association studies.
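As a rough illustration of the hidden-Markov idea behind this kind of CNV calling, the sketch below decodes a most-likely copy-number path from per-SNP log R ratios with the Viterbi algorithm. It is not the QuantiSNP model: it has no Objective Bayes calibration or resampling step and ignores allele frequencies. The state means, noise level, self-transition probability and the function name `viterbi_copy_number` are all hypothetical values chosen for the example.

```python
import math

# Hypothetical copy-number states mapped to an assumed mean log R ratio.
# These numbers are illustrative only and are not taken from the paper.
STATE_MEANS = {1: -0.6, 2: 0.0, 3: 0.4}
SIGMA = 0.2        # assumed standard deviation of the log R ratio noise
STAY_PROB = 0.99   # assumed probability of keeping the same state at the next SNP


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution, used as the emission model."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def viterbi_copy_number(log_r):
    """Return the most likely copy-number state for each probe (Viterbi path)."""
    if not log_r:
        return []
    states = list(STATE_MEANS)
    log_stay = math.log(STAY_PROB)
    log_move = math.log((1 - STAY_PROB) / (len(states) - 1))

    def trans(prev, cur):
        return log_stay if prev == cur else log_move

    # Initialise with a uniform prior over states.
    score = {s: -math.log(len(states)) + log_gauss(log_r[0], STATE_MEANS[s], SIGMA)
             for s in states}
    backpointers = []
    for x in log_r[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + trans(p, s))
            pointers[s] = best_prev
            new_score[s] = (score[best_prev] + trans(best_prev, s)
                            + log_gauss(x, STATE_MEANS[s], SIGMA))
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# A run of four low-intensity probes is called as a one-copy segment.
segment = viterbi_copy_number([0.0, 0.05, -0.02, -0.6, -0.65, -0.55, -0.6, 0.01, 0.0])
print(segment)
```

Under these assumed parameters, a single outlying probe is absorbed into the surrounding normal state rather than called as a deletion, which is the smoothing behaviour that motivates an HMM over per-probe thresholding.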

Most cited references 37

Global variation in copy number in the human genome.

Richard Redon, Shumpei Ishikawa, Karen R Fitch … (2006)

Copy number variation (CNV) of DNA sequences is functionally significant but has yet to be fully ascertained. We have constructed a first-generation CNV map of the human genome through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia (the HapMap collection). DNA from these individuals was screened for CNV using two complementary technologies: single-nucleotide polymorphism (SNP) genotyping arrays, and clone-based comparative genomic hybridization. A total of 1,447 copy number variable regions (CNVRs), which can encompass overlapping or adjacent gains or losses, covering 360 megabases (12% of the genome) were identified in these populations. These CNVRs contained hundreds of genes, disease loci, functional elements and segmental duplications. Notably, the CNVRs encompassed more nucleotide content per genome than SNPs, underscoring the importance of CNV in genetic diversity and evolution. The data obtained delineate linkage disequilibrium patterns for many CNVs, and reveal marked variation in copy number among populations. We also demonstrate the utility of this resource for genetic disease studies.

Structural variation in the human genome.

Lars Feuk, Andrew R. Carson, Stephen W Scherer (2006)

The first wave of information from the analysis of the human genome revealed SNPs to be the main source of genetic and phenotypic human variation. However, the advent of genome-scanning technologies has now uncovered an unexpectedly large extent of what we term 'structural variation' in the human genome. This comprises microscopic and, more commonly, submicroscopic variants, which include deletions, duplications and large-scale copy-number variants - collectively termed copy-number variants or copy-number polymorphisms - as well as insertions, inversions and translocations. Rapidly accumulating evidence indicates that structural variants can comprise millions of nucleotides of heterogeneity within every genome, and are likely to make an important contribution to human diversity and disease susceptibility.

High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping.

Daniel A Peiffer, Jennie M. Le, Frank J Steemers … (2006)

Array-CGH is a powerful tool for the detection of chromosomal aberrations. The introduction of high-density SNP genotyping technology to genomic profiling, termed SNP-CGH, represents a further advance, since simultaneous measurement of both signal intensity variations and changes in allelic composition makes it possible to detect both copy number changes and copy-neutral loss-of-heterozygosity (LOH) events. We demonstrate the utility of SNP-CGH with two Infinium whole-genome genotyping BeadChips, assaying 109,000 and 317,000 SNP loci, to detect chromosomal aberrations in samples bearing constitutional aberrations as well tumor samples at sub-100 kb effective resolution. Detected aberrations include homozygous deletions, hemizygous deletions, copy-neutral LOH, duplications, and amplifications. The statistical ability to detect common aberrations was modeled by analysis of an X chromosome titration model system, and sensitivity was modeled by titration of gDNA from a tumor cell with that of its paired normal cell line. Analysis was facilitated by using a genome browser that plots log ratios of normalized intensities and allelic ratios along the chromosomes. We developed two modes of SNP-CGH analysis, a single sample and a paired sample mode. The single sample mode computes log intensity ratios and allelic ratios by referencing to canonical genotype clusters generated from approximately 120 reference samples, whereas the paired sample mode uses a paired normal reference sample from the same individual. Finally, the two analysis modes are compared and contrasted for their utility in analyzing different types of input gDNA: low input amounts, fragmented gDNA, and Phi29 whole-genome pre-amplified DNA.

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (pmc): nar

Journal ID (publisher-id): Nucleic Acids Research

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): March 2007

Publication date (Electronic): 6 March 2007

Publication date PMC-release: 6 March 2007

Volume: 35

Issue: 6

Pages: 2013-2025

Affiliations

¹Genomics Laboratory and ⁴Bioinformatics, Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford, OX3 7BN, ²Life Science Interface Doctoral Training Centre, Wolfson Building, Parks Road, Oxford OX1 3QD, ³Henry Wellcome Centre for Gene Function, Department of Statistics, University of Oxford, Oxford, OX1 3TG, ⁵Oxford Medical Genetics Laboratories, The Churchill Hospital, Oxford, OX3 7LJ, UK, ⁶Centre for Addiction & Mental Health, University of Toronto, 1001 Queen Street West, Toronto, Ontario M6J 1H4, Canada and ⁷MRC Mammalian Genetics Unit, Medical Research Council, Harwell, Oxford, OX11 0RD

Author notes

*To whom correspondence should be addressed. Tel: +44-(0)1865 287526; Fax: +44-(0)1865 287533; Email: ioannis.ragoussis@well.ox.ac.uk

Correspondence may also be addressed to Christopher C. Holmes. Tel: +44 (0)1865 285368; Fax: +44 (0)1865 285384; Email: cholmes@stats.ox.ac.uk

The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Article

DOI: 10.1093/nar/gkm076

PMC ID: 1874617

PubMed ID: 17341461

SO-VID: 02ce88da-9814-4237-9db8-afb0022d29cc

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 20 December 2006

Date revision received: 24 January 2007

Date accepted: 25 January 2007
