      Gene-based comparative analysis of tools for estimating copy number alterations using whole-exome sequencing data

      research-article


          Abstract

          Accurate detection of copy number alterations (CNAs) using next-generation sequencing technology is essential for the development and application of more precise medical treatments for human cancer. Here, we evaluated seven CNA estimation tools (ExomeCNV, CoNIFER, VarScan2, CODEX, ngCGH, saasCNV, and falcon) using whole-exome sequencing data from 419 breast cancer tumor-normal sample pairs from The Cancer Genome Atlas. Estimations generated using each tool were converted into gene-based copy numbers; concordance for gains and losses and the sensitivity and specificity of each tool were compared to validated copy numbers from a single nucleotide polymorphism reference array. The concordance and sensitivity of the tumor-normal pair methods for estimating CNAs (saasCNV, ExomeCNV, and VarScan2) were better than those of the tumor batch methods (CoNIFER and CODEX). SaasCNV had the highest gain and loss concordances (65.0%), sensitivity (69.4%), and specificity (89.1%) for estimating copy number gains or losses. These findings indicate that improved CNA detection algorithms are needed to more accurately interpret whole-exome sequencing results in human cancer.
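The gene-based comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions or cutoffs, so everything here is an assumption made for illustration: the function names `call_from_log2` and `evaluate`, the log2-ratio cutoffs of ±0.2, and the choice to count a gene as a true positive only when the tool reports the same direction (gain or loss) as the reference array.

```python
def call_from_log2(log2_ratio, gain_cutoff=0.2, loss_cutoff=-0.2):
    """Map a gene-level log2 copy ratio to a call.

    The cutoffs are illustrative assumptions, not values from the paper.
    """
    if log2_ratio >= gain_cutoff:
        return "gain"
    if log2_ratio <= loss_cutoff:
        return "loss"
    return "neutral"


def evaluate(tool_calls, reference_calls):
    """Compare gene-level calls from a tool against reference calls.

    Both arguments map gene symbol -> "gain", "loss", or "neutral".
    Only genes present in both mappings are scored.
    """
    tp = fn = tn = fp = 0
    for gene in tool_calls.keys() & reference_calls.keys():
        ref, est = reference_calls[gene], tool_calls[gene]
        if ref == "neutral":
            if est == "neutral":
                tn += 1  # correctly left unaltered
            else:
                fp += 1  # alteration called where the reference has none
        elif est == ref:
            tp += 1      # same direction as the reference
        else:
            fn += 1      # missed, or called in the wrong direction

    def ratio(numerator, denominator):
        # Return None instead of dividing by zero.
        return numerator / denominator if denominator else None

    return {
        "concordance": ratio(tp + tn, tp + fn + tn + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy data, invented for this example only.
reference = {"A": "gain", "B": "loss", "C": "neutral", "D": "neutral", "E": "gain"}
log2_ratios = {"A": 0.5, "B": -0.1, "C": 0.0, "D": 0.3, "E": 0.4}
tool = {gene: call_from_log2(value) for gene, value in log2_ratios.items()}
metrics = evaluate(tool, reference)
print(metrics)
```

In this toy example, two of the three altered genes are recovered and one of the two neutral genes is called correctly. Other definitions, such as scoring gains and losses separately, are equally plausible readings of the abstract.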

          Most cited references (22)

          Copy number variation detection and genotyping from exome sequence data

          While exome sequencing is readily amenable to single-nucleotide variant discovery, the sparse and nonuniform nature of the exome capture reaction has hindered exome-based detection and characterization of genic copy number variation. We developed a novel method using singular value decomposition (SVD) normalization to discover rare genic copy number variants (CNVs) as well as genotype copy number polymorphic (CNP) loci with high sensitivity and specificity from exome sequencing data. We estimate the precision of our algorithm using 122 trios (366 exomes) and show that this method can be used to reliably predict (94% overall precision) both de novo and inherited rare CNVs involving three or more consecutive exons. We demonstrate that exome-based genotyping of CNPs strongly correlates with whole-genome data (median r² = 0.91), especially for loci with fewer than eight copies, and can estimate the absolute copy number of multi-allelic genes with high accuracy (78% call level). The resulting user-friendly computational pipeline, CoNIFER (copy number inference from exome reads), can reliably be used to discover disruptive genic CNVs missed by standard approaches and should have broad application in human genetic studies of disease.

            Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth.

            Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

              Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives

              Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) or genotyping arrays have been standard technologies to detect large regions subject to copy number changes in genomes until most recently high-resolution sequence data can be analyzed by next-generation sequencing (NGS). During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development.

                Author and article information

                Journal
                Oncotarget
                Impact Journals LLC
                ISSN: 1949-2553
                18 April 2017
                6 March 2017
                Volume: 8
                Issue: 16
                Pages: 27277-27285
                Affiliations
                1 Department of Pathology, College of Medicine, Hanyang University, Seoul, Republic of Korea
                2 Institute for Bioengineering and Biopharmaceutical Research (IBBR), Hanyang University, Seoul, Republic of Korea
                Author notes
                Correspondence to: Gu Kong, gkong@hanyang.ac.kr
                Article
                15932
                10.18632/oncotarget.15932
                5432334
                28460482
                Copyright: © 2017 Kim et al.

                This article is distributed under the terms of the Creative Commons Attribution License (CC-BY), which permits unrestricted use and redistribution provided that the original author and source are credited.

                History
                Received: 10 April 2016
                Accepted: 20 February 2017
                Categories
                Research Paper

                Oncology & Radiotherapy
                cancer cnv, cna estimation, wes, ngs, copy number
