      A survey of tools for variant analysis of next-generation genome sequencing data


          Abstract

          Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and has led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers.


                Author and article information

                Journal
                Briefings in Bioinformatics (Brief Bioinform)
                Publisher: Oxford University Press
                ISSN: 1467-5463, 1477-4054
                Issue date: March 2014
                Published online: 21 January 2013
                Volume: 15
                Issue: 2, Special Issue: Papers inspired by the meeting ISCB-Asia/SCCG 2012
                Pages: 256-278
                Author notes
                Corresponding author. Zlatko Trajanoski, Division for Bioinformatics, Innsbruck Medical University, Innrain 80, 6020 Innsbruck, Austria. Tel.: +43-512-9003-71401; Fax: +43-512-9003-73100; E-mail: zlatko.trajanoski@i-med.ac.at
                Article
                Publisher ID: bbs086
                DOI: 10.1093/bib/bbs086
                PMCID: PMC3956068
                PMID: 23341494
                © The Author 2013. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

