32
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: not found
      • Article: not found

      Tools for mapping high-throughput sequencing data

      , , ,
      Bioinformatics
      Oxford University Press (OUP)

      Read this article at

      ScienceOpenPublisherPubMed
      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          A ubiquitous and fundamental step in high-throughput sequencing analysis is the alignment (mapping) of the generated reads to a reference sequence. To accomplish this task, numerous software tools have been proposed. Determining the mappers that are most suitable for a specific application is not trivial. This survey focuses on classifying mappers through a wide number of characteristics. The goal is to allow practitioners to compare the mappers more easily and find those that are most suitable for their specific problem.

          Related collections

          Most cited references42

          • Record: found
          • Abstract: found
          • Article: not found

          Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning.

          Cytosine DNA methylation is important in regulating gene expression and in silencing transposons and other repetitive sequences. Recent genomic studies in Arabidopsis thaliana have revealed that many endogenous genes are methylated either within their promoters or within their transcribed regions, and that gene methylation is highly correlated with transcription levels. However, plants have different types of methylation controlled by different genetic pathways, and detailed information on the methylation status of each cytosine in any given genome is lacking. To this end, we generated a map at single-base-pair resolution of methylated cytosines for Arabidopsis, by combining bisulphite treatment of genomic DNA with ultra-high-throughput sequencing using the Illumina 1G Genome Analyser and Solexa sequencing technology. This approach, termed BS-Seq, unlike previous microarray-based methods, allows one to sensitively measure cytosine methylation on a genome-wide scale within specific sequence contexts. Here we describe methylation on previously inaccessible components of the genome and analyse the DNA methylation sequence composition and distribution. We also describe the effect of various DNA methylation mutants on genome-wide methylation patterns, and demonstrate that our newly developed library construction and computational methods can be applied to large genomes such as that of mouse.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            ART: a next-generation sequencing read simulator.

            ART is a set of simulation tools that generate synthetic next-generation sequencing reads. This functionality is essential for testing and benchmarking tools for next-generation sequencing data analysis including read alignment, de novo assembly and genetic variation discovery. ART generates simulated sequencing reads by emulating the sequencing process with built-in, technology-specific read error models and base quality value profiles parameterized empirically in large sequencing datasets. We currently support all three major commercial next-generation sequencing platforms: Roche's 454, Illumina's Solexa and Applied Biosystems' SOLiD. ART also allows the flexibility to use customized read error model parameters and quality profiles. Both source and binary software packages are available at http://www.niehs.nih.gov/research/resources/software/art.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads.

              High-volume sequencing of DNA and RNA is now within reach of any research laboratory and is quickly becoming established as a key research tool. In many workflows, each of the short sequences ("reads") resulting from a sequencing run are first "mapped" (aligned) to a reference sequence to infer the read from which the genomic location derived, a challenging task because of the high data volumes and often large genomes. Existing read mapping software excel in either speed (e.g., BWA, Bowtie, ELAND) or sensitivity (e.g., Novoalign), but not in both. In addition, performance often deteriorates in the presence of sequence variation, particularly so for short insertions and deletions (indels). Here, we present a read mapper, Stampy, which uses a hybrid mapping algorithm and a detailed statistical model to achieve both speed and sensitivity, particularly when reads include sequence variation. This results in a higher useable sequence yield and improved accuracy compared to that of existing software.
                Bookmark

                Author and article information

                Journal
                Bioinformatics
                Oxford University Press (OUP)
                1460-2059
                1367-4803
                December 2012
                December 01 2012
                October 11 2012
                December 2012
                December 01 2012
                October 11 2012
                : 28
                : 24
                : 3169-3177
                Article
                10.1093/bioinformatics/bts605
                23060614
                6ba6b3b2-0cb2-4a1d-9baa-1cda98bbbce1
                © 2012
                History

                Comments

                Comment on this article