Trimmomatic: a flexible trimmer for Illumina sequence data

Bioinformatics

Oxford University Press

      Abstract

      Motivation: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data.

      Results: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

      Availability and implementation: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic

      Contact: usadel@bio1.rwth-aachen.de

      Supplementary information: Supplementary data are available at Bioinformatics online.
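
To make the abstract's point about paired-end handling concrete, here is a minimal Python sketch. It is not Trimmomatic's implementation (Trimmomatic is a Java tool with its own trimming steps). It only illustrates the general idea that both mates of a pair are trimmed together and that a pair is split up only when one mate becomes too short. The function names, the in-memory read representation, and the default thresholds are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory read: (name, bases, per-base Phred quality scores).
Read = Tuple[str, str, List[int]]


def trim_trailing(read: Read, min_qual: int) -> Read:
    """Drop bases from the 3' end while their quality is below min_qual."""
    name, bases, quals = read
    end = len(quals)
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    return name, bases[:end], quals[:end]


def trim_pairs(
    forward: List[Read],
    reverse: List[Read],
    min_qual: int = 3,   # illustrative threshold, not a recommended setting
    min_len: int = 36,   # illustrative threshold, not a recommended setting
) -> Dict[str, list]:
    """Trim both mates of each pair, then route the pair by which mates survive.

    Pairs in which both mates pass stay together and in order, so the two
    "paired" outputs remain synchronised. A mate whose partner was dropped
    goes to a separate single-read bucket instead of silently shifting the
    pairing of every later read.
    """
    out: Dict[str, list] = {"paired": [], "forward_only": [], "reverse_only": []}
    for fwd, rev in zip(forward, reverse):
        fwd = trim_trailing(fwd, min_qual)
        rev = trim_trailing(rev, min_qual)
        fwd_ok = len(fwd[1]) >= min_len
        rev_ok = len(rev[1]) >= min_len
        if fwd_ok and rev_ok:
            out["paired"].append((fwd, rev))
        elif fwd_ok:
            out["forward_only"].append(fwd)
        elif rev_ok:
            out["reverse_only"].append(rev)
        # If neither mate survives, the whole pair is discarded.
    return out
```

Trimming the forward and reverse files independently would drop different reads from each file and break the one-to-one correspondence that paired-end aligners and assemblers rely on; routing by pair, as sketched here, avoids that.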

      Most cited references 10

      Fast and accurate short read alignment with Burrows–Wheeler transform

      Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature-rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals.

      Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ∼10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package.

      Availability: http://maq.sourceforge.net

      Contact: rd@sanger.ac.uk
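
As a rough illustration of the "backward search with Burrows–Wheeler Transform" mentioned in this abstract, the following Python sketch builds a naive BWT index and finds exact matches of a pattern. It is a textbook-style toy, not BWA's code: it handles exact matches only (no mismatches or gaps), builds the suffix array by plain sorting, and uses function names that are assumptions made for this example.

```python
def bwt_index(text):
    """Build a naive suffix array plus the two tables backward search needs."""
    text += "$"  # sentinel that sorts before every letter
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix (cyclically).
    bwt = "".join(text[i - 1] for i in sa)
    # first[c]: number of characters in the text strictly smaller than c.
    first, total = {}, 0
    for c in sorted(set(text)):
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] for c in first}
    for ch in bwt:
        for c in occ:
            occ[c].append(occ[c][-1] + (ch == c))
    return sa, first, occ


def backward_search(pattern, sa, first, occ):
    """Return sorted start positions of exact matches of pattern in the text."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for c in reversed(pattern):  # extend the match one character to the left
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


sa, first, occ = bwt_index("ACAC")
print(backward_search("AC", sa, first, occ))  # prints [0, 2]
```

Each pattern character narrows the suffix-array interval with two table lookups, so the search cost depends on the pattern length rather than on the reference length, which is what makes the approach attractive for large genomes.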
        Fast gapped-read alignment with Bowtie 2.

        As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
          Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

          We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.
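
The de Bruijn graph idea described in this abstract can be sketched in a few lines of Python. This is the common textbook formulation, in which every k-mer contributes an edge between its (k-1)-base prefix and suffix. It is not Velvet's data structure or its error-correction logic, and the function name is an assumption made for this example.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each (k-1)-mer node to the sorted (k-1)-mer nodes it points to."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return {node: sorted(nexts) for node, nexts in graph.items()}


print(de_bruijn(["ACGT", "CGTA"], 3))
# prints {'AC': ['CG'], 'CG': ['GT'], 'GT': ['TA']}
```

The two overlapping reads collapse into one path (AC, CG, GT, TA) because their shared k-mer is stored once, which is why the representation stays compact at high coverage.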

            Author and article information

            Affiliations
            1 Department Metabolic Networks, Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Golm, Germany
            2 Institut für Biologie I, RWTH Aachen, Worringer Weg 3, 52074 Aachen, Germany
            3 Institute of Bio- and Geosciences: Plant Sciences, Forschungszentrum Jülich, Leo-Brandt-Straße, 52425 Jülich, Germany
            Author notes
            *To whom correspondence should be addressed.

            Associate Editor: Inanc Birol

            Journal
            Title: Bioinformatics
            Publisher: Oxford University Press
            ISSN (print): 1367-4803
            ISSN (electronic): 1367-4811
            Issue date: 01 August 2014
            Published online: 01 April 2014
            Volume: 30
            Issue: 15
            Pages: 2114-2120
            PMID: 24695404
            PMCID: PMC4103590
            DOI: 10.1093/bioinformatics/btu170
            © The Author 2014. Published by Oxford University Press.

            This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

            Counts
            Pages: 7
            Categories
            Original Papers
            Genome Analysis

            Bioinformatics & Computational biology