    Ultrafast and memory-efficient alignment of short DNA sequences to the human genome


    Genome Biology

    BioMed Central


        Abstract

        Bowtie: a new ultrafast memory-efficient tool for the alignment of short DNA sequence reads to large genomes.

        Bowtie is an ultrafast, memory-efficient program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source and available at http://bowtie.cbcb.umd.edu.
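To make the idea of Burrows-Wheeler indexing with mismatch-tolerant backtracking concrete, here is a minimal Python sketch. It is not Bowtie's implementation: the function names (`build_fm_index`, `find_exact`, `find_with_mismatches`) are hypothetical, the suffix array is built naively, and the backtracking is a plain exhaustive search with a mismatch budget rather than the quality-aware strategy the abstract describes. It only illustrates how a pattern can be matched right to left against a Burrows-Wheeler index.

```python
def build_fm_index(text):
    """Build a toy FM index (suffix array, C table, occurrence counts)."""
    text += "$"  # sentinel: assumed absent from text and smaller than any base
    # Naive suffix array: fine for a toy string, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each suffix in sorted order.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c[ch] = number of characters in text strictly smaller than ch.
    c, total = {}, 0
    for ch in alphabet:
        c[ch] = total
        total += text.count(ch)
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c, occ


def find_exact(pattern, sa, c, occ):
    """Backward search: return sorted start positions of exact matches."""
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c:
            return []
        lo = c[ch] + occ[ch][lo]
        hi = c[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


def find_with_mismatches(pattern, sa, c, occ, max_mismatches=1):
    """Exhaustive backtracking over substitutions (not quality-aware)."""
    hits = set()

    def extend(i, lo, hi, budget):
        if lo >= hi:
            return  # no suffix-array rows left: dead end, backtrack
        if i < 0:
            hits.update(sa[lo:hi])  # whole pattern consumed
            return
        for ch in c:
            if ch == "$":
                continue  # never align a read base to the sentinel
            cost = 0 if ch == pattern[i] else 1
            if cost <= budget:
                extend(i - 1, c[ch] + occ[ch][lo], c[ch] + occ[ch][hi],
                       budget - cost)

    extend(len(pattern) - 1, 0, len(sa), max_mismatches)
    return sorted(hits)


sa, c, occ = build_fm_index("ACGTACGTTACG")
print(find_exact("ACG", sa, c, occ))               # [0, 4, 9]
print(find_with_mismatches("ACC", sa, c, occ, 1))  # [0, 4, 9]
```

Each step of the backward search costs the same regardless of reference length, which is why this family of indexes suits large genomes. A production index would compress the occurrence table and sample the suffix array to keep memory low; this sketch stores both in full for clarity.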

        Most cited references 29

        Identification of common molecular subsequences.

          Base-calling of automated sequencer traces using phred. II. Error probabilities.

          B T Ewing, P. Green (1998)
          Elimination of the data processing bottleneck in high-throughput sequencing will require both improved accuracy of data processing software and reliable measures of that accuracy. We have developed and implemented in our base-calling program phred the ability to estimate a probability of error for each base-call, as a function of certain parameters computed from the trace data. These error probabilities are shown here to be valid (correspond to actual error rates) and to have high power to discriminate correct base-calls from incorrect ones, for read data collected under several different chemistries and electrophoretic conditions. They play a critical role in our assembly program phrap and our finishing program consed.
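The error probabilities described above are conventionally reported as phred quality scores, defined as Q = -10 · log10(P), where P is the estimated probability that a base-call is wrong. The short Python sketch below shows that conversion in both directions. The function names are hypothetical and chosen for illustration; this is not code from phred itself.

```python
import math


def phred_to_error_prob(q):
    """Convert a phred quality score Q to an error probability P."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability P (0 < P <= 1) to a phred score Q."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Q20 corresponds to a 1-in-100 chance of error, Q30 to 1-in-1000.
for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```

Because the scale is logarithmic, each increase of 10 in Q means a tenfold drop in the estimated error rate.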
            RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays.

            Ultra-high-throughput sequencing is emerging as an attractive alternative to microarrays for genotyping, analysis of methylation patterns, and identification of transcription factor binding sites. Here, we describe an application of the Illumina sequencing (formerly Solexa sequencing) platform to study mRNA expression levels. Our goals were to estimate technical variance associated with Illumina sequencing in this context and to compare its ability to identify differentially expressed genes with existing array technologies. To do so, we estimated gene expression differences between liver and kidney RNA samples using multiple sequencing replicates, and compared the sequencing data to results obtained from Affymetrix arrays using the same RNA samples. We find that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane). The information in a single lane of Illumina sequencing data appears comparable to that in a single array in enabling identification of differentially expressed genes, while allowing for additional analyses such as detection of low-expressed genes, alternative splice variants, and novel transcripts. Based on our observations, we propose an empirical protocol and a statistical framework for the analysis of gene expression using ultra-high-throughput sequencing technology.
              Author and article information

              Affiliations
              [1] Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
              Journal
              Title: Genome Biology (Genome Biol)
              Publisher: BioMed Central
              ISSN: 1465-6906, 1465-6914
              Published: 4 March 2009
              Volume 10, Issue 3, Article R25
              PMCID: PMC2690996
              Article ID: gb-2009-10-3-r25
              PMID: 19261174
              DOI: 10.1186/gb-2009-10-3-r25
              Copyright © 2009 Langmead et al.; licensee BioMed Central Ltd.

              This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

              Categories
              Software

              Genetics
