Ultrafast and memory-efficient alignment of short DNA sequences to the human genome

Abstract

Bowtie: a new ultrafast memory-efficient tool for the alignment of short DNA sequence reads to large genomes.

Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source and available at http://bowtie.cbcb.umd.edu.
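To make the idea of Burrows-Wheeler indexing concrete, the following is a minimal sketch of exact-match backward search over a toy index. It is not Bowtie's implementation: the function names `build_index` and `find_exact` are hypothetical, the suffix array is built naively, the occurrence table is stored uncompressed, and the quality-aware backtracking that permits mismatches is not shown.

```python
def build_index(text):
    """Build a toy Burrows-Wheeler index for `text` (illustrative only).

    Returns (sa, c, occ):
      sa  - suffix array of text + "$"
      c   - c[ch] = number of characters in text + "$" smaller than ch
      occ - occ[ch][i] = occurrences of ch in the first i BWT characters
    """
    assert "$" not in text, "'$' is reserved as the end-of-text sentinel"
    text += "$"
    # Naive suffix sort; fine for a demo, far too slow for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix
    # (text[-1] is "$" for the suffix starting at position 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    c, total = {}, 0
    for ch in alphabet:
        c[ch] = total
        total += text.count(ch)
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c, occ


def find_exact(pattern, index):
    """Return sorted start positions where `pattern` occurs exactly."""
    sa, c, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    # Process the pattern right to left, narrowing the row range each step.
    for ch in reversed(pattern):
        if ch not in c:
            return []
        lo = c[ch] + occ[ch][lo]
        hi = c[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_index("GATTACAGATTACA")
print(find_exact("ATTA", index))
print(find_exact("TTT", index))
```

Each step of the loop costs a constant number of table lookups, so the search time depends on the read length rather than on the size of the reference. A mismatch-tolerant aligner would additionally need to try substituted characters when the range becomes empty, which this sketch deliberately omits.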

Author and article information

Journal

Journal ID (nlm-ta): Genome Biol

Title: Genome Biology

Publisher: BioMed Central

ISSN (Print): 1465-6906

ISSN (Electronic): 1465-6914

Publication date (Print): 2009

Publication date (Electronic): 4 March 2009

Volume: 10

Issue: 3

Page: R25

Affiliations

[1] Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA

Article

Publisher ID: gb-2009-10-3-r25

DOI: 10.1186/gb-2009-10-3-r25

PMC ID: 2690996

PubMed ID: 19261174

SO-VID: 58f00e70-5d3e-4df5-8062-a0999bf3cb46

License:

This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 21 October 2008

Date revision received: 19 December 2008

Date accepted: 4 March 2009
