Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED.

Abstract

Conventional targeted sequencing methods eliminate many of the benefits of nanopore sequencing, such as the ability to accurately detect structural variants or epigenetic modifications. The ReadUntil method allows nanopore devices to selectively eject reads from pores in real time, which could enable purely computational targeted sequencing. However, this requires rapid identification of on-target reads, whereas most mapping methods require computationally intensive basecalling. We present UNCALLED (https://github.com/skovaka/UNCALLED), an open-source mapper that rapidly matches streaming nanopore current signals to a reference sequence. UNCALLED probabilistically considers the k-mers that could be represented by the signal and then prunes the candidates based on the reference encoded within a Ferragina-Manzini index. We used UNCALLED to deplete sequencing of known bacterial genomes within a metagenomic community, enriching the remaining species 4.46-fold. UNCALLED also enriched 148 human genes associated with hereditary cancers to 29.6× coverage using one MinION flowcell, enabling accurate detection of single-nucleotide polymorphisms, insertions and deletions, structural variants and methylation in these genes.
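The pruning idea in the abstract can be illustrated with a toy example. The sketch below is not UNCALLED's implementation: it builds a naive Ferragina-Manzini (FM) index over a short reference string and discards candidate k-mer extensions that would produce a sequence absent from that reference. The function names (`build_fm_index`, `backward_search`, `extend_paths`), the reference string, and the candidate lists are all hypothetical, and the probabilistic scoring of k-mers against the current signal is left out entirely.

```python
def build_fm_index(text):
    """Build a naive FM index (C table and occurrence counts) for a short text."""
    text += "$"  # sentinel, lexicographically smaller than A, C, G, T
    # Naive suffix array: fine for a toy reference, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in the text that sort strictly before c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ, len(bwt)


def backward_search(fm, pattern):
    """Return how many times pattern occurs in the indexed text."""
    C, occ, n = fm
    lo, hi = 0, n  # half-open suffix array interval
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


def extend_paths(fm, paths, candidates):
    """Extend each path by one base per compatible candidate k-mer,
    keeping only the extensions that still occur in the reference."""
    kept = []
    for seq in paths:
        for kmer in candidates:
            overlap = len(kmer) - 1
            if seq[-overlap:] == kmer[:-1]:
                new_seq = seq + kmer[-1]
                if backward_search(fm, new_seq) > 0:
                    kept.append(new_seq)
    return kept


# Hypothetical reference and candidate 3-mers for two successive signal events.
fm = build_fm_index("GATTACAGATTTCA")
paths = extend_paths(fm, ["GAT"], ["ATT", "ATC", "ATA"])
paths = extend_paths(fm, paths, ["TTA", "TTT", "TTG"])
print(paths)
```

In this toy setting, each event proposes several k-mers and the index query removes the ones that are inconsistent with the reference, so the set of surviving paths stays small as more signal arrives.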

Author and article information

Journal

Journal ID (iso-abbrev): Nat Biotechnol

Title: Nature biotechnology

Publisher: Springer Science and Business Media LLC

ISSN (Electronic): 1546-1696

ISSN (Print): 1087-0156

Publication date (Electronic): April 2021

Volume: 39

Issue: 4

Affiliations

[1] Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA. skovaka1@jhu.edu.

[2] Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.

[3] Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

[4] Department of Biology, Johns Hopkins University, Baltimore, MD, USA.

[5] Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.

Article

Publisher Item ID: 10.1038/s41587-020-0731-9

Mid ID: NIHMS1636148

DOI: 10.1038/s41587-020-0731-9

PMC ID: 8567335

PubMed ID: 33257863

SO-VID: a228cc5f-f9bc-48b5-8af0-a292ddc2dc72
