Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods

Abstract

Background

Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly.

Results

We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes.

Conclusion

The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.
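As a rough illustration of what an accuracy comparison of fusion predictors involves, the sketch below scores one tool's predicted fusions against a truth set using precision, recall, and F1. It is not the authors' benchmarking code. The function and type names, the `GENE_A--GENE_B` naming convention, and the toy fusion lists are all assumptions made for this example.

```python
from typing import Iterable, NamedTuple, Set, Tuple

# A fusion is represented as an ordered (5' gene, 3' gene) pair.
Fusion = Tuple[str, str]


class Accuracy(NamedTuple):
    tp: int
    fp: int
    fn: int
    precision: float
    recall: float
    f1: float


def parse_fusion(name: str) -> Fusion:
    """Split a 'GENE_A--GENE_B' label (assumed convention) into a gene pair."""
    left, right = name.split("--", 1)
    return left.upper(), right.upper()


def score_predictions(predicted: Iterable[str], truth: Iterable[str]) -> Accuracy:
    """Compare predicted fusion labels with a truth set of fusion labels."""
    pred_set: Set[Fusion] = {parse_fusion(p) for p in predicted}
    truth_set: Set[Fusion] = {parse_fusion(t) for t in truth}

    tp = len(pred_set & truth_set)   # predicted and in the truth set
    fp = len(pred_set - truth_set)   # predicted but not in the truth set
    fn = len(truth_set - pred_set)   # in the truth set but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Accuracy(tp, fp, fn, precision, recall, f1)


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not results from the study.
    truth = ["BCR--ABL1", "EML4--ALK", "TMPRSS2--ERG"]
    predicted = ["BCR--ABL1", "EML4--ALK", "GENEX--GENEY"]
    print(score_predictions(predicted, truth))
```

Treating fusions as ordered gene pairs keeps the comparison strict. A more lenient scheme, for example one that ignores partner order or accepts paralogs, would need a different matching rule.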

Most cited references 77

STAR: ultrafast universal RNA-seq aligner.

Alexander Dobin, Carrie A. Davis, Felix Schlesinger … (2013)

Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

Fast gapped-read alignment with Bowtie 2.

Ben Langmead, Steven L Salzberg (2012)

As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

Fast and accurate short read alignment with Burrows–Wheeler transform

Heng Li, Richard Durbin (2009)

Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ∼10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: rd@sanger.ac.uk

Author and article information

Contributors

Brian J. Haas: bhaas@broadinstitute.org (ORCID: http://orcid.org/0000-0002-6609-4973)

Alexander Dobin: dobin@cshl.edu

Bo Li: bli28@mgh.harvard.edu

Nicolas Stransky: nstransky@celsiustx.com

Nathalie Pochet: npochet@bwh.harvard.edu

Aviv Regev: aregev@broadinstitute.org

Journal

Journal ID (nlm-ta): Genome Biol

Journal ID (iso-abbrev): Genome Biol

Title: Genome Biology

Publisher: BioMed Central (London)

ISSN (Print): 1474-7596

ISSN (Electronic): 1474-760X

Publication date (Electronic): 21 October 2019

Publication date PMC-release: 21 October 2019

Publication date Collection: 2019

Volume: 20

Electronic Location Identifier: 213

Affiliations

[1] GRID grid.66859.34, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA

[2] ISNI 0000 0004 0387 3667, GRID grid.225279.9, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA

[3] ISNI 0000 0004 0386 9924, GRID grid.32224.35, Center for Immunology and Inflammatory Diseases, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129 USA

[4] Celsius Therapeutics, Cambridge, MA 02139 USA

[5] ISNI 0000 0004 1936 754X, GRID grid.38142.3c, Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115 USA

[6] ISNI 0000 0001 2341 2786, GRID grid.116068.8, Howard Hughes Medical Institute, and Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02140 USA

Author information

Brian J. Haas http://orcid.org/0000-0002-6609-4973

Article

Publisher ID: 1842

DOI: 10.1186/s13059-019-1842-9

PMC ID: 6802306

PubMed ID: 31639029

SO-VID: 6632ad6a-2b6d-4fd6-9b1e-ac4d7c07fdaa

License:

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received: 19 May 2019

Date accepted: 28 September 2019

Funding

Funded by: National Cancer Institute

Award ID: U24CA180922

Award ID: R50CA211461

Award ID: R21CA209940

Award ID: U01CA214846

Award recipients: Nathalie Pochet, Aviv Regev

Custom metadata

ScienceOpen disciplines: Genetics

Keywords: fusion, rna-seq, cancer, benchmarking, star-fusion, trinityfusion
