      Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods

      research-article


          Abstract

          Background

          Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly.

          Results

          We benchmark 23 different methods, including two applications we developed (STAR-Fusion and TrinityFusion), on both simulated and real RNA-seq data. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and the fastest for fusion detection on cancer transcriptomes.

          Conclusion

          Although de novo assembly-based methods are less accurate, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.


                Author and article information

                Contributors
                bhaas@broadinstitute.org
                dobin@cshl.edu
                bli28@mgh.harvard.edu
                nstransky@celsiustx.com
                npochet@bwh.harvard.edu
                aregev@broadinstitute.org
                Journal
                Genome Biology (Genome Biol)
                BioMed Central (London)
                ISSN: 1474-7596, 1474-760X
                21 October 2019
                Volume 20, Article 213
                Affiliations
                [1] GRID grid.66859.34, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
                [2] ISNI 0000 0004 0387 3667, GRID grid.225279.9, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA
                [3] ISNI 0000 0004 0386 9924, GRID grid.32224.35, Center for Immunology and Inflammatory Diseases, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129 USA
                [4] Celsius Therapeutics, Cambridge, MA 02139 USA
                [5] ISNI 000000041936754X, GRID grid.38142.3c, Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115 USA
                [6] ISNI 0000 0001 2341 2786, GRID grid.116068.8, Howard Hughes Medical Institute, and Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02140 USA
                Author information
                http://orcid.org/0000-0002-6609-4973
                Article
                1842
                DOI: 10.1186/s13059-019-1842-9
                PMCID: 6802306
                PMID: 31639029
                6632ad6a-2b6d-4fd6-9b1e-ac4d7c07fdaa
                © The Author(s). 2019

                Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 19 May 2019
                Accepted: 28 September 2019
                Funding
                Funded by: National Cancer Institute
                Award ID: U24CA180922
                Award ID: R50CA211461
                Award ID: R21CA209940
                Award ID: U01CA214846
                Categories
                Research
                Genetics
                fusion,rna-seq,cancer,benchmarking,star-fusion,trinityfusion
