
      Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data

      research-article


          Abstract

          Massively-parallel cDNA sequencing has opened the way to deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here, we present the Trinity methodology for de novo full-length transcriptome reconstruction, and evaluate it on samples from fission yeast, mouse, and whitefly – an insect whose genome has not yet been sequenced. Trinity fully reconstructs a large fraction of the transcripts present in the data, also reporting alternative splice isoforms and transcripts from recently duplicated genes. In all cases, Trinity performs better than other available de novo transcriptome assembly programs, and its sensitivity is comparable to methods relying on genome alignments. Our approach provides a unified and general solution for transcriptome reconstruction in any sample, especially in the complete absence of a reference genome.

          Most cited references (35)


          Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms

          High-throughput mRNA sequencing (RNA-Seq) holds the promise of simultaneous transcript discovery and abundance estimation [1-3]. We introduce an algorithm for transcript assembly coupled with a statistical model for RNA-Seq experiments that produces estimates of abundances. Our algorithms are implemented in an open source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed more than 430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line representing a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Analysis of transcript expression over the time series revealed complete switches in the dominant transcription start site (TSS) or splice-isoform in 330 genes, along with more subtle shifts in a further 1,304 genes. These dynamics suggest substantial regulatory flexibility and complexity in this well-studied model of muscle development.

            TopHat: discovering splice junctions with RNA-Seq

            Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-Seq, generates millions of short sequence fragments in a single run. These fragments, or ‘reads’, can be used to measure levels of gene expression and to identify novel splice variants of genes. However, current software for aligning RNA-Seq data to a genome relies on known splice junctions and cannot identify novel ones. TopHat is an efficient read-mapping algorithm designed to align reads from an RNA-Seq experiment to a reference genome without relying on known splice sites.

            Results: We mapped the RNA-Seq reads from a recent mammalian RNA-Seq experiment and recovered more than 72% of the splice junctions reported by the annotation-based software from that study, along with nearly 20 000 previously unreported junctions. The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer. We describe several challenges unique to ab initio splice site discovery from RNA-Seq reads that will require further algorithm development.

            Availability: TopHat is free, open-source software available from http://tophat.cbcb.umd.edu

            Contact: cole@cs.umd.edu

            Supplementary information: Supplementary data are available at Bioinformatics online.

              Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

              We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.
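The two ideas this abstract leans on, a de Bruijn graph built from k-mers and the N50 contig statistic, can be illustrated with a minimal sketch. This is not Velvet's implementation: Velvet also simplifies the graph, removes sequencing errors, and uses read pairs, none of which is shown here. The function names `build_de_bruijn` and `n50` are hypothetical and chosen only for this illustration. Nodes are (k-1)-mers, and each k-mer in a read contributes one edge from its prefix to its suffix.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return a dict mapping each (k-1)-mer to the (k-1)-mers that follow it.

    Every k-mer found in a read adds one edge: prefix -> suffix.
    Repeated k-mers add repeated edges, so edge multiplicity reflects coverage.
    """
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of width k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def n50(lengths):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of the total assembled length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    # Two toy overlapping reads (illustrative data, not from the paper).
    graph = build_de_bruijn(["ACGTAC", "GTACGG"], k=4)
    for node, successors in sorted(graph.items()):
        print(node, "->", successors)
    print("N50:", n50([2, 3, 4, 5, 10]))
```

In the toy input the node `ACG` has two distinct successors (`CGT` and `CGG`), which is the kind of branch an assembler has to resolve, and the doubled `GTA -> TAC` edge shows how overlapping reads raise edge multiplicity.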

                Author and article information

                Journal
                Nature Biotechnology (Nat. Biotechnol.)
                Journal IDs: 9604648, 20305
                ISSN: 1087-0156, 1546-1696
                Dates: 29 April 2011, 15 May 2011, 13 February 2013
                Volume 29, issue 7, pages 644-652
                Affiliations
                [1 ] Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge MA, 02142, USA
                [2 ] School of Computer Science, Hebrew University, Jerusalem, 91904, Israel
                [3 ] Department of Biology, Massachusetts Institute of Technology, Cambridge MA, USA
                [4 ] Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester MA 01605, USA
                [5 ] Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University
                [6 ] Alexander Silberman Institute of Life Sciences, Hebrew University, Jerusalem, 91904, Israel
                [7 ] Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge MA, 02140
                Author notes
                Correspondence and requests for materials should be addressed to nir@cs.huji.ac.il (NF), aregev@broad.mit.edu (AR)
                [*] These authors contributed equally to this work and appear in alphabetical order
                [‡] These authors contributed equally to this work

                Article
                Manuscript ID: NIHMS292662
                DOI: 10.1038/nbt.1883
                PMC ID: 3571712
                PMID: 21572440
                Record ID: e4d5fe74-1984-48dc-b0db-4cffb73e2f0e

                Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

                Funding
                Funded by: National Human Genome Research Institute (NHGRI)
                Award ID: U54 HG003067-06
                Funded by: Office of the Director, NIH
                Award ID: DP1 OD003958-03
                Funded by: Howard Hughes Medical Institute (HHMI)
                Categories
                Article

                Biotechnology
