Manfred G. Grabherr 1, Brian J. Haas 1, Moran Yassour 1,2,3, Joshua Z. Levin 1, Dawn A. Thompson 1, Ido Amit 1, Xian Adiconis 1, Lin Fan 1, Raktima Raychowdhury 1, Qiandong Zeng 1, Zehua Chen 1, Evan Mauceli 1, Nir Hacohen 1, Andreas Gnirke 1, Nicholas Rhind 4, Federica di Palma 1, Bruce W. Birren 1, Chad Nusbaum 1, Kerstin Lindblad-Toh 1,5, Nir Friedman 2,6, Aviv Regev 1,3,7
15 May 2011
Massively parallel cDNA sequencing has opened the way to deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here, we present the Trinity methodology for de novo full-length transcriptome reconstruction, and evaluate it on samples from fission yeast, mouse, and whitefly, an insect whose genome has not yet been sequenced. Trinity fully reconstructs a large fraction of the transcripts present in the data, and also reports alternative splice isoforms and transcripts from recently duplicated genes. In all cases, Trinity performs better than other available de novo transcriptome assembly programs, and its sensitivity is comparable to that of methods relying on genome alignments. Our approach provides a unified and general solution for transcriptome reconstruction in any sample, especially in the complete absence of a reference genome.