Manfred G. Grabherr1, Brian J. Haas1, Moran Yassour1,2,3, Joshua Z. Levin1, Dawn A. Thompson1, Ido Amit1, Xian Adiconis1, Lin Fan1, Raktima Raychowdhury1, Qiandong Zeng1, Zehua Chen1, Evan Mauceli1, Nir Hacohen1, Andreas Gnirke1, Nicholas Rhind4, Federica di Palma1, Bruce W. Birren1, Chad Nusbaum1, Kerstin Lindblad-Toh1,5, Nir Friedman2,6, Aviv Regev1,3,7
15 May 2011
Massively parallel cDNA sequencing has opened the way to deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here, we present the Trinity methodology for de novo full-length transcriptome reconstruction, and evaluate it on samples from fission yeast, mouse, and whitefly, an insect whose genome has not yet been sequenced. Trinity fully reconstructs a large fraction of the transcripts present in the data, and also reports alternative splice isoforms and transcripts from recently duplicated genes. In all cases, Trinity performs better than other available de novo transcriptome assembly programs, and its sensitivity is comparable to that of methods relying on genome alignments. Our approach provides a unified and general solution for transcriptome reconstruction in any sample, especially in the complete absence of a reference genome.