Transcriptome databases are an important source of structural and functional information about an organism, particularly for plants without a sequenced genome. This is the case for the olive tree (Olea europaea L.), one of the most important oil-producing plant species worldwide. In addition, reproductive tissues and seeds are the least studied parts of this plant species, despite their importance in allergies, germination success, and plant sterility, and despite being a source of valuable components for the agro-food industry, including seed storage proteins and triacylglycerides. Therefore, an automated workflow was developed with our tool AutoFlow to construct an annotated transcriptome from raw reads (Sanger, Illumina, Roche/454, or a combination of them), combining open-source software (Bowtie2, CAP2, Euler-SR, MIRA3, Velvet/Oases, AutoFact, MREPS, GigaBayes…) with software developed by our group (SeqTrimNext, Full-LengtherNext, Sma3). The resulting transcriptomes were used to build the database ReprOlive (http://reprolive.eez.csic.es), where descriptions, GO terms, InterPro signatures, EC numbers, graphical localization of enzymes in KEGG pathways, ORFs, SSRs, and the corresponding orthologues in Arabidopsis thaliana from TAIR and RefSeq can be browsed. Finally, expression data can be accessed and, in addition to a BLAST search, a semantic conceptualization using RDF was implemented, allowing Linked Data searches that extract the most up-to-date information on enzymes, interactions, allergens, and structures. The olive tree reproductive transcriptome was constructed from 2,077,309 raw reads (454/Roche Titanium+) and 1,549 Sanger sequences from different stages of pollen and stigma development, resulting in 72,846 contigs, of which 63,965 (87.8%) included at least one functional annotation and 55,356 (75.9%) had an orthologue.
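As a quick check of the annotation coverage quoted above, the following minimal Python sketch recomputes the two percentages from the contig counts given in the abstract. The helper name `percent_of_total` and the constant names are illustrative assumptions; they are not part of AutoFlow or ReprOlive.

```python
# Counts quoted in the abstract for the reproductive (pollen/stigma) assembly.
TOTAL_CONTIGS = 72_846
ANNOTATED_CONTIGS = 63_965   # contigs with at least one functional annotation
ORTHOLOGUE_CONTIGS = 55_356  # contigs with an orthologue


def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


annotated_pct = percent_of_total(ANNOTATED_CONTIGS, TOTAL_CONTIGS)
orthologue_pct = percent_of_total(ORTHOLOGUE_CONTIGS, TOTAL_CONTIGS)

print(f"annotated:  {annotated_pct:.2f}%")
print(f"orthologue: {orthologue_pct:.2f}%")
```

The first ratio comes out at about 87.8%. The second falls just under 76%, which matches the quoted 75.9% only if that figure was truncated rather than rounded.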
For the seed transcriptome, 1,425,911 raw reads (454/Roche Titanium+) from different seed stages are being used. Uses of these transcriptomes are described in the communications by Carmona et al. and Jiménez-Quesada et al. at this congress. This work was supported by co-funding from the ERDF, the Spanish MINECO, and the Andalusian PAIDI through grants BFU2011-22779, TIN2011-25840, TIN2014-58304-R, P10-CVI-6075, P10-AGR-6274, P11-CVI-7487, P11-TIC-7529 and P12-TIC-1519. The authors also acknowledge the use of the SCBI facilities of UMA.

Content

Author and article information

Conference

Title: ScienceOpen Posters

Publisher: ScienceOpen

Publication date: May 28 2015

Article

DOI: 10.14293/P2199-8442.1.SOP-LIFE.PC37GO.v1

SO-VID: 609a15a2-15e8-487a-bc4e-ee918a47431e

License:

This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

Conference name: 3rd Plant Genomics Congress

History

ScienceOpen disciplines: Plant science & Botany, Bioinformatics & Computational biology

Keywords: transcriptome, olive tree, pollen, stigma, seeds, assembling, annotation, allergen

AUTOMATED CONSTRUCTION OF TRANSCRIPTOME DATABASES FOR UNCHARACTERISED-GENOME PLANTS, SUCH AS OLIVE TREE
