
      Proteomics informed by transcriptomics for characterising active transposable elements and genome annotation in Aedes aegypti




          Aedes aegypti is a vector for the (re-)emerging human pathogens dengue, chikungunya, yellow fever and Zika viruses. Almost half of the Ae. aegypti genome consists of transposable elements (TEs). Transposons have been linked to diverse cellular processes, including the establishment of viral persistence in insects, an essential step in the transmission of vector-borne viruses. However, until now it has not been possible to study the overall proteome derived from an organism’s mobile genetic elements, partly due to the highly divergent nature of TEs. Furthermore, as for many non-model organisms, incomplete genome annotation has hampered proteomic studies on Ae. aegypti.


          We analysed the Ae. aegypti proteome using our new proteomics informed by transcriptomics (PIT) technique, which bypasses the need for genome annotation by identifying proteins through matched transcriptomic (rather than genomic) data. Our data vastly increase the number of experimentally confirmed Ae. aegypti proteins. The PIT analysis also identified hotspots of incomplete genome annotation, and showed that poor sequence and assembly quality do not explain all annotation gaps. Finally, in a proof-of-principle study, we developed criteria for the characterisation of proteomically active TEs. Protein expression did not correlate with a TE’s genomic abundance at different levels of classification. Most notably, long terminal repeat (LTR) retrotransposons were markedly enriched compared to other elements. PIT was superior to ‘conventional’ proteomic approaches in both our transposon and genome annotation analyses.


          We present the first proteomic characterisation of an organism’s repertoire of mobile genetic elements, which will open new avenues of research into the function of transposon proteins in health and disease. Furthermore, our study provides proof of concept that PIT can be used to evaluate a genome’s annotation and so guide annotation efforts, which has the potential to improve the efficiency of annotation projects in non-model organisms. PIT therefore represents a valuable new tool to study the biology of the important vector species Ae. aegypti, including its role in transmitting emerging viruses of global public health concern.
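
          As a rough illustration of the idea behind PIT described above (identifying proteins against matched transcriptomic rather than genomic data), the sketch below translates assembled transcripts in all six reading frames and looks up observed peptides in the resulting open reading frames. It is not the authors' pipeline: the function names, the `min_len` cut-off, the plain substring matching and the toy sequence are all assumptions made for illustration. A real workflow would use a de novo assembler and a peptide search engine rather than this matching step.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA sequence from its first base; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(transcript, min_len=5):
    """Collect stop-free stretches of at least min_len residues from all six frames.

    min_len is a hypothetical cut-off chosen for this sketch.
    """
    seq = transcript.upper()
    orfs = set()
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) >= min_len:
                    orfs.add(segment)
    return orfs


def match_peptides(peptides, transcripts, min_len=5):
    """Map each peptide to the sorted IDs of transcripts whose ORFs contain it.

    Exact substring matching stands in for a real peptide search engine.
    """
    database = {
        tid: six_frame_orfs(seq, min_len) for tid, seq in transcripts.items()
    }
    return {
        pep: sorted(
            tid for tid, orfs in database.items()
            if any(pep in orf for orf in orfs)
        )
        for pep in peptides
    }


if __name__ == "__main__":
    # Toy transcript invented for this example; not data from the study.
    toy_transcripts = {"t1": "ATGGCCAAGCTGGAATAA"}
    print(match_peptides(["AKLE", "WWWWW"], toy_transcripts))
```

          Because the search database is built from transcripts observed in the same sample, a peptide can be assigned to a transcript even where the genome annotation has no corresponding gene model, which is the property the abstract relies on when it uses PIT to flag annotation gaps.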

          Electronic supplementary material

          The online version of this article (doi:10.1186/s12864-016-3432-5) contains supplementary material, which is available to authorized users.



                Author and article information

                BMC Genomics
                BioMed Central (London)
                19 January 2017
                Volume: 18
                [1] ISNI 0000 0004 1936 7603, GRID grid.5337.2, School of Cellular and Molecular Medicine, University of Bristol, Bristol, BS8 1TD, UK
                [2] ISNI 0000 0001 0670 2351, GRID grid.59734.3c, Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
                [3] ISNI 0000 0004 1754 9358, GRID grid.412892.4, College of Applied Medical Sciences, Taibah University, Medina, Kingdom of Saudi Arabia
                [4] ISNI 0000 0004 1936 7603, GRID grid.5337.2, School of Biochemistry, University of Bristol, Bristol, BS8 1TD, UK
                [5] ISNI 0000 0001 2171 1133, GRID grid.4868.2, School of Biological and Chemical Sciences, Queen Mary University of London, London, E1 4NS, UK
                [6] ISNI 0000 0004 0407 4824, GRID grid.5475.3, Present address: Department of Microbial Sciences, University of Surrey, Guildford, GU2 7XH, UK
                © The Author(s). 2017

                Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

                Funded by: FundRef, Medical Research Council
                Award ID: G0801973
                Funded by: FundRef, Biotechnology and Biological Sciences Research Council
                Award IDs: BB/L018438/1, BB/K016075/1
                Funded by: FundRef, National Institute of Allergy and Infectious Diseases
                Award ID: R01AI073450
                Funded by: FundRef, Wellcome Trust
                Award ID: 096062
                Research Article

