      Is Open Access

      Rapid protein evolution, organellar reductions, and invasive intronic elements in the marine aerobic parasite dinoflagellate Amoebophrya spp

      research-article


          Abstract

          Background

          Dinoflagellates are aquatic protists that are particularly widespread in the world's oceans. Some are responsible for toxic blooms while others live in symbiotic relationships, either as mutualistic symbionts in corals or as parasites infecting other protists and animals. Dinoflagellates harbor atypically large genomes (~3 to 250 Gb), with gene organization and gene expression patterns very different from those of closely related apicomplexan parasites. Here we sequenced and analyzed the genomes of two early-diverging and co-occurring parasitic dinoflagellate Amoebophrya strains, to shed light on the emergence of such atypical genomic features, dinoflagellate evolution, and host specialization.

          Results

          We sequenced, assembled, and annotated high-quality genomes for two Amoebophrya strains (A25 and A120), using a combination of Illumina paired-end short-read and Oxford Nanopore Technology (ONT) MinION long-read sequencing approaches. We found that a small number of transposable elements, short introns and intergenic regions, and a limited number of gene families together contribute to the compactness of the Amoebophrya genomes, a feature potentially linked with parasitism. While the majority of Amoebophrya proteins (63.7% of A25 and 59.3% of A120) had no functional assignment, we found many orthologs shared with Dinophyceae. Our analyses revealed a strong tendency for genes to be encoded in unidirectional clusters and high levels of synteny conservation between the two genomes despite low interspecific protein sequence similarity, suggesting rapid protein evolution. Most strikingly, we identified a large portion of non-canonical introns, including repeated introns, displaying a broad variability of associated splicing motifs never observed among eukaryotes. Those introner elements appear to have the capacity to spread over their respective genomes in a manner similar to transposable elements. Finally, we confirmed the reduction of organelles observed in Amoebophrya spp., i.e., loss of the plastid and potential loss of a mitochondrial genome and functions.

          Conclusion

          These results expand the range of atypical genome features found in basal dinoflagellates and raise questions regarding speciation and the evolutionary mechanisms at play while parasitism was selected for in this particular unicellular lineage.

          Supplementary information

          The online version contains supplementary material available at 10.1186/s12915-020-00927-9.

                Author and article information

                Contributors
                betina@genoscope.cns.fr
                lguillou@sb-roscoff.fr
                Journal
                BMC Biol
                BMC Biology
                BioMed Central (London )
                1741-7007
                6 January 2021
                2021
                : 19
                : 1
                Affiliations
                [1] GRID grid.460789.4, ISNI 0000 0004 4910 6535, Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057 Evry, France
                [2] GRID grid.36425.36, ISNI 0000 0001 2216 9681, School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, New York 11794 USA
                [3] GRID grid.5342.0, ISNI 0000 0001 2069 7798, Center for Plant Systems Biology, VIB, Ghent, Belgium, & Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
                [4] GRID grid.464101.6, ISNI 0000 0001 2203 0006, Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680 Roscoff, France
                [5] GRID grid.462844.8, ISNI 0000 0001 2308 1657, Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680 Roscoff, France
                [6] GRID grid.460789.4, ISNI 0000 0004 4910 6535, URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
                [7] Unité Molécules de Communication et Adaptation des Microorganismes (MCAM, UMR7245), Muséum national d’Histoire naturelle, CNRS, CP 52, 57 rue Cuvier, 75005 Paris, France
                [8] GRID grid.464101.6, ISNI 0000 0001 2203 0006, Sorbonne Université, CNRS, UMR 8227, Station Biologique de Roscoff, Place Georges Teissier, 29680 Roscoff, France
                [9] GRID grid.5685.e, ISNI 0000 0004 1936 9668, Centre for Novel Agricultural Products, Department of Biology, University of York, Heslington, York, YO10 5DD UK
                [10] GRID grid.217197.b, ISNI 0000 0000 9813 0452, Algal Resources Collection, MARBIONC, Center for Marine Sciences, University of North Carolina Wilmington, 5600 Marvin K. Moss Lane, Wilmington, NC 28409 USA
                [11] Department of Biochemistry, Genetics and Microbiology, Pretoria, South Africa
                Author information
                https://orcid.org/0000-0001-7725-2589
                https://orcid.org/0000-0002-8738-6705
                https://orcid.org/0000-0003-3494-7916
                https://orcid.org/0000-0002-5830-3253
                https://orcid.org/0000-0001-6354-2278
                https://orcid.org/0000-0002-7140-6417
                https://orcid.org/0000-0002-1454-6018
                https://orcid.org/0000-0003-0169-5302
                https://orcid.org/0000-0003-1032-7958
                Article
                927
                10.1186/s12915-020-00927-9
                7789003
                33407428
                5b45ea27-9a71-40b2-bf8e-119ef0803b45
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                : 24 May 2020
                : 12 November 2020
                Categories
                Research Article

                Life sciences
                non-canonical introns, introner elements, genome, parasite, dinoflagellate
