      Nanopore sequencing and full genome de novo assembly of human cytomegalovirus TB40/E reveals clonal diversity and structural variations

      research-article


          Abstract

          Background

          Human cytomegalovirus (HCMV) has a double-stranded DNA genome of approximately 235 kbp that is structurally complex and includes extended GC-rich repeat regions. Genomic recombination events are frequent in HCMV cultures and have also been observed in vivo. The assembly of whole HCMV genomes from technologies that produce reads shorter than 500 bp is therefore technically challenging. Here we improved the reconstruction of full HCMV genomes by applying a hybrid, de novo genome-assembly pipeline to data generated with the recently released MinION MkI B sequencer from Oxford Nanopore Technologies.

          Results

          The MinION run of the HCMV (strain TB40/E) library yielded ~47,000 reads from a single R9 flowcell and ~100× average read depth across the virus genome. We developed a novel, self-correcting bioinformatics algorithm to assemble the pooled HCMV genomes in three stages. In the first stage, long contigs (N50 = 21,892) of lower accuracy were reconstructed. In the second stage, short contigs (N50 = 5,686) of higher accuracy were assembled. In the final stage, the high-quality contigs served as templates for correcting the longer contigs, resulting in a high-accuracy, full genome assembly (N50 = 41,056). We were able to reconstruct a single representative haplotype without employing any scaffolding steps. The majority (98.8%) of the genomic features from the reference strain were accurately annotated on this full genome construct. Our method also allowed the detection of multiple alternative sub-genomic fragments and non-canonical structures, suggesting rearrangement events between the unique (UL/US) and the repeated (T/IRL/S) genomic regions.
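
          The N50 and average read depth figures quoted in the Results are standard summary statistics. The sketch below shows how such values are conventionally computed. It is not the authors' pipeline: the function names `n50` and `mean_depth` and the toy contig lengths are invented here for illustration only.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig (or read) lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept as a guard


def mean_depth(read_lengths: Iterable[int], genome_length: int) -> float:
    """Average read depth: total sequenced bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


# Hypothetical contig lengths (not data from the article).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

          Because N50 depends only on the length distribution, a higher value indicates a more contiguous assembly but says nothing about base-level accuracy, which is why the abstract reports accuracy and N50 separately for each stage.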

          Conclusions

          Third-generation high-throughput sequencing technologies can accurately reconstruct full-length HCMV genomes, including their low-complexity and highly repetitive regions. Full-length HCMV genomes could prove crucial in understanding the genetic determinants and viral evolution underpinning drug resistance, virulence and pathogenesis.

                Author and article information

                Contributors
                +306970072433 , tkaram@pasteur.gr , timokratis@gmail.com
                bonnie.van-wilgenburg@ndm.ox.ac.uk
                mrw1004@cam.ac.uk
                paul.klenerman@ndm.ox.ac.uk
                +306973687010 , gmagi@med.uoa.gr
                Journal
                BMC Genomics
                BioMed Central (London)
                ISSN: 1471-2164
                Published: 2 August 2018
                Volume: 19
                Article number: 577
                Affiliations
                [1] ISNI 0000 0004 1936 8948, GRID grid.4991.5, Department of Zoology, University of Oxford, Oxford, United Kingdom
                [2] GRID grid.418497.7, Public Health Laboratories, Department of Microbiology, Hellenic Pasteur Institute, 127 Vas Sofias Ave, 11527 Athens, Greece
                [3] ISNI 0000 0004 1936 8948, GRID grid.4991.5, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom
                [4] ISNI 0000 0001 2188 5934, GRID grid.5335.0, Department of Medicine, University of Cambridge, Cambridge, United Kingdom
                [5] ISNI 0000 0001 2116 3923, GRID grid.451056.3, NIHR Biomedical Research Centre, Oxford, United Kingdom
                [6] ISNI 0000 0001 2155 0800, GRID grid.5216.0, Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, M. Asias 75 str., 11527 Athens, Greece
                Author information
                http://orcid.org/0000-0003-0841-9159
                Article
                4949
                10.1186/s12864-018-4949-6
                6090854
                30068288
                2b02909a-379e-41b5-89a3-6b5523227e04
                © The Author(s). 2018

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 6 March 2018
                Accepted: 19 July 2018
                Funding
                Funded by: Medical Research Council (FundRef http://dx.doi.org/10.13039/501100000265)
                Award ID: MR/K010565/1
                Categories
                Research Article
                Genetics
                human cytomegalovirus, nanopore, MinION, de novo assembly, recombination, mutation, variable number tandem repeats, quasi-species
