      Is Open Access

      Murine and related chapparvoviruses are nephro-tropic and produce novel accessory proteins in infected kidneys

      research-article


          Abstract

          Mouse kidney parvovirus (MKPV) is a member of the provisional genus Chapparvovirus that causes renal disease in immune-compromised mice, with a disease course reminiscent of polyomavirus-associated nephropathy in immune-suppressed kidney transplant patients. Here we map four major MKPV transcripts, created by alternative splicing, to a common initiator region, and use mass spectrometry to identify “p10” and “p15” as novel chapparvovirus accessory proteins produced in MKPV-infected kidneys. p15 and the splicing-dependent putative accessory protein NS2 are conserved in all near-complete amniote chapparvovirus genomes currently available (from mammals, birds and a reptile). In contrast, p10 may be encoded only by viruses with >60% amino acid identity to MKPV. We show that MKPV is kidney-tropic and that the bat chapparvovirus DrPV-1 and a non-human primate chapparvovirus, CKPV, are also found in the kidneys of their hosts. We propose, therefore, that many mammal chapparvoviruses are likely to be nephrotropic.

          Author summary

          Parvoviruses are small, genetically simple single-strand DNA viruses that remain viable outside their hosts for very long periods of time. They cause disease in several domesticated species and in humans. Mouse kidney parvovirus (MKPV) is a causative agent of kidney failure in immune-compromised mice and is the only member of the provisional Chapparvovirus genus for which the complete genome including telomeres is known. Here, we show that MKPV propagates almost exclusively in the kidneys of naturally infected mice, wherein it produces novel accessory proteins whose coding regions are conserved in amniote-associated chapparvovirus sequences. We assemble a closely related complete viral genome present in DNA extracted from the kidney of a wild Cebus imitator monkey, and show that another related chapparvovirus is preferentially found in kidneys of the vampire bat Desmodus rotundus. We conclude that many mammal-hosted chapparvoviruses are adapted to the kidney niche and may therefore cause disease following kidney stress in multiple species.


          Most cited references (30)

          Is Open Access

          ICTV Virus Taxonomy Profile: Parvoviridae

          Members of the family Parvoviridae are small, resilient, non-enveloped viruses with linear, single-stranded DNA genomes of 4–6 kb. Viruses in two subfamilies, the Parvovirinae and Densovirinae, are distinguished primarily by their respective ability to infect vertebrates (including humans) versus invertebrates. Being genetically limited, most parvoviruses require actively dividing host cells and are host and/or tissue specific. Some cause diseases, which range from subclinical to lethal. A few require co-infection with helper viruses from other families. This is a summary of the International Committee on Taxonomy of Viruses (ICTV) Report on the Parvoviridae, which is available at www.ictv.global/report/parvoviridae.

            Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters.

            The core promoter of eukaryotic genes is the minimal DNA region that recruits the basal transcription machinery to direct efficient and accurate transcription initiation. The fraction of human and yeast genes that contain specific core promoter elements such as the TATA box and the initiator (INR) remains unclear and core promoter motifs specific for TATA-less genes remain to be identified. Here, we present genome-scale computational analyses indicating that approximately 76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1-binding sites. We further identify two motifs - M3 (SCGGAAGY) and M22 (TGCGCANK) - that occur preferentially in human TATA-less core promoters. About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only approximately 10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, approximately 46% of human core promoters contain the consensus INR (YYANWYY) and approximately 30% are INR-containing TATA-less genes. Significantly, approximately 46% of human promoters lack both TATA-like and consensus INR elements. Surprisingly, mammalian-type INR sequences are present - and tend to cluster - in the transcription start site (TSS) region of approximately 40% of yeast core promoters and the frequency of specific core promoter types appears to be conserved in yeast and human genomes. Gene Ontology analyses reveal that TATA-less genes in humans, as in yeast, are frequently involved in basic "housekeeping" processes, while TATA-containing genes are more often highly regulated, such as by biotic or stress stimuli. These results reveal unexpected similarities in the occurrence of specific core promoter types and in their associated biological processes in yeast and humans and point to novel vertebrate-specific DNA motifs that might play a selective role in TATA-independent transcription.
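The consensus motifs quoted in this abstract use IUPAC degenerate-base notation (for example, Y is C or T and W is A or T). As a minimal sketch, and not the authors' genome-scale method, such motifs can be expanded into regular expressions and located in a sequence. The function names and the example sequence below are hypothetical, and only the given strand is scanned.

```python
import re

# IUPAC nucleotide codes needed for the motifs quoted in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Motif strings exactly as given in the abstract.
MOTIFS = {
    "TATA box": "TATAWAWR",
    "INR": "YYANWYY",
    "M3": "SCGGAAGY",
    "M22": "TGCGCANK",
}


def motif_to_regex(motif):
    """Expand an IUPAC motif into an equivalent regular-expression string."""
    return "".join(IUPAC[base] for base in motif)


def find_motifs(sequence):
    """Return {motif name: [0-based start positions]} for the given strand."""
    sequence = sequence.upper()
    hits = {}
    for name, motif in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
        hits[name] = [m.start() for m in pattern.finditer(sequence)]
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the study.
    print(find_motifs("GGTATAAAAGCCAGTCC"))
```

A real promoter scan would also need to search the reverse complement and restrict matches to a window around the transcription start site, which this sketch leaves out.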

              Metagenomic study of the viruses of African straw-coloured fruit bats: Detection of a chiropteran poxvirus and isolation of a novel adenovirus

Introduction

Zoonoses caused by unknown agents represent a significant proportion of the challenge of emerging infectious diseases (EIDs) (Morens et al., 2004). Viruses account for approximately 25–44% of all EIDs (Jones et al., 2008; Taylor et al., 2001) and studies suggest they are the pathogen class most likely to emerge (Cleaveland et al., 2007; Dobson and Foufopoulos, 2001). Hantaviruses, henipaviruses, SARS coronaviruses and filoviruses are all viruses of zoonotic origin. Nearly 80% of zoonotic EIDs originate from wildlife, and the overall contribution of wildlife pathogens to human EID events is increasing and represents an ongoing threat to global health (Cleaveland et al., 2007; Jones et al., 2008). For example, a novel coronavirus associated with acute respiratory disease was recently diagnosed in pneumonia patients in Saudi Arabia and London (Bermingham et al., 2012; Zaki et al., 2012). Analysis of the novel coronavirus genome suggests a possible bat origin (Bermingham et al., 2012). South and East Asia, Eastern Europe, Latin America and tropical Africa constitute areas of increased relative risk for zoonotic emergence from wildlife (Jones et al., 2008; Morens et al., 2004).

Numerous studies have successfully combined metagenomics with next generation sequencing to explore the viruses of different animal species, including: domestic pigs and turkeys (Day et al., 2010; Shan et al., 2011); Californian sea lions (Li et al., 2011); and rodents (Phan et al., 2011). Characterising the viruses of candidate reservoir species in high-risk geographical areas is an important step toward better understanding viral emergence.

Bats are the primary reservoirs for many viral zoonoses, including henipaviruses, filoviruses, some lyssaviruses and SARS-like coronaviruses (Halpin et al., 2000; Kuzmin et al., 2008; Li et al., 2005; Luby et al., 2009; Towner et al., 2007).
Indeed, seminal work has been recently published on the role of bats as natural reservoirs of paramyxoviruses (Drexler et al., 2012). Detailed studies of the viruses of insectivorous bat species in both North America and China have been conducted (Donaldson et al., 2010; Ge et al., 2012; Li et al., 2010a; Wu et al., 2012). These studies found large numbers of insect and plant viruses, which were thought to reflect dietary inputs, as well as phage sequences and mammalian viruses. The majority of the mammalian viruses identified in those studies were those previously identified in bats (often with high diversity being reported in individual populations) and include: Adenoviridae (Li et al., 2010c); Parvoviridae (Li et al., 2010b); Circoviridae (Ge et al., 2011); Coronaviridae (Tang et al., 2006; Woo et al., 2006) and Astroviridae (Chu et al., 2008). Papillomaviridae and Herpesviridae sequences were also commonly found (Donaldson et al., 2010; Ge et al., 2012; Wu et al., 2012) and some studies also reported Picornaviridae, Flaviviridae and Retroviridae (Li et al., 2010a; Wu et al., 2012).

If we consider that the ∼1200 bat species constitute approximately 20% of the class Mammalia and that they are near-globally distributed, the benefits of expanding our knowledge of bat viruses on geographic and taxonomic levels become evident. Here we conducted a metagenomic study to detect viruses of E. helvum, a frugivorous African bat species that is widely distributed and migratory throughout much of sub-Saharan Africa. The species is eaten as bushmeat, and the populations studied have ample opportunities for human contact, including a roost directly over a hospital in Accra, Ghana (Hayman et al., 2012).

Results

Bioinformatics analysis

Performance comparison among assemblers

Here, we show important performance differences among four different assemblers (Velvet, ABySS, MetaIDBA and MetaCortex) and three sample types.
The assemblers generated different numbers of contigs, with MetaCortex and ABySS producing more sequences than Velvet and MetaIDBA for each sample type (Fig. 1A). Sample type also affected the number of contigs, with increasing cellularity resulting in more contigs (Table 1), except with MetaIDBA, where the sample allowed to iterate over a kmer size-range generated the most contigs. Contig length parameters also varied with assembler and sample type. Velvet generated contigs with the longest average length across sample types, while ABySS typically produced the longest contigs. Regarding sample type, the longest contigs were generated from the throat sample for each assembler (Table 1).

As well as generating contigs of differing length and number, the nucleotide composition of contigs varied among assemblers. Base composition of the total assembled contigs revealed important differences (Supp. Fig. 2A). Velvet and MetaCortex contigs had similar base compositions, while ABySS contigs incorporated non-ATCG notations (e.g. N, R, Y, Supp. Fig. 2A), and MetaIDBA contigs were primarily composed of adenine (though this was not true for MetaIDBA contigs included in final analyses (Supp. Fig. 2B)).

Consolidation of contigs combines the strengths of assembler approaches and reduces complexity

De novo assemblers have different contig-construction methods, resulting in strengths and weaknesses in different situations. By consolidating contigs from multiple assemblers, we combined the strengths, while reducing the computational complexity of analyzing contigs from assemblers separately. The proportion of contigs retained after consolidation differed among assemblers. For example, ≤22% of ABySS contigs but ≥94% of MetaIDBA contigs were retained into the final consolidated set for each sample type (Table 2). The consolidation resulted in the discard of approximately 30% of the total assembled contigs (∼4.4 million sequences).
Consistent with the observed assembler differences in contig generation, a variable proportion of sequences were also length-excluded per assembler, but these proportions were approximately equal among sample types (Fig. 1B, Table 2).

Eidolon helvum samples contain large numbers of viral sequences

To identify viral sequences in the consolidated contigs, we used multiple algorithms, which had different efficacies. While BLASTn identified 258 suspect-viral sequences among the sample types, BLASTx and tBLASTx identified a further 6448 and 2563 viral sequences, respectively. Manual exclusion and curation of these 9269 suspect-viral sequences was used to further focus analysis on viral sequences of interest (Fig. 1B). Here, we aimed to explore the viruses that likely infect E. helvum (for which they probably constitute a natural reservoir), not the viruses infecting their dietary inputs or bacterial flora. Consequently, 8095 suspect-viral sequences related to viral families not known to infect vertebrates were excluded from further analysis. Subsequent to close inspection of the remaining 1174 suspect-viral sequences, a further 11 sequences were removed due to incorrect classification in the database (not shown). This resulted in 1363 viral sequences related to eight mammalian-infecting viral families being identified.

While the majority (77%) were related to viruses with double stranded DNA (dsDNA) genomes, 21% were related to retroviruses (classified separately as sequences may have derived from exogenous-RNA or proviral-DNA forms) and single-stranded DNA and positive-sense single-stranded RNA viruses were also present (Table 3, Fig. 1C). All sample types, assembly algorithms and BLAST comparison algorithms identified viral sequences (Table 3, Fig. 1C). By using multiple assembly and identification algorithms we generated more contigs and identified more viral sequences.
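The sequential identification summarized above (BLASTn first, then BLASTx, then tBLASTx, each applied only to contigs not yet flagged) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: the hit records are hypothetical dictionaries standing in for parsed BLAST output, and no BLAST program is actually invoked. The e-value threshold of 0.0001 and the "lowest e-value is the best hit" rule are taken from the Materials and methods.

```python
E_VALUE_CUTOFF = 1e-4  # retention threshold stated in the Materials and methods

# Order in which the comparisons are applied, as described in the text.
STAGES = ("BLASTn", "BLASTx", "tBLASTx")


def best_hit(hits):
    """Return the hit with the lowest e-value, or None when there are no hits."""
    return min(hits, key=lambda hit: hit["evalue"], default=None)


def classify(contig_hits):
    """Return the first stage that flags a contig as suspect-viral, else None.

    contig_hits is a hypothetical structure: stage name -> list of hits,
    where each hit is a dict with 'evalue' (float) and 'is_viral' (bool).
    """
    for stage in STAGES:
        hit = best_hit(contig_hits.get(stage, []))
        if hit is not None and hit["is_viral"] and hit["evalue"] <= E_VALUE_CUTOFF:
            return stage
    return None  # not retained by any comparison, so the contig is discarded


if __name__ == "__main__":
    # Toy example: the best BLASTn hit is non-viral, so the contig moves on
    # to BLASTx, where a viral protein hit passes the threshold.
    example = {
        "BLASTn": [{"evalue": 1e-30, "is_viral": False}],
        "BLASTx": [{"evalue": 1e-8, "is_viral": True}],
    }
    print(classify(example))
```

Because each stage only sees contigs that earlier stages did not flag, the per-algorithm counts in the text are additive rather than overlapping.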
Analysis of viral sequences by family

Herpesviridae

We identified 539 sequences related to herpesviruses, mostly from the throat sample (Table 3). Sequences related to a wide range of genes and proteins involved in diverse functions including gene regulation, nucleotide metabolism, DNA replication as well as envelope glycoproteins and other structural proteins. Most sequences related to members of the Betaherpesvirinae (n=366) and Gammaherpesvirinae (n=171), and only two sequences were most closely related to the Alphaherpesvirinae (Supp. Table 1). Phylogenetic analysis of a region of the DNA polymerase showed the presence of distinct herpesviruses in the throat sample, including some related to other bat betaherpesviruses (Zhang et al., 2012) and a novel gammaherpesvirus (Fig. 2). The presence of contig th_687866 in the throat sample was confirmed by PCR and sequencing (not shown).

Papillomaviridae

Most papillomavirus sequences derived from the throat sample (405 of 408), but the other sample types also contained papillomavirus sequences (Table 3). Sequences related to both early (E) and late (L) genes of viral replication in proportions approximate to gene length (Fig. 3A). The sequences related to members of genetically-diverse genera within the Papillomaviridae (Supp. Table 1, Fig. 3B). Phylogenetic analysis of overlapping fragments showed two related sequences of novel papillomavirus(es), one (th_683255) with 76% aa ID with an incomplete, unpublished ‘Eidolon helvum papillomavirus 1’ and another (th_NODE_12326) with 64% aa ID with Rousettus aegyptiacus papillomavirus 1. We confirmed the presence of contig th_679786 by PCR and sequencing (not shown).

Adenoviridae

Sequences related to adenoviruses were present in all three sample types (Table 3). All sequences related to members of the mammalian-infecting genus Mastadenovirus.
When aligned with the prototypic member of this genus (Human adenovirus C, NC_001405), the 68 adenovirus-like contigs displayed 65–100% aa identity with twelve proteins involved in capsid morphogenesis, DNA replication and encapsidation, and apoptosis (Supp. Table 1). We isolated an adenovirus (referred to as E. helvum adenovirus 1) from a urine sample obtained from this bat population (Fig. 4A and B), and although phylogenetic analysis identified it as a mastadenovirus, it was distinct from those previously described in bats (Fig. 4C). Notably, E. helvum adenovirus 1 clustered with human adenoviruses. Contigs from the throat (th_NODE_10144) and urine samples (ur_NODE_27579) shared 77% and 90% aa identity with the isolated virus’ hexon protein over distances of 58 and 63 aa, respectively.

Poxviridae

We detected 38 contigs related to poxviruses, all derived from the throat sample and related to chordate-infecting poxviruses. Most (n=25) were related to Molluscum contagiosum (MC), a human contagion (Supp. Table 1). We compared all poxvirus contigs against MC reference proteins (NC_001731) using BLASTx. The sequences shared 29–74% aa identity with 23 different MC proteins and had e-values ranging from 3.45e−111 to 8.03e−4, showing that all sequences had significant similarity to MC. Sequences were related to proteins in the outer (variable) as well as the inner (core) regions of the genome (Supp. Fig. 3), but no proteins unique to Molluscipox were detected. This relationship was exemplified by phylogenetic analysis of contig th_node1036_0_0_38518, related to the major core protein (Fig. 5). We confirmed the presence of this contig in individual throat swabs by PCR. Five of the forty throat swabs (13% of samples, 95% CI 6–26%) contained this sequence. To further confirm the presence of poxvirus sequences in E. helvum, we aligned the MetaCortex contigs (of any length, from all three sample types) against the MC genome using BLAT (which reports alignment blocks of over 95% identity) and increased the number of poxviral related contigs to 12,845 (Supp. Fig. 3).

Polyomaviridae

We assembled a sequence from the throat sample that was related to the VP1 capsid protein of polyomaviruses. This phylogenetically clustered with primate polyomaviruses with low confidence, likely due to the short length of the sequence (Supp. Fig. 4).

Retroviridae

There were 292 sequences related to retroviruses, primarily derived from the lung sample, though sequences were also present in the urine and throat samples (Table 3). Retroviral sequences related primarily to gamma, beta and unclassified retroviruses (Fig. 6A). The sequences related to all three canonical genes of retroviruses in proportions approximate to gene length (Fig. 6B). Translations of many retrovirus contigs contained stop codons within the region of BLAST alignment, suggesting that they derived from non-functional, endogenous retroviruses. The longest ORF of a partial polymerase protein sequence (th_NODE_62045) was phylogenetically related to, but distinct from, both avian and mammalian viruses (Fig. 6C).

Parvoviridae

Ten sequences derived from the throat (n=8) and urine (n=2) related to members of the Parvoviridae, from both the mammalian-infecting Parvovirinae subfamily and the invertebrate-infecting Densovirinae subfamily (Supp. Fig. 5). The analysed Parvovirinae-like sequence (th_node7292_0_0_7345) related to members of different genera (e.g. Erythrovirus and Betaparvovirus) and was distinct from a known E. helvum parvovirus (Supp. Fig. 5B).

Picornaviridae

Contigs from the urine (n=6) and lung (n=1) sample related to picornaviruses. Urine contigs related to the polyprotein of members of the genus Kobuvirus and the longest sequence, ur_181630, phylogenetically clustered with human and canine kobuviruses (Supp. Fig. 6).
Short sequence lengths precluded useful phylogenetic comparison of these Kobuvirus sequences with those detected in North American insectivorous bats (Li et al., 2010a); however, there was 50% identity over a 90 aa overlapping region. The lung picornavirus sequence related to members of the genus Enterovirus, but was too short (79 bp) for useful phylogenetic analysis.

Discussion

Here we described the first detailed study of metagenomic viral sequences from a megachiropteran species. E. helvum have a wide geographical distribution and live in close contact with human populations. As such, this bat species is an ideal candidate reservoir host, also being a source of bushmeat in Ghana and likely being infected with henipaviruses, Lagos bat virus (lyssavirus) and Ebola virus (filovirus) in this location (Drexler et al., 2009; Drexler et al., 2012; Hayman et al., 2010; Hayman et al., 2008a; Hayman et al., 2008b; Wright et al., 2010). Our results highlight the utility of metagenomic studies to assess zoonotic risk in wildlife.

The impact of bioinformatics tools on metagenomic studies

Differences in the assembler efficacy manifested as differences in number, length-parameters and base-composition of the generated contigs. ABySS and Velvet are consensus assemblers designed to assemble a single genome from sequence reads. Contrastingly, MetaCortex and MetaIDBA are meta-assemblers specifically designed to address situations where multiple genomes would be expected. Generally, consensus assemblers adopt more stringent algorithms for error removal in order to build longer contigs, while meta-assemblers preserve sample variation. The consolidation of contigs from multiple assemblers generated a more robust contig set and reduced the number of sequences by one third, facilitating downstream processing.

The BLAST algorithm used also impacted the number of contigs identified as suspect-viral.
BLASTn identified fewer sequences than tBLASTx against the same database with identical retention criteria. Similarly, BLASTx identified more suspect-viral sequences than BLASTn (although a different database was used), supporting the observation that protein-based comparisons are more effective than nucleotide-based comparisons where divergent sequences are expected (Kunin et al., 2008). The use of multiple identification algorithms here enabled the detection of more viral sequences.

Viral sequences

The relative identification success for mammalian viral sequences in this metagenomic study compared with others, as well as among sample types and analytical tools, provides guidance on how best to approach such studies. Using the Illumina platform, and working with a frugivorous bat species, we found that 28% of viral sequences identified were of mammalian-origin, more than the ≤10% previously identified in insectivorous bat species (Donaldson et al., 2010; Ge et al., 2012; Li et al., 2010a; Wu et al., 2012). The sample type also affected the level of detection, with most viral sequences being derived from the throat sample (though differences attributable to colony differences from which the samples were collected cannot be ruled out). These discrepancies in detection show that the quantitative and qualitative success of viral metagenomic studies is determined partly by the study species, sample type, and molecular and bioinformatic tools used.

Here we aimed to identify viruses circulating in E. helvum that might have zoonotic potential. We detected novel, often diverse, viruses in many viral families, in samples collected over a short time frame from a small number of bats. The viral sequences were often distinct from those previously described in bats and often we saw diversity within the viral family (e.g. herpesvirus and retroviral sequences from different subfamilies, and at least six phylogenetically-distinct papillomaviruses from different genera).
Clearly, a wide range of diverse and previously uncharacterized viruses circulate in E. helvum. Given the proximity of this species with humans over a large geographical area, it is important to consider the zoonotic potential of these viruses. We detected poxvirus, adenovirus and polyomavirus sequences closely related with those from humans and primates (in the latter two cases, more closely related than with those viruses previously described in bats). We also isolated an adenovirus from urine collected directly underneath the colony, a sample type to which humans are regularly exposed. The relatedness of these viruses to human pathogens may indicate that these viruses are more likely to emerge (Antia et al., 2003). Given the relationship of these viruses with human pathogens and E. helvum's high rate of human contact, more extensive active surveillance such as molecular and serological studies of humans in contact with relevant bat populations seems appropriate.

Our study expands the known chiropteran viral profile. While many viral families detected here have previously been reported in bat populations, here we report the detection of poxvirus sequences in bats. Although Li et al. reported pox-related sequences derived from a circovirus (Li et al., 2010a), the viral sequences reported here are likely to derive from a true poxvirus. We detected sequences with a high degree of relatedness to 23 different proteins throughout the genome of M. contagiosum. The presence of this virus in 13% of the throat swabs suggests a high prevalence of poxvirus infection in this bat population. Given the relatedness of the virus to MC, the possibility of zoonotic transmission of poxviruses from bats to humans should be considered further, and in other geographical areas.

The commonalities between our findings and those in other metagenomic studies of bats provide insight on the relationship that bats may have with their natural pathogens.
Of the eight viral families detected here, six have previously been detected during metagenomic studies of insectivorous bats (Donaldson et al., 2010; Ge et al., 2012; Li et al., 2010a; Wu et al., 2012) and one (the Polyomaviridae) was detected in Myotis spp. using consensus PCR (Misra et al., 2009). We worked with a host from a separate taxonomic suborder to those studies and still found similar viruses, suggesting that a common viral footprint may be present in all chiropteran species. Continued description of viral profiles of disparate host species and geographical locations will deepen our understanding of host-pathogen relationships in these important zoonotic reservoir species.

While these results represent an interesting development in the study of these reservoir hosts, the limitations of metagenomic methods should be acknowledged. Only partial sequence information was generated for each viral family under study, limiting analysis. Additionally, although some viral sequences were confirmed by PCR and virus isolation, making them likely-derived from true viruses, the same cannot be said of the retroviral sequences. The high proportion of retroviral sequences in the lung, as well as the presence of stop codons in a number of sequences indicate that they likely derive from endogenous proviruses. Furthermore, in common with other studies of this nature, most viral sequences here were derived from those with dsDNA genomes. Drexler et al. (2012) showed that metagenomic methods could not detect paramyxoviral RNA where consensus PCR was successful. Similarly, this population of E. helvum has been demonstrated to harbour a high prevalence and diversity of paramyxoviruses yet none were detected here (Baker et al., 2012).
Although some progress is being made toward validating the scope of viral metagenomic studies (Sachsenroder et al., 2012), further work is needed to conclude whether this repeatedly-observed bias is biological (perhaps consequent to the long, sometimes latent, infection periods of dsDNA viruses) or if it constitutes a laboratory artifact. Due to these methodological limitations, it is likely that the viral sequences described here are not an exhaustive representation of the viruses of E. helvum.

Materials and methods

Ethics declaration

This study was approved by the Zoological Society of London's animal ethics committee.

Populations under study

Two colonial populations (250,000–1,000,000 bats each) of E. helvum in Ghana were sampled: one in Accra and one in Tano Sacred Grove (TSG, approx. 400 km North, Supp. Fig. 1). The Accra population is urban, roosting in trees over a city center hospital. The TSG population is rural, roosting in a protected forest area. The two populations comprise part of a metapopulation (Peel, 2012). Interspecies co-roosting of these populations was not observed in five years of field study.

Sample collection

Urine for metagenomic analysis was collected twice from beneath the Accra roost, in January and March 2009. Sterile cotton swabs were saturated with urine on plastic sheeting placed beneath the roost, and placed in 1 ml of virus transport medium (VTM: Hank's Balanced Salt Solution, 1% BSA [w/v], gentamicin 100 µg/ml and amphotericin B 2 µg/ml). Urine samples for virus isolation were collected in 2010, as previously described (Baker et al., 2012). Throat swabs were collected from individual, manually-restrained bats caught from TSG in March 2009. Swabs were placed in 1 ml of VTM. Lung tissue was collected from healthy, adult bats euthanized by anaesthetic overdose (ketamine/medetomidine), captured in Accra in March 2009. An individual piece of tissue (approx. 5 mm²) from each bat was used.
Samples were frozen at −80 °C until further processing.

Sample pooling, enrichment and nucleic acid manipulation

This work was performed at the Department of Veterinary Medicine, University of Cambridge. Lung tissues (n=5) were disrupted in 1 ml PBS using sterile pestles before homogenisation and pooling. Urine samples (n=80) were pooled, with individual samples contributing 500 µl of VTM. Throat swabs in VTM (n=40) were pooled similarly. Sample pools were centrifuged (1500 g, 10 min, room temperature) and supernatants filtered (5 µm spin filters then 0.85 µm syringe filters). Filtrates were divided in two before overnight precipitation (at 4 °C) using polyethylene glycol (5% PEG, 0.15 M NaCl), and centrifuged (10,700 g, 1 h, 4 °C). Pellets resuspended in 175 µl PBS were DNAse-treated (RNAse-free DNAse 1, 100 U, Ambion; adapted from Allander et al., 2001) and then RNAse-treated (RNAse A, 0.5 U, Invitrogen), both for 30 min at 37 °C. Total nucleic acids were extracted (High Pure Viral Nucleic Acid Kit, Roche).

Sequence-Independent Single-Primer Amplification (adapted from Djikeng et al., 2008) was performed as follows: cDNA was generated by reverse transcription (Superscript III, Invitrogen) followed by complementary-strand synthesis (DNA polymerase I large [Klenow] fragment, NEB) using primer FR26RV-N. PCR amplification was performed with FR20RV (Advantage 2 PCR kit, Clontech) with thermal profiling: 1 min at 95 °C, 25 cycles of 30 s at 95 °C, 1 min at 65 °C, 2 min 30 s at 68 °C, followed by 1 min at 68 °C. PCR products were size-fractionated (Chromospin 200 columns, Clontech) and visualized on a Bioanalyser DNAchip (Agilent Technologies). For each sample type, 1 µg of DNA was prepared for sequencing.

Next generation sequencing

This work was carried out at the Wellcome Trust Sanger Institute. The DNA library was sheared (Covaris AFA, Covaris) to 200–300 bp and purified (QIAquick spin columns, Qiagen), blunt-end repaired, and ligated to sequencing primers.
Ligation products were purified (Agencourt Ampure SPRI beads, Beckman Coulter Genomics) and the library was 200 bp size-selected by agarose gel electrophoresis. Following purification (Gel extraction kit, Qiagen), the library was PCR amplified for 10 cycles (Phusion DNA polymerase) in triplicate using Illumina adaptor-specific primers. The primers were removed (Agencourt Ampure SPRI beads, Beckman Coulter Genomics), and libraries quantified by qPCR were diluted to 40 nM for cluster generation. Libraries were sequenced on an Illumina GAII (Illumina Inc) for 76 bp paired-end reads. Following data QA, QC and computational primer removal there were 5,218,132 sequence reads from the urine sample, 15,809,698 from the throat sample and 22,530,774 from the lung tissue sample.

Generation of consolidated contiguous sequences (contigs)

Sequences from each sample type were processed individually. Reads were de novo assembled using four assembly algorithms: Velvet 1.1.04 (Zerbino and Birney, 2008), ABySS 1.2.7 (Simpson et al., 2009), MetaIDBA 0.19 (Peng et al., 2011) and MetaCortex (Leggett and Caccamo, personal communication), a recently-developed variant of Cortex (Iqbal et al., 2012). These assemblers are based on de Bruijn graphs, which are constructed by dividing reads into smaller, overlapping sequences called kmers. For ABySS, Velvet and MetaCortex, a range of kmer sizes (21, 31, 41, 51, 61 and 71) were evaluated, with 31 proving optimal (providing the largest number of contigs ≥100 nt and the largest number of viral matches to the NCBI nt database) for Velvet and ABySS, and 61 being optimal for MetaCortex. MetaIDBA iterates over a kmer size range, and the throat sample was iterated over 21–71 nt. When assembling the urine and lung samples, MetaIDBA was unable to reach completion when allowed to iterate, so was fixed to 71 nt.
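The idea of dividing reads into overlapping kmers and linking them in a de Bruijn graph can be illustrated with a short sketch. This is a toy construction under simplifying assumptions (no reverse complements, no error correction, no coverage tracking) and does not reproduce what Velvet, ABySS, MetaIDBA or MetaCortex actually do internally; the function names are hypothetical.

```python
from collections import defaultdict


def kmers(read, k):
    """Split a read into all overlapping substrings of length k."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads, k):
    """Build a simple de Bruijn graph from reads.

    Nodes are (k-1)-mers; each kmer contributes one edge from its
    (k-1)-base prefix to its (k-1)-base suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    # Toy read; real reads in the study were 76 bp.
    print(kmers("ATGCG", 3))
    print(de_bruijn(["ATGCG"], 3))
    # A 76-base read yields 76 - k + 1 kmers for a given k.
    print(len(kmers("A" * 76, 31)))
```

The last line shows why kmer size matters for 76 bp reads: a larger k leaves fewer kmers per read, so overlaps are more specific but coverage of each kmer is lower.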
The four contig sets for each sample type then underwent a consolidation process comprising sequential BLAT alignments (Kent, 2002), followed by removal of shorter contigs that were 95% identical over a 95% length fraction (Fig. 1A). First, contigs generated by Velvet and ABySS were compared (Comparison 1). The retained contigs were then compared with MetaIDBA contigs (Comparison 2), and those still retained were compared with MetaCortex contigs (Comparison 3, Fig. 1A). Consequently, sequences retained after Comparison 3 formed a consolidated contig set with reduced redundancy.

Identification of suspect-viral sequences

Contigs ≤76 nt in length (considered potentially derived from single sequencing reads) were not analysed further (Fig. 1B). Remaining sequences underwent sequential BLAST comparison (Altschul et al., 1990) against NCBI databases (as of November 25, 2011), and taxonomic classification was queried from the NCBI taxonomy web service. Contigs first underwent BLASTn comparison with the NCBI nt database. The taxonomic classification of the source organism of the reference sequence with which the contig 'best' aligned (i.e. the alignment with the lowest expect value [e-value]) was retrieved. If the reference sequence taxonomy was viral and it aligned with the contig with an e-value of ≤0.0001, the sequence was flagged as suspect-viral and retained for further analysis. Sequences not flagged as suspect-viral then proceeded to BLASTx comparison with the NCBI nr database, with the same retention criteria. Sequences still not retained underwent tBLASTx comparison with the NCBI nt database, and were similarly retained. Those not retained in this final comparison round were discarded.

Classification and curation of viral sequences

Suspect-viral sequences related to viral families not known to infect vertebrates were excluded. Remaining suspect-viral sequences were manually curated by examination of the region of database sequence that matched the contig in BLAST alignment.
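The sequential screening described above (length filter, then BLASTn, BLASTx and tBLASTx with a shared e-value threshold) can be sketched as a simple decision cascade. This is an illustrative sketch only, not the authors' pipeline: it does not run BLAST, and it assumes the best hit per stage (lowest e-value) and its viral/non-viral taxonomy have already been computed. The names `BestHit`, `screen_contig` and the stage labels are hypothetical.

```python
from dataclasses import dataclass

MIN_LENGTH = 77      # contigs of 76 nt or fewer were not analysed further
MAX_EVALUE = 1e-4    # retention threshold stated in the text (e-value <= 0.0001)
# Stage order taken from the text; the labels themselves are hypothetical.
STAGES = ("blastn_nt", "blastx_nr", "tblastx_nt")


@dataclass
class BestHit:
    """Best alignment (lowest e-value) for one contig at one stage."""
    evalue: float
    is_viral: bool   # taxonomy of the matched reference sequence


def screen_contig(length, best_hits):
    """Return the stage that flags a contig as suspect-viral, or None if discarded.

    best_hits maps a stage label to a BestHit; a stage with no hit is
    simply absent from the mapping.
    """
    if length < MIN_LENGTH:
        return None
    for stage in STAGES:
        hit = best_hits.get(stage)
        if hit is not None and hit.is_viral and hit.evalue <= MAX_EVALUE:
            return stage
    return None


# Hypothetical contig: non-viral best nucleotide hit, viral best protein hit.
example = {
    "blastn_nt": BestHit(evalue=1e-50, is_viral=False),
    "blastx_nr": BestHit(evalue=1e-6, is_viral=True),
}
print(screen_contig(500, example))
```

Ordering the stages from nucleotide to translated searches means the cheapest comparison is tried first, and only contigs that remain unflagged incur the cost of the translated searches.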
Where the database sequence providing the alignment with the lowest e-value appeared to be classified into the incorrect taxonomic group, all BLAST hits for the contig were examined and the majority taxon was taken to be the likely origin of the sequence (Fig. 1B). Remaining sequences and BLAST results were then grouped by viral family.

Analysis of viral sequences

Reference sequences were downloaded from NCBI, and global alignments with contigs were generated using Clustal X version 2 (Thompson et al., 1994) and Muscle 3.8.31 (Edgar, 2004). Gap-stripped alignments (columns with >50% gaps were removed) were then used to infer phylogenetic trees using MrBayes (Ronquist and Huelsenbeck, 2003), as previously described (Baker et al., 2012). Local BLAST comparisons and pairwise identity calculations were performed using Genomics Workbench (version 5.1, CLC Bio).

Amplification of viral sequences by PCR

Primers (sequences available on request) were designed to detect poxvirus, herpesvirus and papillomavirus contigs in the throat sample nucleic acids submitted for sequencing. Poxvirus PCR was also performed on nucleic acids extracted from the individual throat swabs. PCR (DreamTaqGreenPCR Mastermix, Fermentas) products were visualized by gel electrophoresis, purified (Gel extraction kit, QIAGEN) and sequenced.

Adenovirus isolation and characterization

An adenovirus causing cytopathic effect in Pteropus alecto primary kidney cells (Crameri et al., 2009) was isolated from sample U69 (Baker et al., 2012). Negative contrast electron microscopy (EM) was used to examine culture supernatant 6 days post-infection. Supernatant was adsorbed onto parlodion-filmed copper grids coated with carbon and stained with nano-W stain (Nanoprobes, Yaphank, NY, USA). Thin section EM was used to examine cells 5 days post-infection (as in Weir et al., 2012, except using Sorenson's phosphate buffer [300 mosM/kg, pH 7.2]). The full-length hexon gene of this isolate was sequenced as in Zhang et al. (2012).
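The gap-stripping criterion given under Analysis of viral sequences (columns with >50% gaps removed) can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the tool the authors used: the alignment is represented as a list of equal-length strings, "-" is assumed to be the only gap character, and the function name is hypothetical.

```python
def strip_gappy_columns(alignment, max_gap_fraction=0.5, gap_char="-"):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment is a list of aligned sequences of equal length. A column
    with exactly 50% gaps is kept, because the stated criterion removes
    only columns with more than 50% gaps.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        col for col in range(width)
        if sum(seq[col] == gap_char for seq in alignment) / len(alignment)
        <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


# Toy alignment (hypothetical): the third column is 75% gaps and is removed.
toy = ["AC-T", "A--T", "AG-T", "A-GT"]
print(strip_gappy_columns(toy))
```

Removing sparsely populated columns before tree inference reduces the influence of regions where only a few sequences, such as short contigs, contribute data.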
Nucleotide sequences

Viral sequences discussed here were deposited in GenBank (JX885594–JX885611), except for the polyomavirus sequence, which was too short for submission (Supp. Fig. 4). Sequencing data were deposited in the European Nucleotide Archive (ERP001979). Supplementary fasta files of all viral and suspect-viral sequences and comparison outputs are available online, and assembled contigs are available on request.

                Author and article information

                Contributors
                Roles: Data curation; Investigation; Writing – review & editing
                Roles: Data curation; Formal analysis; Investigation; Methodology
                Roles: Data curation; Investigation
                Roles: Formal analysis; Investigation
                Roles: Data curation; Formal analysis; Investigation
                Roles: Investigation
                Roles: Formal analysis; Investigation
                Roles: Formal analysis; Methodology
                Roles: Formal analysis
                Roles: Resources
                Roles: Methodology
                Roles: Funding acquisition; Resources
                Roles: Data curation; Investigation; Writing – review & editing
                Roles: Resources
                Roles: Methodology; Supervision; Writing – review & editing
                Roles: Methodology; Resources; Validation; Writing – review & editing
                Roles: Data curation; Formal analysis; Funding acquisition; Methodology; Resources; Supervision; Validation; Writing – review & editing
                Roles: Conceptualization; Data curation; Formal analysis; Funding acquisition; Methodology; Project administration; Resources; Supervision; Validation; Writing – review & editing
                Roles: Conceptualization; Data curation; Formal analysis; Funding acquisition; Methodology; Project administration; Resources; Supervision; Validation; Writing – original draft; Writing – review & editing
                Role: Editor
                Journal
                PLoS Pathogens (PLoS Pathog)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-7366, 1553-7374
                Published: 23 January 2020
                Volume: 16
                Issue: 1
                Article number: e1008262
                Affiliations
                [1] Centenary Institute, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
                [2] Proteomics Core Facility, University of Technology Sydney, Sydney, NSW, Australia
                [3] Center for Infection & Immunity, Mailman School of Public Health, Columbia University, New York, NY, United States of America
                [4] Kolling Institute of Medical Research, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
                [5] Virology Research Center, School of Medicine of Ribeirão Preto of the University of São Paulo, Ribeirão Preto, Brazil
                [6] Institut de Biologia Evolutiva, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
                [7] Department of Anthropology and Archaeology, University of Calgary, Alberta, Canada
                [8] Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
                [9] Department of Veterinary Resources, Weizmann Institute of Science, Rehovot, Israel
                [10] Lowy Cancer Research Centre, University of New South Wales Sydney, Sydney, NSW, Australia
                [11] Department of Dermatology, Medical University of Vienna, Vienna, Austria
                [12] Department of Medical Genetics and Alberta Children’s Hospital Research Institute, Cumming School of Medicine, University of Calgary, Alberta, Canada
                [13] Microbiology and Aquatic Diagnostics, IDEXX BioAnalytics, Discovery Drive, Columbia, MO, United States of America
                [14] Laboratory of Comparative Pathology, Center of Comparative Medicine and Pathology, Memorial Sloan Kettering Cancer Center, The Rockefeller University, Weill Cornell Medicine, New York, NY, United States of America
                [15] Autoimmunity, Transplantation, Inflammation (ATI) Disease Area, Novartis Institutes for Biomedical Research, Basel, Switzerland
                University of Kansas Medical Center, UNITED STATES
                Author notes

                B.R. and W.W. are co-inventors on an international patent application (PCT/AU2018/050505) submitted by the Centenary Institute that is related to the detection and use of MKPV in research and commercial applications. M.J.C. is an employee of IDEXX BioAnalytics, a division of IDEXX Laboratories, Inc., a veterinary diagnostics company with a commercial interest in MKPV. B.R. is presently an employee of Novartis Institutes for BioMedical Research. Novartis did not fund the study. IDEXX BioAnalytics and Novartis had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. All other authors declare no competing interests.

                Author information
                http://orcid.org/0000-0001-6429-5344
                http://orcid.org/0000-0002-8283-0643
                http://orcid.org/0000-0003-3514-6627
                http://orcid.org/0000-0002-6276-4685
                http://orcid.org/0000-0002-0552-2962
                http://orcid.org/0000-0002-0207-5423
                http://orcid.org/0000-0001-6922-2072
                http://orcid.org/0000-0002-8228-7281
                http://orcid.org/0000-0002-7393-810X
                http://orcid.org/0000-0001-8221-7278
                http://orcid.org/0000-0002-0509-8962
                http://orcid.org/0000-0002-0025-8293
                http://orcid.org/0000-0002-0612-2514
                http://orcid.org/0000-0002-7038-5079
                http://orcid.org/0000-0001-6705-831X
                http://orcid.org/0000-0003-2841-6887
                http://orcid.org/0000-0002-4593-091X
                http://orcid.org/0000-0002-9307-1056
                Article
                PPATHOGENS-D-19-01559
                10.1371/journal.ppat.1008262
                6999912
                31971979
                027b8d21-f7f0-4eeb-8a45-0c2f13ef734f
                © 2020 Lee et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 1 September 2019
                Accepted: 8 December 2019
                Page count
                Figures: 6, Tables: 0, Pages: 23
                Funding
                Supported by the Australian National Health and Medical Research Council (W.W., B.R., P.B. & J.J.-L.W.), the Cancer Institute NSW (B.R. & J.J.-L.W.), the Hillcrest Foundation (C.J.J.), the Alfred P. Sloan Foundation (S.H.W.), the National Institutes of Health (U19AI109761 Center for Research in Diagnostics and Discovery, S.H.W.), the National Cancer Institute Cancer Center Support Grant P30 CA008748 (S.M.), the Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil (No. 17/13981-0 and 18/09383-3, W.M.S. & M.J.F.), the Natural Sciences and Engineering Research Council of Canada (A.D.M.), the Canada Research Chairs program (A.D.M.), the Alberta Children’s Hospital Research Institute (A.D.M. & J.D.O.) and the Beatriu de Pinós postdoctoral programme of the Government of Catalonia's Secretariat for Universities and Research of the Ministry of Economy and Knowledge (J.D.O.). IDEXX BioAnalytics funded the portion of the reported work performed in their laboratories. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Anatomy > Renal System > Kidneys
                Medicine and Health Sciences > Anatomy > Renal System > Kidneys
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Artificial Gene Amplification and Extension > Polymerase Chain Reaction
                Research and Analysis Methods > Molecular Biology Techniques > Artificial Gene Amplification and Extension > Polymerase Chain Reaction
                Biology and Life Sciences > Genetics > Genomics > Animal Genomics > Mammalian Genomics
                Biology and Life Sciences > Computational Biology > Genome Analysis > Gene Prediction
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Gene Prediction
                Biology and Life Sciences > Genetics > Genomics > Animal Genomics > Bird Genomics
                Biology and Life Sciences > Organisms > Viruses > DNA viruses > Parvoviruses
                Biology and Life Sciences > Microbiology > Medical Microbiology > Microbial Pathogens > Viral Pathogens > Parvoviruses
                Medicine and Health Sciences > Pathology and Laboratory Medicine > Pathogens > Microbial Pathogens > Viral Pathogens > Parvoviruses
                Biology and Life Sciences > Organisms > Viruses > Viral Pathogens > Parvoviruses
                Biology and Life Sciences > Genetics > Gene Expression > Polyadenylation
                Biology and Life Sciences > Biochemistry > Peptides > Polypeptides
                Custom metadata
                Viral sequences are stored on GenBank under accession numbers MF175078, MH670587, MH670588 and MN265364. Proteomic data accessions are PXD010540 and PXD014938. Accession numbers are referenced in the manuscript.

                Infectious disease & Microbiology
