As we sit six feet apart in the San Francisco airport terminal, waiting for a flight
to our field site, we hear an attendant's voice echoing, ‘All passengers must provide
proof of a negative RT‐qPCR COVID‐19 test prior to boarding the airplane’. A year
ago, we would have been hard‐pressed to hear such terminology on any loudspeaker in
a major US airport. But a year ago, we were not mid‐pandemic. When we reach the front
of the boarding line, the attendant checks our documentation as another scans the
crowd for anyone looking ill, sweating, coughing. In the corner, a teenager reads
about viral replication in the New York Times (Corum and Zimmer, 2020). Another few
rows over, a child is teaching his two stuffed dinosaurs ‐ both wearing tiny masks
‐ how to properly distance themselves. After we land in French Polynesia, we are briefed
by an army of attendants and biosafety agents on what COVID‐19 is, how SARS‐CoV‐2
is transmitted, and how to self‐administer a diagnostic test and return it to a local
processing facility.
This is virology gone mainstream.
For anyone who has witnessed and characterized epizootics and heard the many predictions
of the next major emerging infectious disease (EID) in wildlife, humans, or both (Ogden
et al., 2017), this has been a surreal experience. The surfacing and spread of SARS‐CoV‐2
has been an explicit (and sobering) reminder that increased human interaction with
wildlife and habitat encroachment pose a threat not only to wildlife health but also
to our own. As human influence advances, these potential threats extend beyond the terrestrial
and into aquatic ecosystems through the aquaculture we consume, the waterways we utilize,
and the organisms we increasingly encounter (Cotruvo et al., 2013). The magnitude
and frequency of mass mortality events (MMEs) within marine ecosystems are escalating
incrementally, although it is often unclear if these are due to greater detection
efforts or external factors such as pollution and thermal stress mediated by climate
change (Fey et al., 2015; Sanderson and Alexander, 2020). Uniting trends in the emergence
of marine epizootics have included changes in either (i) host distribution (e.g. the
joined proximity of normally allopatric species through alterations in land use, trade,
travel, or migration, and increases in host density) or (ii) microbial phenotype (e.g.
change in transmissibility, pathogenicity, or host niche through genetic adaptation)
(Daszak, 2000; Ogden et al., 2017).
In marine mammals, a recent study concluded that 72% of MMEs were likely attributable
to viral pathogens, indicating unique attributes for spillover and transmissibility
as EIDs and reflecting their potential zoonotic threat (Sanderson and Alexander, 2020).
These viruses pose risks to aquatic community stability, biodiversity, conservation
efforts and aquaculture economy, and do not appear to be isolated from terrestrial
ecosystems. For example, multiple instances of morbillivirus (e.g. canine distemper)
spillover from domesticated dogs to pinnipeds suggest that proximity of the two hosts
may have played a role; arbovirus identification (e.g. mosquito‐borne togaviruses and
flaviviruses in cetaceans) may indicate viral vectoring by terrestrial invertebrates;
and the atypical spread of a herpesvirus‐like MME among pilchards (Australia, 1995–98)
suggests involvement of seabirds (Lafferty and Harvell, 2014; Bossart and Duignan,
2018). This epizootic among pilchards also showed the ability
of marine viruses to spread rapidly (5000 km in 7 months), further driving the hypothesis
that MMEs may advance faster in aquatic ecosystems (due to water having a higher connectivity
and lower granularity than air), with pathogens exploiting indirect mechanisms of
infection (Harvell, 1999; McCallum et al., 2004). The biological and economic results
of such fast‐spreading MMEs can be dramatic. This MME among pilchards alone resulted
in >$12 million AUD loss to the Australian aquaculture industry over a 3‐year period.
Yet this sum is trivial compared with the billions of dollars lost to epizootics in
penaeid shrimp, oysters, abalone, lobster, and other invertebrates, and to the countless
other viral pathogens exerting pressure on fisheries and aquatic cultivation
industries worldwide (Lafferty et al., 2015).
The new realm of viral detection and the democratization of environmental virology
The reality we face in investigating viral diversity in non‐model hosts is that this
exploration may become salient faster than we realize. This is supported by the ratio
between studies on Coronaviridae in Chiroptera or wildlife published in 2020 relative
to the total sum of those published in the previous decade or two decades (≥1:1 respectively,
search: November 2020, Pubmed). Thankfully, the tools to examine the viral community
composition of non‐model hosts and ecosystems are accessible now more than ever. Technology
ranging from smartphone‐enabled nanoscale microscopy (Wei et al., 2013; Diederich et
al., 2020, preprint) to desk‐side sequencing (e.g. MinION; Greninger et al., 2015)
is becoming available and practical for everyday users, expanding not only our insight
into the repertoire of viral biogeography and tropism in the animals we consume and
sell, but also our understanding of the ecosystems in our own backyards.
Moreover, terabases of data from over a decade of metagenomic sequencing represent
a trove of potential unmined viral sequences if paired with sufficient metadata; viruses
are routinely caught on water filters or within tissues and sequenced in tandem with
their hosts. In a 2017 issue, Sullivan, Weitz, and Wilhelm noted that by 2020, computational
tools to analyse viral diversity and the mechanisms of infection and disease would
become easily accessible to biologists ‐ democratized (Sullivan et al., 2017). As
virology rests in the spotlight, many of these tools are being assimilated into everyday
conventional language and advanced by the small flood of computationally minded individuals
with a newly vested interest in solving bioinformatic hurdles in viral discovery.
A few of these innovations have provided the fundamental steps for navigating surveillance,
management, and perhaps prediction of marine EIDs in a generation defined by viral
discovery.
Order from chaos ‐ the value of discovery‐based sequencing in non‐model hosts
The global ‘virome’ has long served as a frontier for genetic discovery. In the past
decade, we have witnessed exponential growth in cultivated and uncultivated viral
genomes (currently >2 million putative virus‐like sequences; IMG/VR; Roux et al.,
2020) and cellular hosts, contributing to an ever‐growing inventory of possible therapeutic
bacteriophage, drug therapy vectors, oncoviruses, pathogens, and more. In particular,
our knowledge of two groups of viruses ‐ large DNA (often previously excluded due
to size‐specific purification efforts) and RNA viruses (previously excluded due to
DNA‐centric sequencing efforts) ‐ has matured. One study alone increased the total
number of genomes in one viral group, the nucleocytoplasmic large DNA viruses (NCLDVs)
by >1100% (Schulz et al., 2020). Congruently, giant/jumbo bacteriophage with genomes
>500 kb, virions longer than 600 nm, tRNA synthetases and proteinaceous nucleus‐like
compartments pose exciting new questions about virion architectures, infection and
defence strategies, and evolutionary trajectories (Malone et al., 2020). Highlighting
the utility of database mining, another study identified 10 000 RdRp genes, hallmarks
of RNA viruses, prompting reevaluation of gene exchange between RNA viral families
and complete reconstruction of the new ‘realm’ of the Riboviria (Koonin et al., 2020;
Wolf et al., 2020). Members of these newly discovered RNA viruses disproportionately
infect invertebrates (Wolf et al., 2020), plants (Dolja et al., 2020), and metazoans
(Koonin et al., 2020), yet it remains unclear what impact these fast‐evolving
genomes have on their hosts. Without this knowledge, it is nearly impossible to anticipate
zoonoses in aquatic ecosystems and conserve wildlife threatened by disease. However,
many of these discoveries provide evolutionary context for highly pathogenic viruses
in non‐model hosts or may inform future sampling efforts if sequences sharing features
comparable to viruses eliciting MMEs are identified.
Collectively, viral discovery in non‐model hosts has reduced the viral world from
an unknowable immensity of taxonomic singletons to a more limited, interconnected
taxonomy linked by gene‐sharing networks (Koonin and Dolja, 2014; Koonin et al., 2020)
and allowed risk‐based surveillance of potential MME‐eliciting aquatic viruses. While
the total scope of diversity remains uncharted in most ecosystems, discovery rates
of viruses in some deeply sampled systems and hosts have started to approach saturation,
placing constraints on viral community complexity, as well as our understanding of
virion morphotypes, gene repertoire and more (Koonin and Dolja, 2014; Gregory et al.,
2019). For example, some conservative estimates put the total number of dsDNA viral
genes at 4 million ‐ a value with a finite number of possible virion structural designs
and replication strategies. While the total gene repertoire may be nearly cosmic, only
a few genes are shared by a wide variety of viruses (e.g. polymerases: RdRps and RTs;
structural elements: SJR and DJR capsids; helicases and endonucleases: S3Hs and RCREs;
Koonin and Dolja, 2014; Koonin et al., 2020). These genes have served as
cornerstones to redefine viral taxonomy (or ‘Megataxonomy’), contextualized by other
mobile genetic elements and retroelements (Koonin et al., 2020). This taxonomic organization
may guide our understanding of shared ecological, epidemiological, or evolutionary
traits, such as broad predictions of host (e.g. DJR; Nayfach et al., 2020), disease
emergence, or infection dynamics in marine ecosystems. Though forecasting power is
currently low, these genomic tools ‐ when paired with sufficient data of disease‐eliciting
environmental conditions ‐ may provide a starting point for pathogen diagnostics,
particularly in non‐multifactorial marine EIDs. Secondary to its value in disease
prediction and spillover prevention, this organization begins to link sequence to
function, demonstrating the innovative evolutionary strategies that viruses utilize
to navigate their hosts.
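The notion of discovery rates approaching saturation in deeply sampled systems can be illustrated with a collector's (accumulation) curve. The sketch below is only a toy under stated assumptions: the sample contents and cluster labels (`vc1`, `vc2`, ...) are hypothetical, and the function names are ours rather than those of any published tool. It counts how many distinct viral clusters have been seen as samples are added in random order; a curve that flattens suggests that additional sampling is yielding few new clusters.

```python
import random


def accumulation_curve(samples, seed=0):
    """Cumulative number of distinct viral clusters as samples are added.

    `samples` is a list of sets of cluster identifiers (hypothetical labels).
    Samples are shuffled with a fixed seed so the curve is reproducible.
    """
    rng = random.Random(seed)
    order = list(samples)
    rng.shuffle(order)
    seen = set()
    curve = []
    for sample in order:
        seen.update(sample)
        curve.append(len(seen))
    return curve


def new_per_sample(curve):
    """Number of previously unseen clusters contributed by each added sample."""
    if not curve:
        return []
    return [curve[0]] + [b - a for a, b in zip(curve, curve[1:])]


# Hypothetical toy data: five samples drawn from a pool of five clusters.
samples = [
    {"vc1", "vc2", "vc3"},
    {"vc2", "vc3", "vc4"},
    {"vc1", "vc4"},
    {"vc3", "vc4", "vc5"},
    {"vc1", "vc2", "vc5"},
]

curve = accumulation_curve(samples)
print(curve)                  # cumulative richness after each sample
print(new_per_sample(curve))  # values near zero indicate approaching saturation
```

In practice one would average many random orderings (rarefaction) rather than rely on a single shuffle, and a flat curve only constrains diversity for the hosts, size fractions and nucleic acid types actually sampled.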
Viral signal versus sequence
Animal–virus interactions are contingent on factors ranging from host ontogenetic
immunity to environmental temperature. There are numerous hurdles – from lack of continuous
cell culture of appropriate hosts to unfamiliar histopathology – that can inhibit
investigations of infection dynamics in aquatic systems. Although the approachability
of multi‐omic (metagenomic, transcriptomic, etc.) sequencing has revealed increasingly
vast and diverse viromes of wildlife, our understanding of the ecology of many of these
viruses ‐ their hosts, genetic capabilities, pathogenicity, transmissibility, and so
on ‐ remains in its infancy. However, snapshot and time‐series sequencing have provided a framework
for ecological inference based on both (i) the aggregate viral signal within ecosystems
or hosts and (ii) the whole discrete genome sequence (and subsequent deduction of
ecoevolutionary characteristics).
Macroscale inferences illuminate individual viruses
Many tools developed for identifying viruses have become increasingly reference‐independent,
reliant on both genetic identities and features specific to viral genomes (such as
CDS density, kmer content, ORF orientation, and so on; e.g. Roux et al., 2015; Kieft
et al., 2020 among others). Organization of viral signal into shared gene or protein
networks has proved essential for describing ecosystem‐ and community‐wide patterns
such as niche differentiation, community structure and cohesion, and virome functional
similarities (Gregory et al., 2019; Hurwitz et al., 2015). Determining ‘who infects
whom’ is a deceptively simple but fundamentally important first step in determining
how viruses are transmitted and alter wildlife populations, particularly among hosts
positioned to expedite spillovers or with a fragile conservation status. Microscopy
(e.g. FISH, RNAscope, fluorophore tagging, etc.) and microfluidics (e.g. mining SAGs)
provide viable alternatives to cultivation in non‐model hosts, but identifying which
virus infects which host is now commonly achieved in silico. The field of ‘paleovirology’
can identify the remnants of past infections in the predicted host (e.g. Geering et
al., 2014; Moniruzzaman et al., 2020) through integrations in a germline sequence
in a eukaryote, somewhat analogous to components of sequence‐dependent defence systems
such as CRISPR, though not always in a functional capacity. As it stands, 85% of viral
sequences are affiliated with a predicted host (Roux et al., 2020), and a database
of more than 700 000 nonretroviral endogenized genes has provided insight into the
history of infection in mammals (Nakagawa and Takahashi, 2016).
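As an illustration of the reference‐independent features mentioned above, k‐mer content can be summarized as a normalized frequency profile and compared between sequences. This is a minimal sketch, not the method of any cited tool: the function names and the toy contigs are hypothetical, and real classifiers combine such profiles with other features (e.g. coding density, ORF orientation) in trained models.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Normalized frequency of every possible DNA k-mer in `seq`.

    k-mers containing characters other than A, C, G or T are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy contigs; real contigs would be thousands of bases long.
contig_a = "ATGCGTACGTTAGCATGCGTACGTTAGC"
contig_b = "ATATATATTTAAATATATTAAATATATA"

profile_a = kmer_profile(contig_a, k=2)
profile_b = kmer_profile(contig_b, k=2)
print(cosine_similarity(profile_a, profile_b))
```

Short k (2 to 4) keeps the profile dense on short contigs; longer k‐mers are more specific but need longer sequences to be estimated reliably.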
Genomic inferences illuminate community ecology
Full viral genome assemblies, while often the norm in those infecting human and model
systems, are now also becoming attainable in wildlife and environmental systems. Contingent
on sequencing depth, viromes and metagenomes enable assembly of high‐abundance viral
genomes and provide a snapshot of the net diversity of low‐abundance viruses in the
form of fragmented contigs. Single‐virus genomics (SVG) ‐ flow cytometric sorting,
whole genome amplification and sequencing ‐ now enables more complete genome sequences
of these low‐abundance virions. The development of SVG, in particular, has delivered
near‐complete genomes of large viruses previously thought to be cellular (Martínez
Martínez et al., 2020), and highly microdiverse genomes that previously could not
be assembled (Martinez‐Hernandez et al., 2017). These glimpses of full viral genomes
also provide insight into viral microdiversity on a population level, lending insight
into transmission and persistence (Gregory et al., 2019). For example, read recruitment
and single nucleotide polymorphism detection at the level of complete viral sequences
‐ even those at low abundance ‐ may define vectored transmission between specific
wildlife in much the same ways that we may contact‐trace those with high‐titre infections
of SARS‐CoV‐2 (Laha et al., 2020; Meredith et al., 2020). Ultimately, these genomic
sequences are essential to extend questions beyond ‘what is where/when’ (viral discovery
and evaluation of viral signal) to ask ‘what are they doing?’ (functional capacity,
genomic conformation, transcriptomics, proteomics, etc.) and begin to anticipate their
potential to elicit consequential epizootics.
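The read recruitment and single nucleotide polymorphism detection described above can be sketched as a simple pileup. This is a toy under explicit assumptions: reads are taken to be already aligned without gaps, each represented as a hypothetical `(start, sequence)` pair, and the depth and frequency thresholds are arbitrary illustrative defaults rather than recommended values.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=3, min_freq=0.2):
    """Report positions where a non-reference base is common among recruited reads.

    `aligned_reads` is a list of (start, read_sequence) pairs, assumed to be
    ungapped alignments to `reference` (0-based start positions).
    Returns tuples of (position, reference_base, alternate_base, frequency).
    """
    # Build a per-position count of observed bases (the pileup).
    pileup = [Counter() for _ in reference]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to call a variant at this position
        for base, n in counts.items():
            if base != reference[pos] and n / depth >= min_freq:
                snps.append((pos, reference[pos], base, n / depth))
    return snps


# Hypothetical toy data: two of the four reads covering position 2 carry a C.
reference = "ACGTACGTAC"
reads = [
    (0, "ACGTA"),
    (0, "ACCTA"),
    (1, "CCTAC"),
    (2, "GTACG"),
    (3, "TACGT"),
    (5, "CGTAC"),
]

print(call_snps(reference, reads))  # [(2, 'G', 'C', 0.5)]
```

Comparing which variants are shared between viral populations from different hosts or sites is one way such calls might inform transmission hypotheses; real pipelines also account for base quality, indels and sequencing error.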
Advances in the study of marine wildlife viral infection dynamics
When higher resolution is required to determine the individual impact a virus has
on the cell(s) it infects, cultivation remains the gold (and often unattainable) standard
in virus–host pairing and infection dynamics. Viral isolation and cultivation remain
definitive to fulfil ‘Rivers' postulates’ and demonstrate causation between pathogen
and disease (Rivers, 1937). However, even if host prediction is correct, culture represents
an abnormal system, with the potential for contamination by latent or other opportunists,
with conceivably atypical cell types, atypical abiotic characteristics and atypical
potential for coinfection. These conditions often make viral challenge experiments
(such as those performed when investigating putative pathogens OsHV‐1 or SSaDV; Burge
et al., 2016) preferable to viral isolation and culture, though they may be similarly
confounded by many of these same factors. Although ambitious endeavours to examine
the mechanisms of co‐evolution and the genomic underpinnings of infection have enjoyed
a renaissance in marine bacteria and archaea (Kauffman et al., 2018), those in marine
metazoans continue to lag. In wildlife EIDs, serology, histopathology, microscopy
and other gene‐based detection methods (e.g. qPCR, host transcriptomics, etc.) fill
the gap. In silico protein–protein interaction networks may accelerate our understanding
of the viral ‘interactome’ beyond model hosts as protein functional prediction advances.
For example, in silico prediction of the residues at which viral proteins bind host
receptors, followed by protein expression and experimental validation via affinity
purification mass spectrometry, can provide key information about tropism, cell biology,
expression patterns during infection, and spillover risk. Indeed, this in silico approach was
utilized as proof‐of‐concept in a range of scenarios, from identifying binding sites
in non‐model hosts (Kamal et al., 2019) to evaluating differences in protein interaction
networks between bat and human coronaviruses in this most recent pandemic (Ortega
et al., 2020).
Conclusions
In 2013, titans of the microbiology and symbiosis fields remarked that animals inhabit
a bacterial world (McFall‐Ngai et al., 2013). It is not hard to argue that we, in
fact, also live in a viral one. With advancing computational tools, the field has
developed the ability to explore how viruses and their hosts have altered each other's
origins and evolution, and continue to transmit, infect and affect each other's genomes.
We have the ability to investigate the intersection between host development and infection,
and external environmental impacts and infection. We are also facing a potential epochal
shift in the way that we apply these democratized computational tools. The cost of
zoonoses and illiteracy in viral diversity in threatened wildlife is no longer hypothetical.
These tools may be applied to investigate this larger viral epizootic pool to better
anticipate spillover into new species or ourselves and preserve global biodiversity
as we encroach on new habitats.
Many fields continue to undergo rapid and profound change in response to the viral
pandemic ‐ viral ecology is no exception. Programs ‐ both established and new ‐ have
coalesced to strategically sequence non‐model hosts and ecosystems (Kress et al.,
2020; Watsa et al., 2020), with many calling for coordinated efforts to prevent future
spillovers from wildlife. We predict that any high‐throughput efforts will spur
the development of rapid, ‘low‐investment’ initial in silico analyses that provide
a deeper understanding of genomic conformation/modification, viral protein expression/folding/maturation,
protein–protein interactions and their relevance to zoonotic risk, in addition to
basic taxonomic and evolutionary context. Though not without bias, we hope that the
expansion of single‐cell sequencing will also provide a higher resolution understanding
of viral ecology. Coupled with (i) sufficient accessibility to democratized in silico
tools, (ii) well‐documented data provenance and (iii) well‐reported metadata, open
access data are an underutilized resource to explore viral diversity in non‐model
hosts. Only 20% of open access metagenomes are accessible, and even this does not
imply functionality (Eckert et al., 2020). From this dataset, Nayfach et al. (2020)
were able to predict putative hosts for >81 000 viral sequences and link multiple
viral clades. Further high‐throughput discovery of RNA viruses and investigation of
their ecology is just beginning. Though sequencing provides the groundwork for many
questions, we believe that a deeper understanding of infection dynamics through culture,
challenge, histopathology, serology and hypothesis‐driven experimentation will endure.
In our attempt to identify recent field‐revolutionizing advances or predict transformative
trends, we could not overlook those provided by the pandemic ‐ at a substantial cost.
We could not justify a single new advancement or tool ‐ computational or otherwise
‐ that we think will have more of an impact on the field of viral ecology than new
human capital. We have seen openly available computational programs blossom over the
last few years, and a wealth of underexplored data that is both publicly accessible
and relatively inexpensive to access. When you add curiosity, a bit of time, and a
ruthless need to understand the viral world we find ourselves inhabiting to this
mix ‐ all the things that a disproportionately large number of this next generation
may harbour ‐ there is unequivocally no predicting what we will learn about viral
ecology in the natural world.
This is the mainstream, entering virology.