Molecular Detection of Candidatus Coxiella mudorwiae in Haemaphysalis concinna in China

Objective: Coxiella burnetii and Coxiella -like endosymbionts (CLEs) have been widely discovered in various ticks, animals, and even human beings. To estimate the possible origin of C. burnetii and its relatives CLEs, the prevalence of C. burnetii and CLEs has been intensively surveyed all over the world. Method: In the present study, the possible infection of C. burnetii and CLEs in host-seeking Haemaphysalis concinna was performed with meta-transcript analysis with tick specimens harvested from Mudanjiang City, Heilongjiang province, China. The meta-transcript results were subsequently confirmed by the specific sequence of partial 16S rRNA. Results: A total of three arrays of gene transcripts were harvested, including pyrophosphate-fructose 6-phosphate 1-phosphotransferase-eda -thiol-disulfide isomerase and thioredoxin-greA , carB - carA - DnaJ - DnaK - grpE - ppnk , ropC - ropB , and ubiA-non-canonical purine NTP pyrophosphatase-hemK-prfA , which suggest the infection of Candidatus Coxiella mudorwiae in H. concinna . The high identity of the 16S rRNA gene of Candidatus C. mudorwiae achieved in our study strongly supports our meta-transcripts analysis. Conclusion: The prevalence of Candidatus C. mudorwiae in hard ticks has been discovered in China. More detailed surveys are imperative to clarify the emergence of CLEs and their implication in the epidemiologic characteristics of Q fever.


INTRODUCTION
Coxiellae are intracellular bacteria that infect humans, vectorial arthropods, and a variety of vertebrates worldwide [1]. Both symbiotic and pathogenic bacteria in the family Coxiellaceae cause morbidity and acute disabling diseases in humans and animals. Coxiella burnetii, the only officially recorded species in the genus Coxiella, has been well-characterized as the agent of Q fever, which occurs sporadically with occasional outbreaks in humans worldwide annually [2]. The well-known Q fever epidemic in The Netherlands during 2007-2010 has attracted much attention due to the > 4000 human cases and health burden [161-336 million Euro] [3,4]. C. burnetii is also a category B potential aerosolized biological weapon [5]. The potential health risk from Coxiella is the subject of intensive epidemiologic surveys, which have revealed that diverse species, including various Coxiella-like endosymbionts (CLEs), exist within the genus Coxiella. CLEs isolated from different tick species [6] and their vertebrate hosts [7] have suggested that the natural occurrence of CLEs is by no means fortuitous but bound to commensal parasites in multiple tick species. Moreover, CLEs have been shown to provide host ticks with dietary supplementation, such as vitamin B, when deficient in vertebrate blood [8]. The well-characterized CLEs present close affinity with reference strains of C. burnetii with respect to morphologic, biologic, and evolutionary traits; however, the CLEs are genetically distinct. Coxiellaceae members have been classified into a minimum of 4 highly-divergent genetic clades (A-D) based on the phylogenetic analysis of 16S rRNA, lifestyle transitions, and maternal transmission traits [9]. Based on the taxonomic hierarchy, clade A includes all C. burnetii reference strains and CLEs from Argasidae ticks. In clade B, however, CLEs of hard ticks (Amblyomma and Ixodes) and some soft ticks (Orinthodoros) are represented with reduced genome sizes [~1M] [10]. Clade C is comprised of CLEs from ticks in the genus Rhipicephalus and their relatives [11]. The representative species, Candidatus Coxiella mudrowiae (Ca. C. mudrowiae) from R. turanicus was shown to have larger genome size, which has a very low protein-coding content and an extremely high number of identifiable pseudogenes [12]. Clade C members have also been shown to be opportunistic agents of human skin infections [8,[12][13][14][15]. Regarding clade D, CLEs that are mainly derived from various Haemaphysalis, Dermacentor, and Amblyomma ticks were presumably pathogenic Coxiellae responsible for horse infections [16,17]. The presence of genetically-divergent CLEs in more tick species converge to support the hypothesis that these CLEs are specific endosymbionts of ticks and clade A CLEs hosted primarily by soft ticks might have served as the ancestor of C. burnetii. Thus, more detailed genetic information and rigorous assessment of the risk of infection to vertebrates are required for a comprehensive description.
The discovery that ticks carry C. burnetii or CLEs underscores the demand to elucidate the transmission routes of Coxiellae in natural cycles. First, it is noteworthy that the highly-virulent reference strain of C. burnetii, Nine Mile, was isolated from a guinea pig fed by Dermacentor andersoni ticks [18,19]. Second, > 50 tick species have been reported carrying visual C. burnetii or CLEs based on microscopic observations [6,20]. Many field studies have focused on ticks for Q fever [21,22]. Together with laboratory evidence, at least 10 tick species, including D. andersoni, have been formally confirmed to be competent vectors of C. burnetii along with their efficient trans-stadial or trans-ovarial transmission abilities [23][24][25][26][27][28][29][30][31][32]; however, it remains controversial whether these ticks have any significant role in the natural spread of C. burnetii [9]. Furthermore, C. burnetii and various CLEs prevail in different tick species, suggesting that these ticks can serve as a reservoir for Coxiellaceae members. Considering possible contamination to host skin and persistence in the environment by excreting high concentrations of Coxiella in tick feces [32], the medical significance for ticks in the environmental dissemination of C. burnetii and CLEs should not be underestimated [22]. Additionally, the noticeable presence of CLEs in salivary glands, ovaries, and Malpighian tubes of various tick species in almost all life stages [33][34][35] further suggests the possible vertical transmission via the egg cytoplasm or horizontal transmission across development stages rather than being acquired by blood-sucking on infected vertebrates, which demonstrates the risk of vertebrate host infection, including humans.
In China, the first human case of Q fever was reported in the 1950s when patients with atypical pneumonia were diagnosed serologically using a complement fixation test [CFT] [36,37]. The first human case was definitively confirmed in 1962 when the first Chinese strain of C. burnetii (Qi Yi) was isolated [38]. Since that time, nearly 30 seroepidemiologic or molecular studies have described C. burnetii infections in 64 cities/municipalities within 19 provinces across China [39]. Moreover, CLEs and C. burnetii have also been detected from ticks belonging to 5 genera, including Ixodes [40], Dermacentor [40,41], Haemaphysalis [42], Hyalomma [43], Rhipicephalus [44,45], wild animals [46,47], and even freshwater shrimp [Palaemonetes sinensis] [48]. The presence of CLEs in these ticks confers crucial and diverse benefits to the host ticks, affecting their development, nutrition, chemical defense, or reproduction, and distinctly interfere with the maintenance and/or transmission of some tickborne pathogens by tick-endosymbiont interactions, such as competition or mutual reciprocity [49]. The medical significance of CLEs occurring in ticks from Chinese territory, however, remains to be determined. During our meta-transcriptomic surveys for tickborne pathogens, an emerging CLE was found in Haemaphysalis concina collected from forest areas in Mudanjiang city of Heilongjiang province. The CLE was identified as Ca. C. mudrowiae in clade C, with molecular evidence from 16S rRNA, 3 arrays of transcripts, including pyrophosphate-fructose 6-phosphate 1-phosphotransferase-edathiol-disulfide isomerase and thioredoxin-greA-carB-carA-DnaJ-DnaK-grpE-ppnk, ropC-ropB, and ubiA-non-canonical purine NTP pyrophosphatase-hemK-prfA. The discovery of Ca. C. mudrowiae infected in H. concinna in China might advance our understanding of the epidemiology of Q fever and other tickborne diseases that eventually benefit us with rational prevention and control measures based on the evolutionary history.

Ethics statement
Protocols for field tick collections and samples processing were reviewed and approved by the Institutional Ethics Review Board of Beijing Institute of Microbiology and Epidemiology (BIME-2020-019).

Tick collection, morphologic identification, and nucleotide acid extraction
A total of 213 adult ticks were randomly harvested on vegetation in Mudanjiang, Heilongjiang province, China (44.60°N 129.59°E) between April 2020 and July 2021. Identification of tick species was based on morphologic characteristics following taxonomy keys for ticks in China [50]. Following species identification, the ticks were grouped by sex, each with 30 individuals. After sterilization with 70% alcohol, ticks in each pool were crushed with a sterile plastic homogenizer and total DNA and RNA were simultaneously extracted with a MagMAX DNA/RNA Isolation kit (Thermo-Fisher Scientific, Life Technologies, Carlsbad, California, USA). Purified DNA/ RNA was quantified using the Qubit High Sensitivity assay (Thermo-Fisher Scientific, Life Technologies, Carlsbad, California, USA) before further processing.

Validation species identification by molecular markers
To verify tick species identification, we extracted genomic DNA from a proportion of the homogenates of each specimen or specimen pool. Two genes were used for tick identification: the partial 18S rRNA gene (∼1100 nt); and the partial COI gene (∼680 nt). The former target gene was amplified using primer pairs [18S-1 (5′-CTGGTGCCAGCGAGCCGCGGYAA-3′) and 18S-2 (5′-TCCGTCAATTYCTTTAAGTT-3′)], and the primer pairs [LCO1490 (5′-GGTCAACAAA TCATA AAGATATTGG-3′)] and [HCO2198 (5′-TAAACTTCAGGGTGAC CAAAAAATCA-3′)] for the latter. PCR amplifications were performed as described previously [51]. For taxonomic determination, the resulting sequences were compared against the nt database, as well as with all COI barcode records on the Barcode of Life Data Systems (BOLD).

Coxiella mudorwiae
The RNA samples from Haemaphysalis concinna were reverse-transcribed to cDNA using the PrimeScript TM IV 1st strand cDNA Synthesis Mix (Takara, Shiga, Japan) and ribosomal RNA (rRNA) was removed using the Ribo-Zero Gold Kit (Illumina, San Diego, California, USA) following the manufacturer's instructions. Subsequently, for all rRNA-depleted RNA-samples, the sequencing libraries were constructed using the KAPA Stranded RNA-Seq Kit (KAPA Biosystems, Roche, Boston, Massachusetts, USA) with barcode adapters (Bio Scientific, Phoenix, Arizona, USA) according to the manufacturer's instructions. After cDNA-level quantification by Qubit assays (Thermo-Fisher Scientific, Eugene, Oregon, USA), equimolar amounts of nucleic acids were pooled and submitted for sequencing in each library. Further, all libraries were subjected to next-generation sequencing (NGS) using the Illumina HiSeq 2500 platform on a single lane to generate the 125 bp paired-end reads at the BGI Sequencing Centre (www.genomics.cn). Raw sequencing reads of low quality were excluded, trimmed with trim galore [www. bioinformatics.babraham.ac.uk/projects/trim_galore/] [52], then assembled de novo using the Trinity v2.8.5 program [53,54]. The assembled scaffolds were predicted according to the open reading frames by MetaGeneMark, and CD-HIT was used to remove redundancy and obtain the initial unique gene catalog. For determination of gene abundance, the reads were realigned with the gene catalog with Bowtie 2 [55]. Only genes with two or more mapped reads were deemed to be present in a sample [56]. The presence of these genes in the corresponding samples was confirmed by nested RT-PCR and Sanger sequencing (data not shown). The unigenes were also aligned to the NR database (https://www.ncbi.nlm.nih.gov/) of NCBI with DIAMOND [57]. The aligned results of each gene with an e value ≤ the smallest e value × 10 were retained [56], then processed with the Lowest Common Ancestorbased algorithm implemented in MEGAN to ensure the species annotation information of sequences [58].

Confirmation Candidatus Coxiella mudorwiae infections by PCR amplification for 16S rDNA
Commercial HotStart PCR Premix Kit (Thermo-Fisher Scientific, Vilnius, Lithuania) was used for PCR amplification. Nested PCR according to the work from Seo et al. [16] was used to amplify the 16S rRNA of Ca. C. mudrowiae and other CLEs, and sequencing. Cox16SF1 (5′-CGTAGGA ATCTACCTTRTAGWGG-3′) and Cox16SR2 (5′-GCCTACCCGCTTCTGGTACAA TT-3′) were used to perform the first-round amplifications. Then, nested PCR was performed using the primers, Cox16SF2 (5′-TGAGAACTAGCTGTTGGRRAGT-3′) and Cox16SR2. All PCR amplifications were performed using the following program: pre-denaturation at 93°C for 3 min; 30 cycles of denaturation at 93°C for 30 s, annealing at 56°C for 30 s, and polymerization at 72°C for 1 min; and a final post-polymerization step at 72°C for 5 min. PCR products of the second amplification process were analyzed by electrophoresis, with 10 μl of the reaction mixture and a 100 bp DNA ladder (Bioneer, Daejeon, Korea) using 1.5% agarose gels for 30 min at 100 V and visualized using UV transillumination imaging after ethidium bromide staining. Samples yielding amplicons of the expected size were delivered to Sangon Biotechnology Co., Ltd. (Shanghai, China) for bi-directionally sequencing with the primers, Cox16SF1 and Cox16SR1 [5′-ACTYYCCAACAGCTAGTTCTCA-3′] [16].

Phylogenetic analysis
The obtained sequences were compared with the reference sequences in GenBank with the NCBI-BLAST server (http://blast.ncbi.nlm.nih.gov/blast.cgi) and multiple sequences were aligned with ClustalW with default parameters in MEGA 7.0. The phylogenetic trees of 16S rRNA, pyrophosphate-fructose 6-phosphate 1-phosphotransferase-eda-Thiol-disulfide isomerase and thioredox i n-greA-carB -carA-D naJ -DnaK-gr pE -ppnk, ropC-ropB, and ubiA-non-canonical purine NTP pyrophosphatase-hemK-prfA of Ca. C. mudrowiae were constructed with the maximum likelihood method using the best-fit model of nucleotide or amino acid substitution (LG + I + Γ + F for all alignments) with 1000 bootstrap replicates in the PhyML v.3 program [59]. All phylogenetic trees were rooted at the midpoint for clarity purposes only.

Data availability
All sequence reads generated in this project are available under the NCBI Short Read Archive (SRA) under accessions SAMN26934337-SAMN26934338 (BioProject ID: PRJNA819490). All sequences were deposited in GenBank under accession numbers: OP863298, OP863299, OP863300 for three RNA transcipts, OPB839488 for 16S rRNA.

Prevalence of Candidatus Coxiella mudorwiae in Haemaphysalis concinna
The presence of CLEs was prompted by assembly contigs derived from H. concinna, then confirmed by PCR amplification targeting the 16S rRNA gene combined with sequence analysis. After annotation, all the transcripts achieved in 30 male or female H. concinna individuals showed no difference, suggesting unbiased CLE parasitizing in both sexes. Similar to the transcripts, there were also no differences observed in the16S rRNA genes of the CLE in both sexual groups of H. concinna. As a consequence, the 16S rRNA gene of the CLE was shown to be identical to the Ca. C. mudrowiae strain, CRt (CP011126.1), with 99.72% similarity over 100% coverage. Ca. C. mudrowiae is known to be closely related to pathogenic Ca. C. massiliensis isolated from infected samples of human skin [13], one horse blood strain of Coxiella (H-JJ- 10) in Jeju Island of South Korea [17], and another humanderived Coxiella strain manifested with a scalp eschar and neck lymphadenopathy [14]. The phylogenetic tree constructed with partial sequences of 16S rRNA genes of CLEs from 46 tick species and C. burnetii reference strains formed four distinctive branches corresponding to clades A-D of Coxiellaece, respectively, as expected. Among these branches, all C. burnetii reference strains were clustered in clade A, along with other CLEs from Ornithodoros (O. amblus, O. moubata, and Carios capensis) species. The Ca. C. mudrowiae derived from H. concinna in the present study was shown to have high similarity to the Ca. C. mudrowiae strain, CRt (CP011126.1), and the Ca. C. mudrowiae strain, CRS-CAT (CP024961.1), which formed a distinctive sub-branch falling into clade C, including other CLEs from Rhipicephalus and Dermacentor ticks. Clade C was also demonstrated to be well-separated from other CLE clades, as shown in the phylogeny tree (Fig 1 panels A and B).  the protein database of the Ca. C. mudrowiae and other CLEs available through BLASTP. Sixteen transcripts of Ca. C. mudrowiae were identified from both sexes of H. concinna, which included pyrophosphate-fructose 6-phosphate 1-phosphotransferase, eda, thiol-disulfide isomerase, greA, carA, carB, DnaJ, DnaK, grpE, ppnK, rpoC, rpoB, and ubiA non-canonical purine NTP pyrophosphatases, hemK and prfA. Aligned sequences of the 16 genes or transcripts are listed in Table 1 with their identities and coverages of putative amino acids of CLE top hit proteins. Among the CLE top hit proteins, the 10 genes or transcripts, including pyrophosphate-fructose 6-phosphate 1-phosphotransferase, thiol-disulfide isomerase and thioredoxin, eda, greA, carA, carB, DnaJ, DnaK, grpE, and ppnK, were shown with > 78% identities on > 97% coverages with those of the Ca. C. mudrowiae stain, CRt (CP011126.1). The remaining six transcripts of rpoC-rpoB, ubiA, non-canonical purine NTP pyrophosphatase, hemK and prfA were demonstrated with higher identities, ranging from 77.26%-94.17%, on 100% coverages with those of the Ca. C. mudrowiae stain, CRt (CP011126.1). The results also supported the conclusion that the CLE from H. concinna should be identified as Ca. C. mudrowiae (Fig 2), which belongs to clade C of family Coxiellae, which has a genome reduction and prominent gene decay because of the loss of selection on gene functions [12].

DISCUSSION
Q fever is a widespread zoonotic disease caused by C. burnetii, a ubiquitous intracellular bacterium. Due to its highly polymorphic clinical manifestations, Q fever is difficult to prevent, diagnose, and treat in humans and animals [9]. Human exposure to C. burnetii may result in asymptomatic-to-mild infections, but also in acute or chronic disease. Although rarely fatal, Q fever can be highly debilitating, even under treatment [60].
This post-infection fatigue syndrome is associated with increased or impaired cellular responses, but with apparently no viable Coxiella, low or negligible antibody levels, and clinical expression of a long-lasting fatigue complex involving many body systems [61]. Therefore, Q fever results in a high socioeconomic burden and significant challenges for public health. To prevent and control the possible infections in humans and animals, pioneering studies have been carried out to elucidate the pathogenicity, epidemiology, diagnosis, and treatment aspects of C. burneti and > 40 tick species have been shown to naturally infect C. burnetii or CLEs [9]. Our NGS approaches have also revealed that Ca. C. mudrowiae, but not C. burnetii, prevail in H. concinna populations, which supports the hypothesis that diverse CLEs rather than presumptive C. burnetii predominate in most tick species investigated thus far [62][63][64]. The reality that ticks carry both C. burnetii and CLEs further emphasizes the demand to clearly distinguish Coxiellae members, which is necessary to improve our understanding of the epidemiology and evolutionary history of Q fever. The infection of C. burnetii in ticks might be rigorously confirmed through an array of impressive assays, including hemolymph tests, isolation in cell-lines [Vero] [65], and multiple-locus variable number tandem repeat (MLVA) analysis [66], multispacer sequence typing [MST] [67], and SNP genotyping [21,68]. Nevertheless, many studies have not been conducted as rigorously, which has led to frequent misidentifications of CLEs. Some strains, primarily assumed to be C. burnetii visually, were actually misidentification results of CLEs. CLEs in A. americanum, for example, has been repeatedly misidentified as C. burnetii in the older literature [69,70], but recent sequence-based detections have confirmed that it is actually a CLE (CP007541) of clade D characterized with a smaller genome size [71]. In our studies, 16 function genes or transcripts beyond 16S rRNA of Ca. C. mudrowiae were achieved, although the percent identities of some genes were quite low compared to the reference strain (Table 1), which may be explanted by the possible sense mutations or different strains from various localities. The limitation of our study was apparent because we failed to achieve a full genome of Ca. C. mudrowiae and unambioguously identify the CLE. Future attempts to isolate Ca. C. mudrowiae from H. concinna would be beneficial in reaching our goal.
Overall, heritable endosymbiotic bacteria are of ecologic and evolutionary importance to the particular arthropod species because the bacteria potentially confer crucial and diverse benefits to the host by affecting development, provision of nutrients, chemical defense, or reproduction [71]. The high frequency of CLE transfer occurring at the tick-vertebrate interface may enhance or decrease the probability of infections not only with C. burnetiid, but also with other tick-borne pathogens [72]. For example, Wolbachia spp. have recently been shown to be defensive endosymbionts to interfere with the replication and transmission of various pathogens in their arthropods hosts, including mosquitoes and f lies [73,74], which eventually limit the vector competence of these blood-sucking arthropods [75,76]. Although most CLEs described to date have been confined to infect ticks, whether commensal CLEs parasitize ticks limit their vector competence to transmit pathogenic organisms, including C. burnetiid, remains to be determined. Under natural circumstances, most CLEs in ticks pose a much lower infection risk to vertebrates than C. burnetii, which may be explained by the facts that genomic reduction or conjugation machinery might take place in the evolutionary history of CLEs to exchange genetic information, such as Dot/Icm and type IVB secretion system (T4BSS) via horizontal gene transfer [HGT] [49]. Thus, most Coxiella progenitors have evolved into avirulent CLEs with reduced genomes and devoid of known virulence genes [CLE of A. Americanum] [6] or intermingled pathogens retaining virulent genes (C. burnetii or pathogenic CLEs). The infectious Ca. C. mudrowiae was also demonstrated to have biologic activity in H. concinna along with transcription evidence from 16 genes, which were involved in many essential physiological processes, including energy metabolism (e.g., pyrophosphate-fructose 6-phosphate 1-phosphotransferase), intercellular vesicles formation or transportation (e.g., DnaJ, DnaK, and grpE), and potential intracellular invasion. Whether the transcriptions of these genes participate in the complicated interactions with pathogens or other endosymbiont parasites in host ticks remains to be clarified. In contrast, the occurrence of CLEs, such as Ca. C. mudrowiae, also suggest a feasible CLE-based approach to modify pathogen transmission in arthropods, which confirmed our interesting research effort in a direction for intervention of pathogen transmission [77]. Therefore, it is necessary to describe the diversity of CLEs, characterize more fully their genetic relatedness, and assess their potential to cause infections in vertebrates.
In conclusion, Ca. C. mudrowiae infected in H. concinna has revealed that studies on CLEs can advance our understanding of CLE diversity and epidemiology of Q fever. Ca. C. mudrowiae and C. burnetii are closely related, but differ in their transmission ecology and infectivity. This phenotypic diversity makes the evolution of genus Coxiella a topic of peculiar interest, because there are clearly transitions between avirulent and virulent symbionts residing in confined intracellular environments.