
      The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium inaugural meeting report

      case-report
      The MetaSUB International Consortium
      Microbiome
      BioMed Central
      Microbiome, Biosynthetic gene clusters, Built environment, Next-generation sequencing, Antimicrobial resistance markers


          Abstract

          The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium is a novel, interdisciplinary initiative composed of experts across many fields, including genomics, data analysis, engineering, public health, and architecture. The ultimate goal of the MetaSUB Consortium is to improve city utilization and planning through the detection, measurement, and design of metagenomics within urban environments. Although temperature, air pressure, weather, and human activity are measured continually, adding longitudinal, cross-kingdom ecosystem dynamics to these measurements can alter and improve the design of cities. The MetaSUB Consortium is aiding these efforts by developing and testing metagenomic methods and standards, including optimized methods for sample collection, DNA/RNA isolation, taxa characterization, and data visualization. The data produced by the consortium can aid city planners, public health officials, and architectural designers. In addition, the study will continue to lead to the discovery of new species, global maps of antimicrobial resistance (AMR) markers, and novel biosynthetic gene clusters (BGCs). Finally, we note that engineered metagenomic ecosystems can help enable more responsive, safer, and quantified cities.


          Most cited references (31)


          Extensive sequencing of seven human genomes to characterize benchmark reference materials

          The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.

            Detection of Zoonotic Pathogens and Characterization of Novel Viruses Carried by Commensal Rattus norvegicus in New York City

            INTRODUCTION Zoonotic pathogens comprise a significant and increasing proportion of all new and emerging human infectious diseases (1, 2). Although zoonotic transmission is influenced by many factors, the frequency of contact between animal reservoirs and the human population appears to be a key element (3). Therefore, the risk of zoonotic transmission is increased by events that act to reduce the geographic or ecological separation between human and animal populations or increase the density and abundance of these populations where they coexist (2, 4). In this context, rapid and continuous urbanization constitutes a significant challenge to human health, as it creates irreversible changes to biodiversity that are driven by varied responses from animal species. In particular, species classified as urban exploiters and urban adapters may exist in unnaturally large and dense populations within urban environments and have above-average rates of contact with people (5 – 7). Of these, few species have been as successful at adapting to a peridomestic lifestyle as the Norway rat (Rattus norvegicus). In the urban environment, Norway rats closely cohabitate with humans—living inside buildings, feeding on refuse, and coming into contact with many aspects of the food supply (7 – 9). These characteristics, coupled with high levels of fecundity, growth rates, and population densities, suggest that urban Norway rats may be an important source of zoonotic pathogens (10, 11). Indeed, the Norway rat is a known reservoir of a range of human pathogens, including hantaviruses, Bartonella spp., and Leptospira interrogans; however, little is known about the microbial diversity present in urban rat populations or the risks they may pose to human health (12 – 16). As a first step toward understanding the zoonotic disease risk posed by rats in densely urban environments, we assessed the presence and prevalence of known and novel microbes in Norway rats in New York City (NYC). 
We took the unique approach of using both targeted molecular assays to detect known human pathogens and unbiased high-throughput sequencing (UHTS) to identify novel viruses related to agents of human disease. We quantified the tissue distribution of these novel viruses in the host using molecular methods and, in some cases, identified the site(s) of replication using strand-specific quantitative reverse transcription (RT)-PCR (ssqPCR). Unlike previous urban studies that have primarily relied on serological assays to assess the total prevalence of historic infection, our data provide a snapshot estimate of the current level of infection in the rat population, a parameter more closely related to the risk of zoonotic transmission (12, 17). Furthermore, as previous work has focused on rats found exclusively in outdoor locations, we concentrated our sampling within the built environment, where direct and indirect human-rodent contact is more likely to occur (18). RESULTS Sample collection. A total of 133 Norway rats were collected from five sites in NYC. Males (n = 72) were trapped slightly more often than females (n = 61), and juveniles were trapped more often than any other age category. Of the female rats, 43% were juveniles, 26% were subadults, and 31% were sexually mature adults, whereas 40% of the male rats were juveniles, 33% were subadults, and 26% were sexually mature. Targeted molecular analyses. Specific PCR-based assays were used to screen for the presence of 18 bacterial and 2 protozoan human pathogens (see Table S1 in the supplemental material). None of the samples tested was positive for Campylobacter coli, Listeria monocytogenes, Rickettsia spp., Toxoplasma gondii, Vibrio vulnificus, or Yersinia pestis, despite previous studies documenting most of these in multiple rodent species (15). All other bacterial and protozoan pathogens were detected in at least one animal (Table 1). 
Three bacterial pathogens were identified in more than 15% of animals: atypical enteropathogenic Escherichia coli (EPEC) was the most common (detected in 38% of rats), followed by Bartonella spp. (25% of rats) and Streptobacillus moniliformis (17% of rats) (Table 1). Phylogenetic analysis of a 327-nucleotide (nt) region of the gltA gene amplified from all infected animals revealed three distinct Bartonella species infecting NYC rats (Fig. S1). The most common of these was detected in 76% of Bartonella-positive animals and clustered within the Bartonella tribocorum group, which has previously been identified in multiple species of Rattus in Asia and North America. In addition, sequences 98% similar to those of Bartonella rochalimae were recovered from seven rats, and Bartonella elizabethae was identified in a single rat (Fig. S2). Infection with Bartonella was positively correlated with the age of the rat (P 90% similar at the nucleotide level to viruses known to infect Norway rats (e.g., Killham rat virus, rat astrovirus, and infectious diarrhea of infant rats [IDIR] agent [group B rotavirus]) and were not pursued further (Table 2). Viruses from an additional 13 families or genera that were 0.05). Only 10 rats were infected with more than two bacterial species, of which eight were female, and no rats were infected with more than four bacterial species (Table 4). In contrast, 53 rats were positive for more than two viral agents, and 13 of these carried more than five viruses (Table 4). As many as nine different viruses or four different bacterial species were identified in the same individual, with a maximum of 11 agents detected in a single rat. Patterns of coinfection between all agents were significantly nonrandom across the complete data set (C score, P = 0.0001); however, the only significantly positive association between any two bacteria occurred between Bartonella spp. and S. moniliformis (P = 0.005). 
Significantly positive associations were also observed between Bartonella spp. and multiple viruses, including NrKoV-1 and NrKoV-2 (P 180 g for females, >200 g for males) (65). The rats were necropsied, and the following tissues aseptically collected: brain, heart, kidney (83 rats only), liver, lung, inguinal lymph tissue, upper and lower intestine, salivary gland with associated lymph tissue, spleen, gonads (25 rats only), and urine or bladder (when <200 µl of urine was available). Oral and rectal swab samples were collected using sterile polyester swabs (Puritan Medical Products Company, Guilford, ME), and fecal pellets were collected when available. All samples were flash-frozen immediately following collection and stored at −80°C. All procedures described in this study were approved by the Institutional Animal Care and Use Committee at Columbia University (protocol number AC-AAAE6805). Targeted molecular analyses. DNA and RNA were extracted from each tissue and fecal sample using the AllPrep DNA/RNA minikit (Qiagen, Inc.) and from urine or serum using the QIAamp viral RNA minikit (Qiagen, Inc.). The extracted DNA was quantified and diluted to a working concentration of ≤400 ng/µl. Extracted RNA was quantified, and ≤5 µg used for cDNA synthesis with SuperScript III reverse transcriptase (Invitrogen) and random hexamers. Samples were tested by PCR for 10 bacterial, protozoan, and viral human pathogens previously associated with rodents using novel and previously published PCR assays, including Bartonella spp., L. interrogans, Rickettsia spp., S. moniliformis, Y. pestis, Cryptosporidium parvum, T. gondii, hepeviruses, hantaviruses (consensus assay), and SEOV (see Table S1 in the supplemental material). Each assay was performed using a subset of the sample types from each rat, selected to include known sites of replication or shedding (Table 1). 
Fecal samples were further analyzed for the presence of the following eight bacterial pathogens commonly associated with human gastrointestinal disease, using PCR-based assays: C. coli, Campylobacter jejuni, C. difficile, C. perfringens, L. monocytogenes, S. enterica, V. vulnificus, and Yersinia enterocolitica (see Table S1 in the supplemental material). PCR was also used to test for the presence of pathogenic E. coli, including enteroinvasive (EIEC, including Shigella), enterohemorrhagic (EHEC), enterotoxigenic (ETEC), enteroaggregative (EAEC), and enteropathogenic (EPEC) E. coli strains, using primers targeting virulence genes (Table S1) (66). In all cases, positive PCR products were confirmed by bidirectional dideoxy sequencing. Before Ro-SaV2 detection in intestinal samples was attempted, intestines were pretreated to remove fecal contamination by thorough washing with phosphate-buffered saline (PBS). To verify the absence of fecal material in the intestines, a PCR assay for cucumber green mottle mosaic virus (CGMMV) was performed on cDNA from paired fecal and intestinal samples from Ro-SaV2-infected animals. CGMMV was present in 10/11 Ro-SaV2-positive fecal samples and likely originated from the cucumber provided as a water source in the traps. However, all eight intestinal samples that were positive for Ro-SaV2 were negative for CGMMV, suggesting true intestinal infection by Ro-SaV2. UHTS. Serum samples and fecal pellets or rectal swab samples were also extracted, using a viral particle purification procedure, for UHTS. Briefly, each sample was successively passed through 0.45-µm and 0.22-µm sterile filters (Millipore) to remove bacterial and cellular debris and was treated with nucleases. Samples were lysed in NucliSENS buffer, extracted using the EasyMag platform (bioMérieux), and prepared for sequencing using the Ion Torrent Personal Genome Machine system, following the methods of Kapoor et al. (19).
Sequencing was performed on pools of four to six samples, which were combined at the double-stranded DNA stage. Viral sequences were assembled using the Newbler or miraEST assemblers, and both contigs and unassembled reads were identified by similarity searches using BLASTn and BLASTx against the GenBank nonredundant nucleotide sequence database (67, 68). Viruses related to those known to cause disease in humans were selected for further study and verified by PCR on original (unpooled) sample material with primers derived from the UHTS sequence data. Confirmed positive results were followed by testing of the serum (n = 114) or fecal samples (n = 133) from remaining animals, and in some cases, subsequent screening of additional sample types from select positive animals (Table 3). One or more positive samples were chosen for further sequencing of phylogenetically relevant genes by overlapping PCR. The 5′ UTRs of NrPV, MPeV, and RPV were determined by rapid amplification of cDNA ends (RACE) using the SMARTer RACE cDNA amplification kit (Clontech). SEOV Baxter qPCR. Primers were designed to target a 121-nt region of the N gene (Baxter.qF, 5′ CATACCTCAGACGCACAC 3′; Baxter.qR, 5′ GGATCCATGTCATCACCG 3′; and Baxter.Probe, 5′-[6-FAM]CCTGGGGAAAGGAGGCAGTGGAT[TAMRA]-3′ [6-FAM, 6-carboxyfluorescein; TAMRA, 6-carboxytetramethylrhodamine]). For tissue samples, viral RNA copy numbers were normalized to the quantity of the reference gene encoding glyceraldehyde 3-phosphate dehydrogenase (GAPDH), whereas the viral RNA copy numbers in serum, oral, and rectal swab samples were reported per ml of serum or PBS wash, respectively (69). qPCR assays were run in duplicate on each sample, and the results were averaged. Samples with an average of ≤2 normalized copies were considered negative. ssqPCR.
For ssqPCR, strand-specific synthetic standards were generated by transcribing positive- and negative-sense RNA in vitro from pCRII-TOPO dual promoter vectors (Life Technologies) containing 310 and 594 nt of the NS3 genes of NrHV-1 and NrHV-2, respectively. Positive- and negative-sense RNAs were synthesized from HindIII- or EcoRV-linearized plasmids by transcription from the T7 or SP6 RNA polymerase promoter. In vitro transcription was carried out for 2 h at 37°C using the RiboMax large-scale RNA production system (Promega) and 500 ng of linearized plasmid. Plasmid DNA was removed from the synthetic RNA transcripts by treatment with DNase I (Promega) for 30 min, followed by purification with the High Pure RNA purification kit (Roche). Purified RNA transcripts were analyzed on the Agilent 2100 Bioanalyzer, and RNA standards were prepared by serial dilution in human total RNA. cDNA from both strands was generated using strand-specific primers containing a tag sequence at the 5′ end (see Table S2 in the supplemental material) (70). The RNA was preheated at 70°C for 5 min with 10 pmol of specific primer and 1× reverse transcriptase buffer, followed by the addition of a preheated reaction mixture containing 1 mM MnCl2, 200 µM each deoxynucleoside triphosphate (dNTP), 40 U RNaseOUT, and 1 U Tth DNA polymerase (Promega). The reaction mixtures were incubated at 62°C for 2 min, followed by 65°C for 30 min. The cDNA was incubated with preheated 1× chelate buffer at 98°C for 30 min to inactivate the Tth reverse transcriptase before exonuclease I treatment to remove unincorporated RT primers (New England Biolabs). Reaction mixtures lacking RT primer were included to control for self-priming, the strand specificity of each primer was assessed by performing the RT step in the presence of the uncomplementary strand, and reaction mixtures lacking Tth DNA polymerase were included to control for plasmid DNA detection. 
ssqPCRs were performed using TaqMan universal master mix II with primers, probe, and 2 µl of cDNA under the following conditions: 50°C for 2 min, 95°C for 10 min, and then 40 cycles of 95°C for 15 s, 50°C for 20 s, and 72°C for 30 s (see Table S2 in the supplemental material). The specificity of the reaction was monitored by RT and amplification of serial dilutions of the uncomplementary strand. The sensitivities of the ssqPCR assays ranged from 0.35 × 10³ to 3.5 × 10³ RNA copies/reaction mixture volume, and nonstrand-specific amplification was not detected until 3.5 × 10⁷ viral RNA copies of the uncomplementary strand per reaction mixture volume were present (Table S3). Phylogenetic and sequence analyses. Nucleotide or predicted amino acid sequences were aligned with representative members of the relevant family or genus using MUSCLE in Geneious version 7 (Biomatters Ltd.) and manually adjusted. Maximum-likelihood (ML) and Bayesian Markov chain Monte Carlo (MCMC) phylogenetic trees were constructed for each alignment using RAxML version 8.0 and MrBayes version 3.2, respectively (71, 72). ML trees were inferred using the rapid-search algorithm, either the general time-reversible (GTR) plus gamma model of nucleotide substitution or the Whelan and Goldman (WAG) plus gamma model of amino acid substitution, and 500 bootstrap replicates. MCMC trees were inferred using the substitution models described above, a minimum of 10 million generations with sampling every 10,000 generations and terminated when the standard deviation of split frequencies reached <0.01. Phylogenetic analysis of Bartonella was performed by trimming the gltA gene sequences to a 327-nt region (nt positions 801 to 1127) commonly used for taxonomic classification and constructing a neighbor-joining tree using the Hasegawa, Kishino, and Yano (HKY) plus gamma model of nucleotide substitution (13, 73).
Phylogenetic analysis of the flaviviruses was performed by first constructing a tree that included representative viruses across the family using a highly conserved region of the NS5B protein (aa 462 to 802 of tick-borne encephalitis virus; GenBank accession number NP_775511.1), followed by complete NS3 and NS5B amino acid phylogenies constructed separately for the Pestivirus and Hepacivirus/Pegivirus genera. These were rooted based on the relative positions of each genus in the family-level tree. To estimate the temporal and geographic origin of SEOV Baxter in NYC, phylogeographic analysis of the N gene was performed using the MCMC method available in the BEAST package (version 1.8.0) (74). All available full-length or nearly full-length SEOV N gene sequences with published sampling times were downloaded from GenBank, aligned as described above, and randomly subsampled five times to include a maximum of 10 sequences per country per year. The codon-structured SRD06 model of nucleotide substitution was used along with a relaxed, uncorrelated lognormal molecular clock and a constant population size coalescent prior (best-fit model, data not shown). Two independent MCMC chains were run for 100 million generations each, and convergence of all relevant parameters was assessed using Tracer version 1.5. The runs were combined after removing a 10% burn-in, and the maximum-clade-credibility (MCC) tree, including ancestral location-state reconstructions, was summarized. Transmembrane domain prediction was performed using TMHMM 2.0, putative N-glycosylation sites were predicted with NetNGlyc 1.0, and the presence of N-terminal signal peptides was predicted using SignalP 4.1, all of which were accessed through the ExPASy web portal (http://www.expasy.org). RNA secondary structures were predicted by MFOLD and through homology searching and structural alignment with bases conserved in other pestiviruses for NrPV and with parechoviruses, hunniviruses, and rosavirus for RPV (75).
RNA structures were initially drawn using PseudoViewer, followed by manual editing (76). Statistical analyses. Potential associations between the presence of a microbial agent and the age and sex of the rat were explored using chi-square tests performed with SPSS version 21 (IBM, Armonk, NY), with associations considered significant at a level of α = 0.05. Tests for the overall effect of the age category or sex on the number of viruses carried by an individual were conducted using the Kruskal-Wallis test for age and the Wilcoxon test for sex. Patterns of pathogen cooccurrence within a single host were explored using the Fortran software program PAIRS version 1.1, which utilizes a Bayesian approach to detect nonrandom associations between pairs of taxa (77). The C score statistic was employed as a measure of pathogen cooccurrence (78). Nucleotide sequence accession numbers. The GenBank accession numbers for the agents sequenced in this study are KJ950830 to KJ951004. SUPPLEMENTAL MATERIAL Figure S1 Neighbor-joining tree of a 327-nt region of the gltA gene of Bartonella. The sequences derived from this study are indicated by a circle, and those recovered from Rattus norvegicus rats in Los Angeles, China, and Southeast Asia are given by squares and triangles, respectively. Download Figure S1, PDF file, 0.3 MB Figure S2 Predicted RNA secondary structure of the 5′ UTR of NrPV. Completely conserved nucleotides across all pestiviruses are colored in pink, and the conserved structural domains are indicated. Numbers indicate nucleotide positions from the 5′ end. Download Figure S2, PDF file, 0.5 MB Figure S3 (A) Predicted RNA secondary structure of the 5′ UTR of RPV. Completely conserved nucleotides across all parechovirus and hunnivirus 5′ UTR sequences are colored in pink, and the conserved structural domains are indicated. Numbers indicate nucleotide positions from the 5′ end. (B) ML tree of the complete VP1 gene of RPV and feline, bat, and canine picornaviruses. 
Selected representatives of the Enterovirus and Sapelovirus genera are also shown. When the BSP and BPP values are both ≥70%, the nodal support value is given beneath associated nodes in the format BSP/BPP. Download Figure S3, PDF file, 0.8 MB Figure S4 Unrooted ML tree of the complete VP1 gene of Manhattan parechovirus (MPeV) (indicated by a red branch) and representative members of the Parechovirus genus. When the BSP and BPP values are both ≥70%, the nodal support is shown beneath the associated node in the format BSP/BPP. HPeV, human parechovirus. Download Figure S4, PDF file, 0.3 MB Table S1 Primers used for targeted molecular analysis in this study. Table S1, DOCX file, 0.1 MB. Table S2 Primer and probe sequences used in the strand-specific reverse transcription and quantitative PCR assays of NrHV-1 and NrHV-2. Table S2, DOCX file, 0.1 MB. Table S3 Sensitivities and specificities of the strand-specific qPCR assays for NrHV-1 and NrHV-2. Table S3, DOCX file, 0.04 MB.
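The C score named under "Statistical analyses" above can be illustrated with a small sketch. This is not the PAIRS program and not the study's code: it only computes the plain checkerboard score for a presence/absence table, under the common definition in which each pair of agents contributes (R_a − S)(R_b − S), where R_a and R_b are the numbers of hosts carrying each agent and S is the number of hosts carrying both. The function name `c_score`, the agent labels, and the toy data are all hypothetical.

```python
from itertools import combinations


def c_score(presence):
    """Mean checkerboard score over all pairs of agents.

    presence: dict mapping an agent name to the set of host IDs
    in which that agent was detected (hypothetical input layout).
    """
    pairs = list(combinations(sorted(presence), 2))
    if not pairs:
        raise ValueError("need at least two agents")
    total = 0
    for a, b in pairs:
        shared = len(presence[a] & presence[b])  # hosts carrying both agents
        # Number of "checkerboard units" for this pair of agents
        total += (len(presence[a]) - shared) * (len(presence[b]) - shared)
    return total / len(pairs)


# Toy data, invented for illustration only (not values from the study)
toy = {
    "agent_A": {"rat1", "rat2", "rat3"},
    "agent_B": {"rat2", "rat3", "rat4"},
    "agent_C": {"rat5"},
}
print(c_score(toy))
```

A score of 0 means every pair of agents occurs in exactly the same hosts; larger values indicate more segregation between agents. Judging whether an observed score is nonrandom requires comparison against a null model, which this sketch does not attempt.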

              Reuse of public genome-wide gene expression data.

              Our understanding of gene expression has changed dramatically over the past decade, largely catalysed by technological developments. High-throughput experiments - microarrays and next-generation sequencing - have generated large amounts of genome-wide gene expression data that are collected in public archives. Added-value databases process, analyse and annotate these data further to make them accessible to every biologist. In this Review, we discuss the utility of the gene expression data that are in the public domain and how researchers are making use of these data. Reuse of public data can be very powerful, but there are many obstacles in data preparation and analysis and in the interpretation of the results. We will discuss these challenges and provide recommendations that we believe can improve the utility of such data.

                Author and article information

                Contributors
                chm2042@med.cornell.edu
                Journal
                Microbiome
                BioMed Central (London)
                2049-2618
                3 June 2016
                4: 24
                Affiliations
                Dept. of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10021 USA
                Article
                168
                10.1186/s40168-016-0168-z
                4894504
                27255532
                8fc19c0b-3412-4338-9ed7-0ca34d0940b6
                © The MetaSUB International Consortium. 2016

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 17 December 2015
                Accepted: 15 April 2016
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000879, Alfred P. Sloan Foundation;
                Award ID: 2015-13964
                Funded by: Irma T. Hirschl and Monique Weill-Caulier Charitable Trust
                Award ID: Mason001
                Funded by: WorldQuant Foundation
                Funded by: FundRef http://dx.doi.org/10.13039/100007204, Vallee Foundation (US);
                Award ID: NA
                Funded by: FundRef http://dx.doi.org/10.13039/100000002, National Institutes of Health;
                Award ID: F31GM111053
                Award ID: R25EB020393
                Categories
                Meeting Report
                Custom metadata
                © The Author(s) 2016

