3
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Is stool frequency associated with the richness and community composition of gut microbiota?

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Background/Aims

          Recently, a number of studies have reported that the gut microbiota could contribute to human conditions, including obesity, inflammation, cancer development, and behavior. We hypothesized that the composition and distribution of gut microbiota are different according to stool frequency, and attempted to identify the association between gut microbiota and stool frequency.

          Methods

          We collected fecal samples from healthy individuals divided into 3 groups according to stool frequency: group 1, a small number of defecation (≤2 times/wk); group 2, normal defecation (1 time/day or 1 time/2 day); and group 3, a large number of defecation (≥2–3 times/day). We evaluated the composition and distribution of the gut microbiota in each group via 16S rRNA-based taxonomic profiling of the fecal samples.

          Results

          Fecal samples were collected from a total of 60 individuals (31 men and 29 women, aged 34.1±5.88 years), and each group comprised 20 individuals. The microbial richness of group 1 was significantly higher than that of group 3 and tended to decrease with increasing number of defecation ( P<0.05). The biological community composition was fairly different according to the number of defecation, and Bacteroidetes to Firmicutes ratio was higher in group 1 than in the other groups. Moreover, we found specific strains at the family and genus levels in groups 1 and 3.

          Conclusions

          Bacteroidetes to Firmicutes ratio and the abundance of Bifidobacterium were different according to the stool frequency, and specific bacteria were identified in the subjects with large and small numbers of defecation, respectively. These findings suggest that stool frequency might be associated with the richness and community composition of the gut microbiota.

          Related collections

          Most cited references10

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          Microbial Co-occurrence Relationships in the Human Microbiome

          Introduction In nature, organisms rarely live in isolation, but instead coexist in complex ecologies with various symbiotic relationships [1]. As defined in macroecology, observed relationships between organisms span a wide range including win-win (mutualism), win-zero (commensalism), win-lose (parasitism, predation), zero-lose (amensalism), and lose-lose (competition) situations [2], [3], [4]. These interactions are also widespread in microbial communities, where microbes can exchange or compete for nutrients, signaling molecules, or immune evasion mechanisms [4], [5], [6]. While such ecological interactions have been recently studied in environmental microbial communities [7], [8], [9], [10], it is not yet clear what the range of normal interactions among human-associated microbes might be, nor how their occurrence throughout a microbial population may influence host health or disease [11]. Several previous studies have identified individual microbial interactions that are essential for community stability in the healthy commensal microbiota [12], [13], [14], [15], and many are further implicated in dysbioses and overgrowth of pathogens linked to disease [16]. Each human body site represents a unique microbial landscape or niche [17], [18], and relationships analogous to macroecological “checkerboard patterns” [3] of organismal co-occurrence have been observed due to competition and cooperation [5], [9], [19], [20]. For example, dental biofilm development is known to involve complex bacterial interactions with specific colonization patterns [21], [22], [23]. Likewise, disruption of relationships among the normal intestinal microbiota by overgrowth of competitive pathogenic species can lead to diseases, e.g. colonization of Clostridium difficile in the gut [24]. However, no complete catalog of normally occurring interactions in the human microbiome exists, and characterizing these co-occurrence and co-exclusion patterns across body sites would elucidate both their contributions to health and the basic biology of their ecological relationships. Thus, characterizing key microbial interactions of any ecological type within the human body would serve as an important first step for studying and understanding transitions among various healthy microbial states or into disease-linked imbalances. As has been also been pointed out in macroecology, however, the analytical methodology needed to comprehensively detect such co-occurrence relationships is surprisingly complex [25]. Most existing studies employ simple measures such as Pearson's or Spearman's correlation to identify significant abundance relationships [13], [15], [26]. These methods are suboptimal when applied without modification to organismal relative abundances [27]. Since absolute microbial counts are not known and measurements depend on sampling and sequencing depth, an increase in one relative abundance must be accompanied by a compositional decrease in another, leading to spurious correlations among non-independent measurements [28]. In addition, sparse sequence counts can cause artefactual associations for low-abundance organisms with very few non-zero observations [27]. Conversely, association methods such as log-ratio based distances [28] that have been developed specifically for such compositional data are difficult to assign statistical significance, a vital consideration in high-dimensional microbial communities containing hundreds or thousands of taxa. Here, we have addressed these issues to catalog a baseline of normal microbial interactions in the healthy human microbiome. The Human Microbiome Project (HMP) [29] sampled a disease-free adult population of 239 individuals, including 18 body habitats in five areas (oral, nasal, skin, gut, and urogenital), providing 5,026 microbial community compositions assessed using 16S rRNA gene taxonomic marker sequencing [29]. We have developed a suite of methods to characterize microbial co-occurrence and co-exclusion patterns throughout the healthy human microbiome while suppressing spurious correlations. Specifically, these were 1) an ensemble approach including multiple similarity and dissimilarity measures, and 2) a compendium of generalized boosted linear models (GBLMs) describing predictive relationships, both assessed nonparametrically for statistical significance while mitigating the effects of compositionality. Together, these methods provide a microbiome-wide network of associations both among individual microbes and between entire microbial clades. Among the 726 taxa and 884 clades in the HMP data, we examined both intra-body site and inter-body site relationships as a single integrated microbial co-occurrence network. Each relationship represents co-occurrence/co-exclusion pattern between a pair of microbes within or between body sites among all subjects in the HMP (in contrast to studies within single subjects of microbial co-occurrences across biogeography, e.g. [30], [31]). This ecological network proved to contain few highly connected (hub) organisms and was, like most biological networks, scale-free. Co-occurrence patterns of the human microbiome were for the most part highly localized, with most relationships occurring within a body site or area, and there were proportionally few strong correspondences spanning even closely related body sites. Each pair of organisms was assessed for positive (e.g. cooperative) or negative (e.g. competitive) associations, and in many cases these patterns could be explained by comparing the organisms' phylogenetic versus functional similarities. In particular, taxa with close evolutionary relationships tended to positively associate at a few proximal body sites, while distantly related taxa with functional similarities tended to compete. The resulting network of microbial associations thus provides a starting point for further investigations of the ecological mechanisms underlying the establishment and maintenance of human microbiome structure. Results/Discussion We inferred a microbiome-wide microbial interaction network by analyzing 5,026 samples from the Human Microbiome Project (HMP) comprising 18 body sites, 239 individuals recruited at two clinical centers, and 726 bacterial phylotypes detected by 16S rRNA gene sequencing (Table 1). Our study aimed to determine co-occurrence and co-exclusion relationships among the relative abundances of microbial taxa across all individuals, potentially indicative of their ecological relationships. We thus combined two complementary approaches, namely an ensemble of multiple similarity and dissimilarity measures (henceforth “ensemble approach”) and a compendium of generalized boosted linear models (GBLMs, henceforth “GBLM approach”). Both methods were applied to the HMP data to produce microbial interaction networks in which each node represented a microbial clade (taxon or group of taxa) connected by edges that were weighted by the significance of their association (positive or negative). Spurious correlations due to compositional structure of relative abundance data [27] were prevented by a novel bootstrap and re-normalization approach assessing the degree of association present beyond that expected by compositionality alone. We used Simes method followed by Benjamini-Hochberg-Yekutieli false discovery rate (FDR) correction to combine the resulting networks (Figure 1). A detailed final network is provided in Figure S1, with a comparison of all networks in Figure S7 and additional information in Methods. This provided a single global microbial interaction network capturing 3,005 associations among 197 phylotypes, spanning all available body sites from the human microbiome (Figure 2; Table S1). 10.1371/journal.pcbi.1002606.g001 Figure 1 Methodology for characterizing microbial interactions using a compendium of similarity measures. 16S data from the Human Microbiome Project (HMP) were collected from 18 body sites in a cohort of 239 healthy subjects and assessed using 16S rRNA gene sequencing. We analyzed microbial co-occurrence and co-exclusion patterns in these data by developing two complementary approaches: a compendium of Generalized Boosted Linear Model (GBLMs) and an ensemble of similarity and dissimilarity measures. Each approach produced a network in which each node represented a microbial taxon within one body site, and each edge represented a significant association between microbial or whole clade abundances within or across body sites. The resulting association networks produced by each individual method were merged as p-values using Simes method, after which FDR correction was performed. Associations with FDR q-values>0.05, inconclusive directionality, or fewer than two supporting pieces of evidence were removed. This provided a single global microbial association network for taxa throughout the healthy commensal microbiota. 10.1371/journal.pcbi.1002606.g002 Figure 2 Significant co-occurrence and co-exclusion relationships among the abundances of clades in the human microbiome. A global microbial interaction network capturing 1,949 associations among 452 clades at or above the order level in the human microbiome, reduced for visualization from the complete network in Figure S1. Each node represents a bacterial order, summarizing one or more genus-level phylotypes and family-level taxonomic groups. These are colored by body site, and each edge represents a significant co-occurrence/co-exclusion relationship. Edge width is proportional to the significance of supporting evidence, and color indicates the sign of the association (red negative, green positive). Self-loops indicate associations among phylotypes within an order; for a full network of all phylotypes and clades, see Figure S1. A high degree of modularity is apparent within body areas (skin, urogenital tract, oral cavity, gut, and airways) and within individual body sites, with most communities forming distinct niches across which few microbial associations occur. 10.1371/journal.pcbi.1002606.t001 Table 1 16S rRNA gene sequencing data from the Human Microbiome Project used to assess microbial co-occurrence relationships in the human microbiome. Houston St. Louis Body Area/Site Total Total Female Male Total Female Male Oral 3022 2038 840 1198 984 456 528 Buccal mucosa 340 228 92 136 112 53 59 Hard palate 334 221 90 131 113 53 60 Keratinized gingival 337 226 95 131 111 51 60 Palatine Tonsils 340 225 92 133 115 54 61 Saliva 309 227 94 133 82 35 47 Subgingival plaque 341 228 92 136 113 53 60 Supragingival plaque 349 232 97 135 117 55 62 Throat 321 219 92 127 102 46 56 Tongue dorsum 351 232 96 136 119 56 63 Gut 351 228 94 134 123 58 65 Stool 351 228 94 134 123 58 65 Airways 282 190 82 108 92 37 55 Anterior nares 282 190 82 108 92 37 55 Skin 921 554 233 321 367 159 208 Left Antecubital fossa 158 85 37 48 73 25 48 Right Antecubital fossa 160 83 33 50 77 34 43 Left Retroauricular crease 303 198 87 111 105 50 55 Right Retroauricular crease 300 188 76 112 112 50 62 Urogenital 450 286 286 0 164 164 0 Mid vagina 149 93 93 0 56 56 0 Posterior fornix 150 95 95 0 55 55 0 Vaginal introitus 151 98 98 0 53 53 0 Total 5026 3296 2230 2324 1730 1292 1184 We considered microbial associations in a total of 5,026 samples from the Human Microbiome Project (HMP) comprising 18 body sites in 239 individuals recruited at two clinical centers (Baylor College of Medicine, Houston, TX and Washington University at St. Louis, MO), which in total contained 726 reliably detectable bacterial phylotypes. For details of HMP samples and data processing, see [29]. A global network of microbial co-occurrence and mutual exclusion within and among body site niches of the human microbiome Global properties of the microbiome-wide network of microbial associations are summarized in Figures 2 and 3. A dominant characteristic of the network was its habitat-specific modularity. After grouping the 18 body sites into five broad areas (oral, skin, nasal, urogenital, and gut), the large majority of edges were found clustered within body areas (98.54%), and these clusters were sparsely connected through a minority of edges (1.46%). This is confirmed by the network's high modularity coefficient of 0.28 (as defined by [32]) and Markov clustering of the network (see Methods and Figure S2). It has long been observed that sites within the human microbiome are distinct in terms of microbial composition [33], and this proved to be true of microbial interactions as well: microbial relationships within each body area's community were largely unique (Table 2). The microstructure of interaction patterns - and thus in the underlying ecology - was different for different areas, however. For example, all vaginal sites within the urogenital area were interrelated in a single homogeneous community, whereas interactions within the oral cavity suggested microbial cross-talk among three distinct habitats [34]. This can be observed quantitatively based on the proportions of microbial interactions spanning body sites within each area, e.g. 69.57% among the vaginal sites and 53.19% among the oral sites, both exceeding the microbiome-wide baseline. The skin was further unique in that the large amount (57.65%) of its associations related microbes in corresponding left and right body sites (left and right antecubital fossae and retroauricular creases), reflecting consistent maintenance of bilateral symmetry in the skin microbiome. 10.1371/journal.pcbi.1002606.g003 Figure 3 Global network properties summarizing key microbial hubs and interaction patterns. A) Node degree distributions of overall, co-occurrence, and co-exclusion associations in the human microbiome. This is well-fit by a power law with slope −1,7 (dotted red regression line, adjusted R2 = 0.9). Node degree indicates the number of links that connect a node to others in the network. Power law degree distributions, referred to as scale-free, mean that most nodes have only a few edges and are often connected by a few high-degree hub nodes. The top five most connected hubs as indicated in callouts, mainly signature oral taxa including Porphyromonas in the tongue dorsum. B) and C) Node proportions after division of the network into body sites (B) or classes (C). Both pie charts show that the composition of the network (in agreement with underlying data) is skewed towards the oral cavity (B) and its constituent Firmicutes (including Bacilli and Clostridia) (C). (B) further agrees with published measures of body sites' alpha diversity [84]. D) and E) Composition of relationships among microbes grouped according to body site (D) and taxonomic class (E). In E), the first two bars (green and red) include the fraction of all possible edges incident to at least one node representing a class or one of its members (root scaled for visualization). The second two bars (lime and orange) only include pairs of microbes that are members of the same class, again normalized as a fraction of total possible interactions and root scaled. The Bacilli, Bacteroidia, and Fusobacteria contain significantly more negatively associated microbes than expected by permutation testing (see Table S2), and classes overall are depleted for negative associations, indicating that members of the same class tend not to compete strongly with each other in these communities. 10.1371/journal.pcbi.1002606.t002 Table 2 Summary statistics of microbial associations in the normal human microbiota. Edges # of edges Percent Within same body area 2961 98.54% Within same body site 1409 (47.59%) Among skin sites 196 6.52% Between left and right skin sites 113 (57.65%) Within the airways (anterior nares) 31 1.03% Among oral sites 2598 86.46% Between different oral sites 1382 (53.19%) Within the gut 67 2.23% Among vaginal sites 69 2.30% Between different vaginal sites 48 (69.57%) Total 3005 Microbial co-occurrence and co-exclusion relationships summarized within the five major body areas and relationships spanning different body sites within these areas. Percentages are fractions of the total number of edges in the network, while percentages in parentheses represent fractions of edges within each body area. We began decomposing the network by categorizing microbial associations within each body area into body-site-specific relationships of two types: cross-site and within-site interactions. On average, these two classes make up 53.11 and 46.89 percent of the total edges, respectively (Table 2). First focusing on cross-site associations, a majority (66.10%) of such relationships were co-occurrences between the same or taxonomically related clades in proximal or bilateral body sites. This reflects coordinated community structure among ecologically related niches, such as similar dental plaques, vaginal sites, and bilateral skin sites. Body sites specifically connected by many positive associations were either in direct contact (e.g. tongue and saliva), proximal (e.g. sub- and supragingival plaques), or similar in terms of environmental exposure (e.g. bilateral skin sites), thus providing mechanisms to support comparable microbiota and exhibiting high levels of microbial co-occurrence. This pattern held true for the minority (33.90%) co-exclusions as well, with many occurring between bilateral skin sites or within subgroups of the oral cavity [34]. This suggested that the first level of hierarchical co-occurrence structure in this network corresponded with groups of body sites representing distinct microbial habitats. Conversely, within-site relationships showed a much more balanced ratio of microbial co-occurrence (48.26%) vs co-exclusion (51.74%) interactions. Many of the negative within-site relationships were associated with the abundant signature organisms characteristic of each body site [35], for example Streptococcus in the oral cavity and Bacteroides in the gut. The relative abundances of these signature taxa varied greatly among individuals, in some cases (e.g. Bacteroides) spanning from 1% to 97% within a body site across the HMP population. It is generally very difficult to determine from relative abundance measurements alone whether these negative associations represent true anti-correlation (e.g. one organism out-competing another) or overgrowth of one organism while the rest of the population remains unchanged (resulting in a negative correlation due to compositionality of these data). This problem has a long history in quantitative ecology [27], [28]. Our methods generally determine these relationships in the human microbiome to be stronger than what would be expected from compositionality alone (see Methods and Text S1), and the negative interactions detected here are thus likely biologically informative. This is supported by the fact that they are strongest in cases where distinct alternative dominant community members occurred among different individuals (e.g. Prevotellaceae vs. Lactobacillaceae in the vaginal area [36] or Propionibacterium vs. Staphylococcus on the skin [35], [37]). The increase in negative interactions within habitats is also in line with the fact that most competitive mechanisms require proximity or physical contact [38], whereas positive interactions are likely to also occur from microbiome-wide shared environmental exposures. Association properties globally and within body sites demonstrate the basic ecological organization of the human microbiota We further assessed several other measures of network community structure. Globally speaking, the network followed a scale-free degree distribution typical of biological systems, meaning that most clades possessed few interactions but a few clades possessed many (Figure 3A [39]), The network had a low average path length of three (contrasted with six in randomized networks), meaning that short paths existed between most clades [40], and it possessed a low average per-node cluster coefficient (0.1) measuring the local density of connections. Together, these values indicate that the microbial association network is structured to be scale-free and thus robust to random disruption [39], with only sparse local multi-organism clusters. Since these data only describe phylotypes at approximately the genus level, it remains to be seen whether a greater degree of locally clustered functional associations emerges among Operational Taxonomic Units (OTUs), species, or strains within these phylotypes. As the cluster coefficient distribution was not well described by the inverse node degree distribution [41], the network possesses no strong hierarchical modularity despite its scale-freeness, in contrast to the strong habitat-centric modularity. The diversity of microbial interactors (i.e. number of unique phylotypes) within each body site also proved to directly dictate its interaction density (Figure 3B). That is, communities with a greater number of different organisms had a proportionally greater number of positive and negative associations. Within these sites, the number of relationships scaled directly with the number of unique phylotypes (adjusted R2 of 0.75), the only body site with more interactions than expected for its diversity being the tongue dorsum (see also Table S2). This site also harbored the top-ranking hub phylotype (Firmicutes, see Figure 3A). In combination with the behavior of specific microbial hubs as discussed below, this might argue that most microbial taxa form strong metabolic or functional associations with adjacent taxa inhabiting the same body site habitat, allowing consortia to specialize within highly localized microbial niches [33]. When randomizing between rather than within body sites, no body site pairs possessed more cross-site associations than expected (with the slight exception of tongue dorsum), whereas most body sites were significantly enriched for within-site relationships (the only exceptions being posterior fornix, mid-vagina, and antecubital fossae, which tended toward too few phylotypes to reach significance; see Figure 3D and Table S2), again confirming the microbiome's habitat-driven modularity. When calculating network properties in a body-area-specific manner, we found that the overall average path length between nodes in the oral cavity, which contributes most of the samples, was much larger (∼3.4) than those of the other body areas (ranging from ∼1.1 to ∼2.0). In addition to supporting the aforementioned degree of inter-site habitat formation in the oral cavity, this intriguingly suggests that other body sites in which fewer samples are currently available (see Table 1) have not yet exhausted the detection of microbial relationships in the human microbiome. More samples and greater sequencing depth may further improve detection power. Key taxa including members of the Firmicutes act as network hubs coordinating many relationships throughout the microbiome We next examined the associations of individual clades with respect to interaction degree, observing highly connected “hub” clades to be found within each body area. Two classes of hubs appeared in the association network: clades highly connected within one body site, and clades acting as “connectors” between multiple body sites. Hubs included both specific taxa (e.g. Porphyromonas, see Figure 3A, Table S3) and larger taxonomic groupings (e.g. the phylum Firmicutes). Within-site hubs were often, although not always, abundant signature taxa (detailed below), high-degree exceptions including Atopobium on the tongue (28 total associations, 16 within-site) and Selenomonas on both tooth plaques (20 total/19 within and 7 total/3 within for supra- and subgingival, respectively). The latter provides a striking example of the niche-specificity of these low-abundance within-site interactors, as Selenomonas averages only 1.1% and 1.2% of the sub- and supragingival plaque communities, respectively, but associates preferentially (20 of 27, 74%) with members of the greater oxygen availability supragingival community. The clade's detection as a within-site hub thus corresponds with the ecology that might be expected of an organism known to be oxygen-sensitive, fastidious, and grown best in co-culture [42]. Between-site hubs typically operated among body sites within the same area as described above, with two of the five most connected hub clades in the network falling into this connector category linking multiple body sites, Firmicutes and Proteobacteria on the tongue (see Figure 3A). The Firmicutes and Porphyromonas (phylum Bacteroidetes) hubs in the tongue also had the largest numbers of negative connections among all phylotypes, and all of these highly interactive clades centered on the tongue and spanned multiple related oral habitats. Signature clades such as the Firmicutes are of course highly functionally diverse, and this network suggests that the few abundant members in any one habitat [35] might instead serve as “information processors” throughout a body area. In contrast to the low-abundance within-site hubs, this would allow them to provide baseline functionality complemented by distinct, less abundant clades with which they co-occur within differing body site habitats. Correspondingly, Firmicutes and other inter-site hub nodes showed a higher connectivity than the clades with highest intra-site degree (e.g. Bacteroidales in the subgingival plaque). Such clades with unusually frequent inter-site associations are thus outliers relative to the network's overall habitat-specific trend and suggest that inter-site hubs are particularly critical for associating similar sites within the same body area. In the oropharynx, for example, Streptococcus spp. with a modest degree of functional variation might be present throughout the habitat, interacting with distinct, more specialized clades within each body site [13]. Almost all such high-connectivity hubs occurred among oral sites (e.g. Porphyromonas, Streptococcus, Veillonella, and others), the first notable exception being the Propionibacterium hub on skin sites (left and right retroauricular crease). All of these follow the same pattern, however, in which abundant phylotypes likely possessing within-clade functional diversity are distributed among related habitats within each individual. Marked differences in ecological behavior between phylogenetic clades We additionally examined the phylogenetic rather than biogeographical distribution of these associations, testing whether clades tended to support more phylogenetically related (within-clade) or diverse (between-clade) interactions. We first investigated purely quantitative degree distributions by summarizing clades at the class level. Associations were summarized as the fraction of all possible interactions that were observed to occur, separated into positive and negative bins (Figure 3E). In addition, clade-specific over-representation of these bins was tested for significance by randomization (see Methods and Table S2). The only classes that showed significantly more negative (and, simultaneously, cross-clade) associations than expected were the Bacteroidia, Bacilli, and Fusobacteria. Most of the common classes in the human microbiome had more intra-clade edges than expected by chance (Actinobacteria, Bacilli, Bacteroidia, Betaproteobacteria, Clostridia, Epsilonproteobacteria, Fusobacteria, Gammaproteobacteria, Mollicutes, and Spirochaetes), most of which also have high cluster coefficients (Figure S3). Taken together with the biogeographical interactions assessed above, the enrichment for within-class associations likely indicates a phylogenetic aspect of the same behavior. Specifically, if one member of such a class is abundant in one body site within an individual, it (or closely related class members) also tends to be enriched in related body sites. We next considered relationships between class-level clades throughout the microbiome, summarized in Figure 4. Surprisingly, the Actinobacteria and Bacilli form only co-exclusion relationships with other classes, most strongly with Bacteroidia and Fusobacteria, and primarily within the oral cavity. These clades (which include the extremely abundant streptococci) might thus be largely self-sufficient in the functional diversity needed to maintain an oral community, excluding other clades when appropriately supported by e.g. environmental factors. Although a few classes were linked by positive as well as negative interactions (e.g. Clostridia and Bacteroidia), none of these reached significance on randomization. Classes connected by both positive and negative links might suggest either that the clades exhibit co-occurrence only in some environments or that some members of the two classes co-occur while others co-exclude. As the oral communities are both the most data-rich and the most alpha-diverse in the human microbiome [35], it is not surprising that most relationships are observed within and among them. For instance, 97% of the specific mutual exclusions between Bacilli and Bacteroidia members occur in oral sites, as do 81% of the members of the Clostridia and Bacteroidia. The second largest contribution to the latter exclusion (∼18%) comes from the gut, reflecting the frequently discussed Bacteroides/Firmicutes ratio observed in Western populations [15], [43], and similar tradeoffs (with few positive associations) were observed in other habitats such as the skin (e.g. Staphylococcus in the Bacilli and Propionibacterium in the Actinobacteria [37]). 10.1371/journal.pcbi.1002606.g004 Figure 4 Co-occurrence of microbial clades within and among body areas. Nodes represent microbial classes colored by phylum, with edges summarizing aspects of their interactions over all body sites. Classes are linked when the number of edges between them is significantly larger than expected (randomization p 5% abundance. While a trivial explanation might be misclassification of a small portion of Bacteroides, this trend might instead suggest co-occurrence in a metabolic niche that rarely favors either organism but, in the rare occasion of favoring one, permits both. Conclusions We analyzed ecological interactions among bacteria in the human microbiome using 16S marker gene abundance data from the Human Microbiome Project. Our methods for building a microbiome-wide microbial association network combined two complementary approaches: an ensemble of similarity/dissimilarity measures and a compendium of generalized boosted linear models. Relationship significance was assessed using a novel nonparametric approach to compositional data analysis, resulting in a network of co-occurrence and co-exclusion relationships representing potential microbial interactions and incompatibilities within and across body sites. Analysis of the network demonstrated strong organization of the human microbiota into body area niches, mostly among closely related individual body sites representing microbial habitats. A few “hub” microbes were observed to act as signature taxa driving the composition of each microcommunity. Many of these were also the dominant species within a body area, for example Streptococcus in the oral cavity and Bacteroides in the gut, and these highly abundant taxa also frequently co-occurred as connectors among multiple related body sites. In vivo mechanisms were available from prior work for many of these associations, and more generally the phylogenetic and functional relatedness of pairs of co-occurring microbes often explained their associations. In particular, phylogenetically related microbes tended to co-occur at proximal or environmentally similar body sites, while distantly related microbes with shared functional capacities tended to compete. This microbial association network was described from observational data, and the mechanisms underlying any of these putative interactions may be quite diverse. Positive co-occurrence association types could include nutritional cross-feeding, co-aggregation, co-colonization, signaling pathways, and co-survival in similar environments [4], [63]. Negative exclusion interactions likewise might span toxin or small molecule production, environmental modification (to the detriment of microbial neighbors), immunomodulation, or gross overpopulation of a niche. Ecologically, these data alone do not resolve variations of mutualism, commensalism, amensalism, or predator-prey relationships [4], [63]. Further, all of these ecological relationships, detected here based on microbial abundance patterns across many subjects, are themselves distinct from the biogeographical “co-occurrence” patterns observed by previous studies of individual microbes within subjects [30], [31]. To distinguish between these, future work could include perturbation experiments (e.g. the removal of a species from a defined habitat such as the gut of a gnotobiotic mouse), as these are becoming less difficult to sustain technically [64]. Analytic refinements might instead include defining directionality of relationships in higher-resolution (e.g. temporal) data; for instance, we expect a strict mutualistic relationship (where both partners cannot exist without the other) to be symmetric, whereas the relationship between a prey and a specialized predator is expected to be asymmetric (the prey can occur without its predator, but not vice versa). Negative co-exclusions may have fewer possible initial interpretations, comprising the types of competition outlined above, or they may indicate different, exclusive microbial community states occurring temporally or as linked to host environment [15], [36], [58]. Methodologically, it is again important to emphasize that detecting significant co-occurrences among members of a population assayed as relative abundances can be surprisingly difficult due to compositionality [27]. That is, an absolute increase in one organism's abundance can result in an apparent relative decrease of all other abundances, leading to spurious correlations. Extensive prior work has explained the problem in microbial and macroecological settings [65], and we have mitigated potential issues in these data through our ensemble approach and by principled calculation of significance thresholds using null distributions that incorporate the degree of similarity due solely to compositional effects (see Methods and Text S1). GBLMs were the most distinct method included in this ensemble and share some similarities with recently proposed genetic regulatory network (GRN) reconstruction techniques [66]. GBLMs do provide methodology for discovering GRN-like higher-order interactions in microbial communities, but the accuracy needed to overcome the associated multiple hypothesis testing problems is not yet achievable from available 16S data [67], [68]. We anticipate that future studies with species- or strain-level classification of deep shotgun metagenomic sequences may provide sufficient resolution for more detailed networks including such cooperative microbial associations. While it might be hoped that easily sampled microbiomes such as the saliva would serve as proxies for e.g. the broader oral microbiome, these results suggest that this is not generally the case. There are few strong correspondences among organisms even at closely related body sites, let alone distal sites, and very few cases where microbial abundance is quantitatively predictable from a proxy sample. In the HMP, this may be a feature of a healthy population, and additional relationships (or disruption of existing ones) might emerge in the presence of disease. Environmental factors that strongly impact the healthy microbiome may additionally not be captured for this population (e.g. diet) and can be further investigated by targeted methodology in future cohorts. This catalog of microbial co-occurrence and co-exclusion relationships thus provides an initial glimpse of potential mechanisms of community organization throughout the human microbiome. While this computational methodology can be applied to any communities assayed using marker gene sequencing, it is interesting to conclude by noting that the resolution of the resulting network is limited by the specificity of 16S sequence binning. The network discussed here, for example, leverages two specific hypervariable regions for taxonomic classification, each with strengths and weaknesses, and neither individually adequate for sequence classification at the species level [69], [70]. Since it is likely that additional microbial associations will occur at the species or strain level, we anticipate that further community structure will emerge during analysis of metagenomic shotgun sequences taxonomically binned at a finer level of detail. Community shotgun sequencing will also provide functional information regarding metabolism, signaling, and, again, potential physical mechanisms of interaction, which can in turn be matched against complete reference genomes for co-occurring strains. Perturbation analyses in co-culture or, eventually, longitudinal studies in human cohorts will provide an intriguing means of investigating the impact of these microbial “wiring” diagrams on human health. Methods Two complementary approaches, an ensemble of multiple similarity/dissimilarity measures and a compendium of generalized boosted linear models (GBLMs), were used to interrogate significant associations between microbial abundances. These were drawn from 18 body sites assayed by the Human Microbiome Project at two clinical centers using 16S rRNA gene sequencing. Simes method and Benjamini-Hochberg-Yekutieli false discovery rate (FDR) correction were used to combine the resulting networks. From this merged, global network, we summarized overall network properties (degree distribution, modularity, etc.), assessed patterns of microbial connectivity within and among body sites, and identified highly connected (hub) microbes. Phylogenetic and functional distances were calculated based on 16S rRNA gene sequence similarity and shared orthologous gene families, respectively, and combined with the network. Please see Text S1 for an extensive discussion of the methodology used to assess relationship significance in compositional data, which is presented in summary below. 16S data acquisition and processing The 16S rRNA gene-based dataset of the normal (healthy) human microbiome was made available through the Human Microbiome Project (HMP) and is detailed in [29]. Briefly, it consists of 454 FLX Titanium sequences spanning the V1 to V3 and V3 to V5 variable regions obtained for 239 healthy subjects enrolled at clinical sites in Houston, TX and St. Louis, MO. These cover 18 body sites covering five areas: the oral cavity (nine sites: saliva, tongue dorsum, palatine tonsils, keratinized gingiva, hard palate, buccal mucosa, throat, and sub- and supragingival plaques), the gut (one site: stool), the vagina (three sites: introitus, mid-vagina, and posterior fornix), the nasal cavity (one sample: anterior nares), and the skin (four sites: left and right antecubital fossae and retroauricular creases). Sequences of both 16S windows were processed separately using mothur [71] into phylotypes using the RDP taxonomy as described in [29] and [68], with full protocols also available on the HMP DACC website (http://hmpdacc.org/HMMCP). Genus level and above phylotypes were used for this analysis, for which the datasets from both windows were combined. This resulted in more than 5,000 samples comprising 910 taxa made available as part of the HMP (http://hmpdacc.org/HMMCP). These were further processed for this study by excluding any phylotype not supported by at least two sequences in at least two samples. Samples were removed as suspect if the most abundant taxon was detected by fewer than 1% of the sequences supporting it in the sample in which it was most abundant, and counts for the remaining 726 taxa were converted to relative abundances in each of the resulting 5,026 samples. Due to potential differences between clinical centers, the dataset was conservatively split into two subsets for further analysis, subjects recruited in Houston (3,296 samples) and those recruited in St. Louis (1,730 samples). Generalized Boosted Linear Models GBLM definition and construction For each resulting dataset, a compendium of generalized boosted linear models (GBLMs) was constructed by selecting all 324 combinations of source body sites ss and target sites ts. Each GBLM was fit using the abundances of all source taxa st within the source site to predict the abundance of each target taxon tt within the target site using a sparse linear model of the form: All additional non-leaf clades in the RDP [72] taxonomy (i.e. families, orders, etc. up to the bacterial and archaeal domains) were included as source and target taxa. For ss equal to ts, i.e. predicting the abundance of a taxon tt when the abundances of all taxa in the same body site are known, the abundances of all parent and descendant clades of tt were excluded from the available source taxa st. That is, when ss = ts and tt was of the form domain|phylum|class|…|clade, all source taxa of the form domain, domain|phylum, domain|phylum|class, etc. or of the form domain|phylum|class|…|clade|subclade, domain|phylum|class|…|clade|subclade|subsubclade, etc. were excluded from the source taxa st. This prevented the abundances of xtt ,ts from being predicted using abundances xst ,ss on which they were directly dependent, while allowing the detection of predictive relationships between distinct clades within the same body site. The linear model was generalized to include binary categorical target taxa (in this case only gender and ethnicity) using standard logistic regression: As this is clearly an extremely high-dimensional problem, multiple a priori and post hoc steps were taken to enforce model sparsity and to avoid overfitting for each (ss, ts, tt) tuple. The first of these was to exclude from the available st any taxon not correlating with tt at a Spearman correlation of nominally p<0.05. The second was to boost linear model fitting rather than attempt to fit all βtt ,ts,st,ss simultaneously [73]. Boosted linear models retain the usual L2 least squares penalty, but are constructed in a manner similar to sparse forward variable selection or the LASSO [74]. βs are considered for inclusion in the model one at a time and the parameter minimizing sum of squared error selected and included. However, each subsequent round of parameter fitting operates on the residuals of all previous rounds, thus “upweighting” poorly fit examples, and the inclusion of further non-zero βs stops after a fixed number of iterations. This tuning parameter and the model fitting process was 10-fold cross-validated and selected from the most accurate (by root mean square error for continuous tt and AUC for binary) of 50, 100, or 150 boosting iterations using the caret [75] and mboost [76] R packages. This resulted in a compendium of 7±4.9 non-zero parameters β, each evaluated with a 10× cross-validated R2/AUC and a nominal R2/AUC on the full dataset. Final model quality scores were assigned by A) subtracting AUCs below 0.5 from one, since caret does not calculate AUCs directionally, and B) retaining the minimum of the cross-validated and nominal R2/AUC. Any continuous model not achieving an R2 above zero after adjustment for the number of non-zero parameters ( ), parameters p, training samples n) was discarded (55,424 retained). GBLM filtering and significance Even from cross-validated goodness-of-fit scores, the compositional structure of relative abundance data prevents straightforward assessment of model significance (see Text S1). We thus additionally fit twenty models per (ss, ts, tt) tuple after bootstrap re-sampling the values of tt across samples. The AR2/AUC of these bootstrap models provided a confidence interval around the observed AR2/AUC. Any model for which the 90% confidence interval failed to include the observed AR2/AUC was discarded. To construct the null distribution of associations due to compositionality alone, we fit twenty additional models after permuting the values of tt across samples and renormalizing them sample-wise, thus retaining compositional effects but breaking true associations. The GBLM was re-fit and the resulting null distribution of AR2/AUC values used to assess significance of the true model. The mean and standard deviation of the bootstrap distribution was z-tested against this null distribution using the jointly pooled standard deviation, providing one p-value per model per clinical center. Any model with FDR adjusted p-value greater than 0.05 was discarded (18,286 retained). Ensemble scoring Data preprocessing The 16S data described above were first normalized by dividing each sample by its total phylotype sum. Mislabeled samples were removed [77] and samples were again processed as two subsets, one per clinical center (Houston and St. Louis). These were encoded as a matrix in which each row represented a phylotype in a specific body site and each column represented an individual during one sampling visit. Rows with more than 2/3 zero counts were removed, leaving matrices of 1,217 (Houston) and 1,408 (St. Louis) phylotype-bodysite composite features collected for 248 and 144 subject-visit points, respectively. Ensemble score calculation We built on composite co-occurrence scores as described in, for example, [78] (for protein functions) and [19] (graph clustering) to find groups of microbial lineages co-existing across a large number of environments. Specifically, we combined four diverse measures in order to overcome two major challenges in the inference of co-occurrence networks, particularly appropriateness of scoring measures to sparse count data and determination of statistical significance. The first is exemplified by the double-zero problem, in which a zero indicates either that an organism is absent or that it is below detection limit [25]; Pearson and Spearman correlations are sensitive to this, while the Bray Curtis dissimilarity is not. The latter issue arises from the need to normalize across samples with unknown absolute abundances, either by relativizing or by downsampling; either procedure results in constrained sample sums, which introduce artificial correlations [27]. We thus employed an ensemble approach combining four diverse measures: two measures of correlation (Pearson and Spearman) and two measures of dissimilarity (Bray-Curtis (BC) and Kullback-Leibler (KLD)). For BC and KLD calculations, rows were divided by their sum prior to computation. Additional measures were considered for our ensemble, including the Hellinger, Euclidean and variance of log-ratios, but these proved to be well-represented by the smaller final ensemble (see Figure S6). Ensemble network building After running each of the above measures on the two 16S data matrices, one per clinical center, we set measure-specific thresholds as a pre-filter such that each measure contributed 1,000 top-ranking and 1,000 bottom-ranking edges to the network. Edge scores were computed only between clade pairs without parent-descendant relationship (e.g. without pairs of the type Actinobacteridae|Actinomycetales or Actinomycetales|Propionibacterineae) for clades in the same body site. To assign statistical significance to the resulting differently-scaled scores, we first computed edge- and measure-specific permutation and bootstrap score distributions with 1,000 iterations each. In order to address the compositionality issues discussed above [27], we re-normalized the data in each permutation, providing a null distribution that captures the similarity introduced by compositionality alone (see Text S1). We then computed the p-value as above by z-scoring the permuted null and bootstrap confidence interval using pooled variance. P-values were tail-adjusted so that low p-values correspond to co-presence and high p-values to exclusion. For BC and KLD, we did not compute re-normalized permutations, because these measures are intrinsically robust to compositionality [28]. Instead, we calculated their p-values using the bootstrap interval compared to a point null value that was computed by permutation. Finally, to remove unstable edges, we removed all edges whose score was not within the 95% confidence interval (limited by the 2.5 and 97.5 percentiles) of the bootstrap distribution. Additionally, a number of BC-supported negative links were removed because they were due to abundance profiles including one extreme outlier. This affected the following clades for St. Louis: Actinomycetales in stool, Corynebacterium and Corynebacteriaceae in the tonsils, Lactobacillus and Lactobacillaceae in the anterior nares and for Houston: an unclassified Neisseria in the left retroauricular crease. Mitigating the compositional effect in relative abundance analysis Data summarized as relative abundances are referred to as compositional [28]. Because they sum to one, their elements are not independent and may exhibit spurious correlations regardless of the true underlying relationship. To mitigate this effect, we assessed the significance of our ensemble scores and GBLMs as described above, each by comparing a bootstrap confidence interval around the observed score with a permuted null distribution that includes repeated renormalization to account for compositional effects alone. In each permutation, only the target taxon row was randomized and all samples subsequently renormalized to a constant sum. Because permutation breaks correlation structure while renormalization reintroduces compositionality, a null distribution coupling these elements induces correlation from compositional structure alone. Comparing this null distribution to a standard bootstrap confidence interval around the observed value provided a straightforward nonparametric test of association accounting for compositionality. Simulation studies showed that the bootstrap-renormalization scheme was successful in discounting compositional effects while preserving true correlations (see Text S1). Network merging by Simes method and FDR correction Simes method was used to combine all ten networks (5 methods×2 study centers) into one final network, as it is robust against non-independent tests [79]. A strict intersection of the two clinical centers' networks, rather than a p-value combination, was also examined and found to be over-stringent due to systematic differences in the data (Figure S8). After merging, p-values on each final edge were corrected to FDR q-values using the Benjamini-Hochberg-Yekutieli method and a q-value cutoff of 0.05 was applied. The positivity or negativity of each relationship was determined by consensus voting over all integrated data sources, ranging from −10 (most negative) to 10 (most positive, see Figure 2). Edges with indeterminate directionality (direction score of zero) were removed. Finally, only edges with at least two (out of ten) supporting pieces of evidence were retained. Computation of network modularity The formula by Clauset et al. [32] compares the fraction of edges within input clusters with the fraction of within-cluster edges that would be expected for a randomized network. We clustered the network using the Markov cluster algorithm (MCL) [80] and computed network modularities for a range of inflation parameters. The strongest modularity (0.28) was measured for an inflation of 1.3 and is slightly below the cut-off recommended by Clauset et al.. This modularity was, however, higher than any measured for 100 randomized networks (which preserved node and edge number, but in which edges were randomly re-assigned) and was therefore retained as significant. Assessment of significant connectivity density within and among body sites and classes To assess whether specific body sites were more connected than expected by chance, we repeatedly (1,000 times) selected as many nodes as a body site contains from the global network at random and counted their edge number. This resulted in a distribution of edge numbers for random node sets. To retain only plausibly significant edges for further calculation, we corrected for multiple testing by multiplying the nominal p-value from this distribution with the number of tests carried out and retaining values below 0.05. We repeated this test separately for positive, negative, intra- and cross-edges. For visualization, the network itself was also separated into within-site, within-area, and between-area subsets for further inspection (Figure S9). Assessment of relationships between body-site-specific and class-specific clade groups To assess the interaction strength between body site and clade groups (Figures 4 and 5), the number of relationships between nodes in each group pair was counted and normalized by the group member product, which represents the number of links two groups can potentially form. This was repeated in 1,000 randomized networks (generated as described for the computation of network modularity). A p-value was computed from the count distribution and node group relationships with p-values above 0.05 were discarded, retaining only the fractions of total possible relationships which were significantly higher than those expected by chance. Phylogenetic and functional similarity scores Phylogenetic distances Genome sequences of 1,107 organisms were retrieved from NCBI (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi, December 2010) and 16S sequences were extracted. These 16S sequences were aligned using MUSCLE 3.8.31 [81] and a full phylogenetic tree reconstructed by maximum likelihood using FastTree 2.1 [82]. A matrix of all pair-wise distances was created from this cladogram and distances between any two nodes (e.g. families, orders, etc.) calculated by taking the median of all distances (as provided in units from FastTree) between all pairs of leaf taxa descending from the two nodes. Functional distances Functional complements of the same genomes were summarized using COG [83] families as assigned by NCBI annotations. This resulted in an abundance matrix with 4,685 columns (corresponding to COG families) and one row per genome. Columns summing to less than 10% of the number of genomes were removed, resulting in 3,514 usable COG families. Pairwise scores between genomes were calculated using Jaccard index, with distances for higher-level clades computed using medians in the same manner as described above. Final functional distances were represented as the Jaccard index of non-shared COG families between pairs of genomes. Supporting Information Figure S1 Significant co-occurrence and co-exclusion relationships among the abundances of clades in the human microbiome. The network displays all significant phylotype associations within and across the 18 body sites sampled by the HMP. Nodes represent phylotypes (colored according to the body site in which they occur) whereas edges represent significant relationships between phylotypes. Edge thickness reflects the strength of the relationship, and edge color its directionality (green co-occurrence, red co-exclusion). (PDF) Click here for additional data file. Figure S2 Markov clustering of the complete phylotype network. Markov-clustered network (inflation parameter: 1.3). When clustering the cross-body site network with this inflation parameter giving optimal modularity, the network splits into the set of depicted clusters (75 in total). Many of them are specific to body sites (stool, anterior nares) or areas (mouth, vagina, skin). (TIF) Click here for additional data file. Figure S3 Cluster coefficients of association networks within individual body sites and clades. Average cluster coefficients (computed with tYNA [85]) of body-site-specific (A) and class-specific (B) sub-networks. The “cliquishness” of each node within a body site or class is expressed by the average cluster coefficient, which is higher when the neighbors of each node are also connected among themselves. It can be zero if none of the nodes in the sub-network has inter-linked neighbors. The cluster coefficient was computed for all edges of a sub-network (gray bars) and for positive (green bars) and negative edges (red bars) separately. Strikingly, almost none of the negative-edge-only sub-networks had cluster coefficients above zero. In the case of the negative class sub-networks, this is a consequence of the low number of intra-class negative edges (see Figure 3E). If a negative-edge-only sub-network has a cluster coefficient of zero, it means that neighbors of a node are either not interconnected at all or that they are interconnected only by positive edges. Within the body sites, groups of phylotypes linked by negative edges likely reflect alternative communities. Members of these communities are linked among themselves by positive edges. Thus, if the positive edges are removed, the neighbors of negative nodes are no longer interlinked and the average cluster coefficient becomes zero. The high positive-edge-only cluster coefficients in classes correspond well to the high positive intra-edge number in these classes (see Figure 3E) and mean that if one member of the class is present in an individual, the other members are also likely present. (TIF) Click here for additional data file. Figure S4 Co-exclusion of Tannerella and Streptococcus in the subgingival plaque. The anaerobic and proteolytic Tannerella requires a lower pO2 than Streptococcus, while Streptococcus is an asaccharolytic colonizer of the tooth surface that uses sugars as its primary source of carbon [49], [50]. Between the supragingival and the subragingival plaques, as well as within the subgingival plaques, a gradient of nutrition and oxygen is present. The gradual drop of the abundance of Tannerella as the streptococci increase reflects the continuous nutritional and oxygen gradient between and within the supragingival and the subgingival biofilms. (TIF) Click here for additional data file. Figure S5 Abundances of 18 putative associations between oral and gut microbes. Quality control plots of the raw data for all putatively significant oral/gut microbial associations showed no strong evidence for microbial transfer from the oral cavity along the digestive tract at the available level of detection. For GBLM associations, plots show predictions from the full linear model (x axis) against observed values (y axis) with the line of unity drawn as a guide, with data from the two clinical centers distinguishable by color (orange = Baylor College, purple = Washington University). None of the significant associations proved to be substantially robust from any of the nine oral body sites to gut microbes. (TIF) Click here for additional data file. Figure S6 Repeatability of network inference using seven individual similarity/dissimilarity measures with the Houston data subset. The 2,000 most extreme (1,000 top- and bottom-scoring) edges were computed for each measure in the Houston sample subset. Measure similarity was then computed as the Jaccard index of edge overlap. Abbreviations: KLD = Kullback-Leibler dissimilarity, Var-Log = variance of log-ratios, a measure recommended by Aitchison to compute associations between parts of compositions [28]. (TIF) Click here for additional data file. Figure S7 Agreement between association networks produced by individual similarity measures and datasets. Heat map depicting the edge overlap as measured by the Jaccard index between the different methods and sample sets (Houston versus St. Louis) employed. By design from our ensemble of scoring measures, which were chosen to capture different types of microbial co-occurrences, the networks are first grouped by measure into correlations (Pearson, Spearman), GBLMs, and dissimilarities (KLD, Bray-Curtis). Each of these clusters then differentiated further according to sample set (e.g. Spearman and Pearson in Houston versus Spearman and Pearson in St. Louis). (TIF) Click here for additional data file. Figure S8 Intersection of networks generated independently for the Houston and St. Louis clinical center sample subsets. Our co-occurrence/exclusion network built on the combination of p-values for microbial interaction from 10 distinct networks, generated by five methods in each of two sample subsets from the HMP's Houston and St. Louis clinical centers. We examined the feasibility of treating these two clinical centers as replicates rather than semi-independent observations by performing a hard intersection, i.e. applying Simes method to each set of five methods separately and retaining only the edges significant in both. This intersection retained only 499 nodes and 938 edges, almost all of which (902, 96%) were contained in the complete network. This represents approximately 30% of the edges in the complete network, with the remainder made up of significant relationships confidently detected at only one clinical center. As the two clinical centers differed systematically in minor technical details such as input DNA concentration and chimerism during 16S sequencing [68], treating these as non-independent but non-replicate observations likely represents a more complete model of the HMP data's microbial co-occurrence and exclusion networks. (TIF) Click here for additional data file. Figure S9 Co-occurrence and exclusion relationships within each body site, within body areas, and between body areas. Sub-networks consisting of (A) 1,409 edges among clades within one body site, (B) 1,552 edges spanning body sites within the same area (such as the oral cavity or vagina), and (C) 44 interactions between distinct body areas. (TIF) Click here for additional data file. Table S1 Complete network of co-occurrence and co-exclusion relationships among clades in the human microbiome. Each row represents an association between two clades in specific body sites, together with their supporting methods, sign of association (positive or negative), and FDR-corrected significance. (XLSX) Click here for additional data file. Table S2 Over-representation of associations between organisms in body-site- and clade-specific subnetworks. Over-representation of edges within and between body-site- and clade-specific sub-networks was assessed at the class level by computing edge numbers in 1,000 randomly selected sub-networks of equal node number to the sub-network(s) of interest. This table gives the Bonferroni-corrected p-value and the median edge number of the random sub-networks in brackets in the form (total, p-value, random). Nominally significant values below 0.05 are highlighted in green. Tongue dorsum is the only body site with a significant over-representation of negative edges, and the Bacilli, Bacteroidia and Fusobacteria are the only clades with significant negative relationships. (XLSX) Click here for additional data file. Table S3 Negative and positive association degrees of individual body site clades' nodes. For each clade in the network, the number of total, positive and negative links to other clades is listed in descending order. In addition, each clade's number of intra- and cross-body-site links is given. The top hub nodes highlighted in Figure 3A thus appear as the first four table rows. (XLSX) Click here for additional data file. Text S1 Description of ReBoot. Detailed description of the permutation-renormalization and bootstrap (ReBoot) method, which is designed to assess the significance of associations in compositional data. (DOCX) Click here for additional data file.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            The Microbiota Mediates Pathogen Clearance from the Gut Lumen after Non-Typhoidal Salmonella Diarrhea

            Introduction Bacterial diarrhea is a global cause of morbidity and mortality. In most cases, the acute disease symptoms cease after a few days and the pathogen is eliminated from the gut. However, the mechanisms eliminating enteropathogenic bacteria from the gut lumen are poorly understood. Most “classical” effector mechanisms of the immune system are ineffective in the gut lumen (i.e. complement-mediated killing, opsonophagocytosis, T-cell mediated toxicity). In the gut, innate and adaptive immune responses such as antimicrobial peptides, natural and pathogen-specific mucosal secretory IgA (sIgA) antibodies are considered to be cardinal defense mechanisms. In addition to the host's immune system, the highly dense and diverse bacterial community in the gut (the microbiota; >500 different species [1], [2]) plays a key role by inhibiting pathogen growth in the gut lumen right from the beginning. This phenomenon is referred to as ‘colonization resistance’ and efficiently blocks infections by Clostridium difficile, Salmonella spp. and many other pathogenic bacteria [3]. Colonization resistance might be based on nutrient limitation, release of inhibitory metabolites, production of bactericidal compounds, the competition for binding sites and other, unidentified features of the dense microbial community [4], [5]. Much less is known about the mechanisms clearing enteropathogenic bacteria from the gut lumen once they have established an infection in this niche. ‘Pathogen clearance’ differs significantly from colonization resistance as both, the mucosa [6] and the microbiota, must recover from pathogen-inflicted disturbance while eliminating the pathogen [7]. Here, we have studied the mechanisms of pathogen clearance from the gut lumen using the example of non-typhoidal Salmonella (NTS) diarrhea. NTS infections, including S. enterica spp. I serovar Typhimurium (S. tm), account for a significant share of food-borne diarrhea in Europe and Northern America. In sub-Saharan Africa, NTS are also an important cause of invasive disease with high mortality, particularly in HIV infected individuals [8]. In humans, colonization resistance confers partial protection, but antibiotic treatment increases the risk of Salmonella diarrhea [9], [10]. In the typical cases of NTS diarrhea, the pathogen begins to grow in the gut and disease symptoms manifest eight to 24h after consumption of contaminated food or water. Usually, the pathogen remains limited to the gastrointestinal tract and diarrhea subsides within several days. After cessation of symptoms, Salmonella remains detectable in the stool for weeks, several months or sometimes even longer [11], [12]. Pathogen clearance seems to fail in these long-term ‘asymptomatic excretors’. This is problematic, as ‘asymptomatic excretors’ pose a significant risk of transmission, in particular when food workers in restaurants or the food industry are affected [13]. So far, we can only speculate about mechanisms mediating pathogen clearance from the gut lumen. Antimicrobial peptides might be involved in some infections, but should not affect S. tm clearance, as this pathogen is particularly resistant against this type of compound [14], [15]. Antibody responses, i.e. pathogen-specific secretory IgA (sIgA), might also clear pathogens from the gut lumen. S. tm elicits profound antibody responses against LPS and protein antigens [16]. In systemic infection models antibody responses can confer some degree of protection [17], [18]. Previous work on the role of sIgA in intestinal S. tm infection yielded conflicting results. sIgA protected cultured epithelial cells from S. tm infection, but did not reduce intestinal pathogen densities [18]. Similar findings were made for the enteropathogenic bacterium Citrobacter rodentium [19]. However, the role of sIgA in pathogen clearance in models of acute Salmonella enterocolitis with high intestinal pathogen loads has not been addressed so far. Finally, we reasoned that the microbiota itself might contribute to pathogen clearance. It remained to be shown which mechanisms contribute to pathogen clearance. We have used a Salmonella diarrhea mouse model to analyze the relative importance of sIgA and the intestinal microbiota in S. tm clearance after infection. In mice, the intestinal microbiota confers colonization resistance. Normally, 0.05 vs. colitis in mock immunized controls). Thus, S. tm att-immunized mice mounted an adaptive immune response which protected from mucosal disease on re-infection with the pathogen in an O-antigen-dependent way. S. tm att induces pathogen-specific sIgA The exquisite O-antigen specificity of protection from a second round of inflammation suggested that adaptive immunity and particularly sIgA may be the crucial mechanism not only for preventing inflammation on re-infection, but also for clearing pathogens from the gut. Therefore, we determined the kinetics of the Salmonella-specific humoral immune response by measuring specific Ig via surface staining of live, intact bacteria by flow cytometric analysis (Fig. 2C). This assay accurately differentiates S. tm specific antibodies from antibodies directed against closely related species, such as E. coli [30] (Fig.S1A–C). S. tm-specific IgM, IgG and IgA were detectable in the serum as early as 7 days post immunization. By day 14, all mice secreted S. tm -surface-specific sIgA into the gut lumen. Mucosal sIgA responses were confirmed by immunohistochemistry (Fig.S2). Salmonella antigens targeted by this strong, specific humoral immune response were analyzed by Western blotting. The antibody response was indeed pathogen-specific, as Lactobacillus reuteri RR and Enterococcus faecalis, two commensals isolated from our mouse colony, were not recognized (Fig. 2D; Fig.S3). In analogy to the human infection (Fig.S4), the antibody response included sIgA recognizing the O-antigen of S. tm (protease resistant ladder-like bands in the Western blot; Fig. 2D), a highly repetitive sugar structure of the lipopolysaccharide (LPS), coating the surface of the pathogen. In contrast, the O-antigens from S. enteritidis and E. coli, which have a different sugar structure or LPS from the O-antigen deficient mutant S. tm ΔO were not recognized. In addition, antibodies to several prominent protein antigens were detected. Most of these protein antigens were conserved in different Salmonella and E. coli strains, but not in L. reuteri RR or E. faecalis. It should be noted that acute mucosal inflammation seems necessary to elicit immune responses protecting from enterocolitis. It was also shown previously, that invasive Salmonella strains triggered more potent adaptive immune responses [31]. Mice not pretreated with sm before immunization (low antigen loads, no gut inflammation), sm-treated mice immunized with S. tm avir (high antigen loads, no gut inflammation) and parenterally immunized mice (S. tm att i.v.; systemic antigen loads, no gut inflammation) did not mount detectable levels of O-antigen-specific sIgA. None of the mice were protected against wild type S. tm (S. tm wt) mediated enteropathogenesis (Fig.S5). Overall, these data demonstrated that the LPS O-antigen was the dominant protective antigen and that mice mount a robust pathogen-specific sIgA response during the first round of infection. This is in line with earlier data from studies in the mouse typhoid fever model, in chicken and data from human patients [18], [25], [26], [27] (Fig.S4). However, from these first sets of experiments we could not conclude whether pathogen-specific sIgA was sufficient for S. tm clearance from the gut. sIgA is dispensable for pathogen clearance from the gut lumen In order to address sIgA functions in pathogen clearance, we analyzed the outcome of S. tm att infection in different KO-mice lacking key mediators of functional adaptive immune responses. We determined whether T-cell dependent or -independent mucosal sIgA immune responses [30], [32], [33] were critical for termination of inflammation, pathogen clearance and protection from inflammation on re-infection. ‘Immunization-challenge’ experiments were performed on mice lacking the T-cell receptor (TCRβ−/−δ−/−; T-cell deficient), B-cells (JH −/−), IgA (IgA−/−) or sIgA and sIgM-transport into the gut lumen (pIgR−/−; Table S2). Two days after initial infection with S. tm att, all knockout mice displayed pronounced gut inflammation (data not shown) and gut inflammation subsided by day 40 (Table 1). This demonstrated that the acute mucosal inflammation can be efficiently terminated in the absence of T-cells, B-cells, antibodies or sIgA. Furthermore, several IgA−/− (3/4) and pIgR−/− (2/5) animals managed to clear S. tm att from the gut lumen by day 40 p.i. This indicated that pathogens can (at least in some cases) be cleared from the gut lumen, in the absence of pathogen-specific sIgA (and sIgM) in the gut lumen. In order to exclude differences attributable to alterations in microbiota composition between different mouse lines, we have compared the S. tm att clearance kinetics between IgA−/− and wild type littermates (IgA+/−, IgA+/−, IgA+/+; Fig. 3). This verified that kinetics of pathogen clearance was not affected by presence or absence of sIgA. 10.1371/journal.ppat.1001097.g003 Figure 3 IgA deficiency does not affect kinetics of pathogen clearance. Time course of fecal S. tm att shedding in IgA−/− and IgA-proficient littermates. IgA−/− and IgA-proficient littermates were generated by crossing IgA+/− mice in order to yield littermates with comparable gut flora. Sm-treated IgA−/− (n = 10; white diamonds), IgA+/− (n = 7; black circles) and wild type (n = 7; black circles) littermates were infected with S. tm att and fecal Salmonella shedding was monitored until day 60 post infection. Black bar: medians. Dashed line: detection limit. Statistics: comparison of fecal shedding by IgA−/− vs. IgA+/− and IgA+/+ (n.s.: p>0.05). 10.1371/journal.ppat.1001097.t001 Table 1 Primary infection with S. tm att. Mouse line * day 2 day 40 Gut inflam.‡ Cecum cont. (cfu/g) MLN (cfu/organ) Spleen (cfu/organ) Gut inflam.‡ Antibodies Serum IgG Serum IgA sIgA wt + 10 625 60 0 ++ ++ ++ TCRβ−/−/δ−/− + 6×107 40 20 2 +# +# +# JH −/−¤ + 2×103¤ 60¤ 20¤ 0¤ −¤ −¤ −¤ IgA−/− + 10≠ 1×103 20 1 ++ − − pIgR−/− + 120§ 1000 20 1 ++ ++ − *median values of 4 or 5 mice per group; pathogen loads: cfu of S. tm att. ‡ pathology score. *median values of 4 or 5 mice per group. # no LPS specific Ig. ≠ 3/4 IgA−/− mice had no detectable S. tm att in the gut lumen. § 2/5 pIgR−/− mice had no detectable S. tm att in the gut lumen. ¤ JH −/− day 60 post S. tm att immunization. Strikingly, none of the S. tm att -immunized knockout mice developed O-antigen specific antibodies and none were protected from intestinal inflammation upon challenge with S. tm wt (pathological score ≫3; Table 2 and Fig.S6). Thus, a T-cell dependent, adaptive mucosal sIgA response is essential for protection from secondary disease, but is dispensable for resolving the initial inflammatory response to S. tm att and for clearing the pathogen from the intestinal lumen. 10.1371/journal.ppat.1001097.t002 Table 2 Challenge infection (day 2 post S. tm wt challenge). Mouse line * naïve mice S. tm att immunized mice Gut inflam.‡ Cecum cont. (cfu/g) MLN (cfu/organ) Spleen (cfu/organ) Gut inflam.‡ Cecum cont. (cfu/g) MLN (cfu/organ) Spleen (cfu/organ) wt 9 2×107 4×103 240 3 3×107 7×102 60 TCRβ−/−/δ−/− 10 1×107 8×103 290 8 4×108 1×101 280 JH −/− 9 2×107 3×103 20 8 5×108 3×102 20 IgA−/− 8 8×107 4×103 30 8 2×109 1×103 20 pIgR−/− 9 2×107 2×103 20 9 1×107 2×103 20 *median values of 4 or 5 mice per group; pathogen loads: cfu of S. tm wt (B). ‡ pathology score. *median values of 4 or 5 mice per group. ¤ JH −/− day 60 post S. tm att immunization. sIgA restricts intestinal pathogen growth and mucosal access Though dispensable, our findings did not exclude that sIgA exerts an effector function [34] which contributes in some way to pathogen clearance. To identify such mechanisms, we analyzed the effects of sIgA on pathogen growth and its interaction with the host's intestinal mucosa in greater detail. First, we applied a modified ‘immunization-challenge’ protocol. Sm-treated mice were infected with S. tm att, an equivalent S. enteritidis strain (S. en att; S. enteritidis 125109 sseD; [35]) or mock. Antibody responses and S. tm att/ S. en att loads in the stool were monitored (Fig.S7 and data not shown). After 39 days, immunized mice were treated with ampicillin (elimination of microbiota and remaining S. tm att or S. en att) and challenged with a 1∶1 mixture of S. tmavir and S. enavir (ampicillin resistant; sseDinvG mutants; 200 cfu each by gavage; Table S1). These latter mutants can colonize the gut lumen of naïve mice for up to four days, remain confined to the gastrointestinal tract and they do not elicit enteropathogenesis, thus mimicking the situation in the intestines of ‘asymptomatic excretors’ [21], [24]. We decided not to use S. tm ΔO for this type of competition experiments as it displays a pronounced competitive growth defect in mice when co-infections are performed with an isogenic wild type strain [36]. In the gut lumen of S. tm att immunized mice, S. enavir out-competed S. tmavir (Fig. 4A; black symbols). In S. en att immunized mice, S. tmavir out-competed S. enavir (red symbols), and in mock-immunized mice, both strains colonized with equal efficiency. Therefore, O-antigen specific sIgA may help controlling pathogen growth or survival in the gut lumen. Furthermore, S. tmavir (but not S. enavir ) was aggregated in the gut lumen and occluded from the mucosal surface of S. tm att immunized mice (Fig. 4A, right panels). Pathogen occlusion was confirmed by assessing pathogen loads in the gut tissue of challenged mice. In S. tm att -immunized animals, S. tm wt tissue loads were 100-fold lower than in mock-immunized controls (Fig. 4B). In contrast, S. tm att immunization did not prevent the invasion of S. enteritidis. Furthermore, S. tm att immunized pIgR−/− mice, which cannot transport sIgA across the gut epithelium, failed to prevent gut tissue invasion by wt S. typhimurium into the mucosal tissue (Fig. 4B). Thus, the O-antigen-specific sIgA response conferred protection by restricting pathogen growth in the gut lumen and preventing the interaction of the pathogen with the intestinal mucosa. To some extent, this may also contribute to pathogen clearance from the gut lumen. 10.1371/journal.ppat.1001097.g004 Figure 4 Immunization elicits O-antigen specific sIgA which retards pathogen growth and restricts mucosal access. A. sIgA retards pathogen growth and occludes mucosal access. Mice were immunized with S. tm att, S. en att or mock (n = 5 per group), treated with ampicillin and challenged for 1 day with a 1∶1 mixture S. tm avir(ampR; pWKS30) and S. en avir (ampR; pM979-GFP+). None of the mice developed gut inflammation. Left panel: competitive index (c.i.) of S. tm avir(ampR; pWKS30) and S. en avir (ampR; pM979-GFP+) in the cecal lumen. Right panels: Immunofluorescence microscopy of the cecal lumen (left panels) and the mucosal surface (right panels; actin brush boarder = blue) to detect S. tm avir (red α-S. tm LPS stain) and S. en avir (green GFP+). Scale bar: 10µm. B. sIgA blocks mucosal invasion by S. tm wt. C57Bl/6 (circles) or pIgR−/− mice (diamonds) were immunized with mock (open symbols) or S. tm avir (closed symbols) and challenged with S. tm wt (black) or S. en wt (red). Bacterial loads in the cecal lumen (left) and Salmonella invasion into the mucosal tissue (middle) was determined at 2 days post challenge. Right panels: Immunofluorescence microscopy of GFP-expressing S. tm wt or S. en wt (green) in the cecal mucosa (red: ICAM-1, lamina propria; blue: Actin, epithelial brush border); White dotted lines mark epithelial-submucosal boarder; Scale bar: 10µm. Mice harboring a low-complexity microbiota become long-term asymptomatic excretors While O-antigen-specific sIgA was indispensable to prevent disease, it did not seem to be a major determinant in pathogen clearance from the gut lumen. The onset of adaptive sIgA responses and cessation of symptoms seemed to occur well ahead of S. tm att elimination from the intestines. Moreover, IgA deficiency did not affect pathogen clearance kinetics (Table 1; Fig. 3). This was different from most well studied paradigms of acute systemic infection where the onset of protective immunity coincides with declining pathogen loads. This strongly suggested that sIgA-independent mechanisms may underlie pathogen clearance from the gut. Thus, we hypothesized that the microbiota might play a crucial role in pathogen clearance. The microbiota is a dense bacterial community composed of approx. 500–1000 different species [9], [37]. It confers numerous beneficial effects to the host [38] including ‘colonization resistance’, i.e. a generalized interference with the growth of many pathogens in the gut of a naïve host [3]. Antibiotic treatment disrupts the normal microbiota, alleviates colonization resistance and constitutes a known risk factor for Salmonella infections in humans and mice [7], [9], [10], [21], [39]. Furthermore, the species composition of the microbiota - and by inference the degree of colonization resistance - can vary significantly between different individuals [40]. Therefore, the microbiota composition might explain why Salmonella shedding by ‘asymptomatic excretors’ can last for months or years. In sm-treated mice, the microbiota is transiently reduced, but rapidly returns to pretreatment community composition, re-establishes ‘colonization resistance’ and ‘asymptomatic excretion’ occurs just transiently (Fig. 1C; [21], [41]). For this reason, our original infection model was not optimally suited for dissecting the differential role of the microbiota and sIgA in pathogen clearance. To overcome this problem we used ‘L-mice’ which harbor a well defined, low complexity microbiota (L = ‘LCM mice’; [20]). L -mice are ex-germ free mice that are stably associated with the ‘Altered Schaedler Flora’ [42] comprising <20 species. The representatives with the highest abundance are ASF500 (Firmicutes; Clostridia; Clostridiales; Lachnospiraceae; unclassified_Lachnospiraceae) and ASF519 (Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Parabacteroides). Thus, the L microbiota resembles the conventional (C) microbiota of mice and men at broad lineages levels [43]. However, in spite of an equally high bacterial density as the C microbiota, the L microbiota does not confer colonization resistance [20]. Accordingly, S. tm att efficiently colonized the gut lumen of L-mice at high levels (≥108cfu/g) and elicited pronounced enteropathogenesis by day 2 p.i. even without previous antibiotic treatment (Fig. 5A). After 40 days, all immunized L-mice had resolved acute inflammation, but kept on shedding S. tm att at high levels for at least 83 days (Fig. 5A; see also below). This was not due to a defective O-antigen-specific sIgA response: sIgA responses in L-mice were as pronounced as in C-mice as indicated by the increased numbers of IgA+ cells in the cecal mucosa (Fig. 5B,C and Fig.S2) and by Western Blot analysis (Fig.S8 and Fig 2D). The strong adaptive mucosal immune response was also confirmed by gene expression profiling of the cecal mucosa (Fig. 5D, Fig.S9, Table S3). Furthermore, challenge experiments confirmed the O-antigen-specific protection from enteropathogenesis (Fig. 6). However, despite this O-antigen-specific sIgA response, high-level pathogen shedding persisted in all analyzed animals (Fig. 5A; see also below). Therefore, O-antigen-specific sIgA was insufficient for luminal S. tm att clearance. This was in line with our hypothesis that elements of the normal, complex microbiota (which is lacking in L-mice) may play a key role in terminating fecal S. tm att shedding. 10.1371/journal.ppat.1001097.g005 Figure 5 An O-antigen-specific sIgA response is insufficient to terminate S. tm att shedding by L-mice. A. L-mice do not clear S. tm att from the gut. L-mice (n = 15) were immunized with S. tm att (no streptomycin-treatment) and fecal shedding was monitored for 40 days (left panel). Gut inflammation was analyzed at day 2 and day 40 p.i. (5 mice per time point; right panel). B. and C. L-mice mount a pronounced adaptive mucosal immune response. Cecal tissue of naïve L-mice and L-mice at days two and 40 p.i. was stained by immunohistochemistry for Ly6G (granulocytes), IgA, CD4 (T-cells) and CD8 (T-cells) cellular markers. Quantitative data were from 15 randomly selected 40× high power fields (hpf) from 3–5 mice per group. Y-axis: average cell number per 40× hpf (C). D. Gene expression analysis of naïve and L-mice at day 40 post S. tm att immunization. Left: Numbers of differentially expressed genes in S. tm att-immunized L-mice. Middle: relative abundance of biological pathways significantly upregulated in S. tm att-immunized L-mice. Processes, p-values and characteristic examples are shown as a table (right). 10.1371/journal.ppat.1001097.g006 Figure 6 O-antigen specific protection from enteropathogenesis of S. tm att-immunized L-mice. S. tm att- (filled symbols) or mock- (open symbols) immunized mice were challenged for two days with S. tm wt (black; n = 10) or S. en wt (red circles; n = 5) at day 40 post immunization. Challenge-strain loads in the cecal content (left panel) and the MLN (middle panel) and cecal inflammation were assessed at day 2 post challenge. Dashed lines: detection limit or limit defining the absence of disease (pathological score ≤3). Black bars: median; *p<0.05; **p<0.005; n.s. = not significant. Right: H&E stained cross-sections of the cecum of S. tm att-immunized L-mice at day two post challenge with S. tm wt (upper panel) or S. en wt (lower panel). Scale bar: 50µm. Transfer of a complex microbiota facilitates pathogen clearance To formally define the importance of the commensal microbiota in pathogen clearance, two groups of L-mice were infected with S. tm att for 83 days. The first group was kept under strict hygiene isolation and shed high loads of S. tm att until the end of the experiment (S. tm att→L; Fig. 7A; open symbols). The second group was exposed to C microbiota at day 40 by placing C donor mice into the same cage (S. tm att→L/C; 6 independent cages). Both groups of mice developed the typical pathogen-specific, adaptive sIgA response by day 83 p.i. (Fig. 7B,C). Upon introduction of the C donor mice, fecal shedding decreased gradually and ceased in most of the S. tm att→L/C mice by day 83 (<105cfu/g; Fig. 7A; black symbols) but not in the S. tm att→L group. This suggested that pathogen clearance was mediated in some way by the complex microbiota. 10.1371/journal.ppat.1001097.g007 Figure 7 Transfer of a complex microbiota curbs S. tm att shedding by L-mice. A. Exposure to conventional gut microbiota curbs pathogen shedding by ‘asymptomatic excretors’. L-mice were immunized with S. tm att. At day 40 p.i., donor animals with a conventional microbiota (C-mice) were added to one group (black circles; n = 17; S. tm att→L/C). The other group remained under hygienic isolation (open circles; n = 10; S. tm att→L). Fecal S. tm att shedding was monitored until day 83 p.i. Right panels show gut inflammation and organ loads at day 83 p.i.. B. and C. S. tm-specific Ig-response (serum IgA and IgG; IgA in gut wash; day 83 p.i.) in S. tm att→L and S. tm att→L/C mice was determined by bacterial FACS and by immunoblot as in Fig. 2A,B. D. A complex microbiota is transferred to L-mice by day 83 p. S. tm att immunization. Collectors' curves (CD = 0.05) were created for each mouse from the total number of filtered sequences for representative mice: S. tm att→L (d.2), S. tm att→L (d.40), S. tm att→L (d.83), S. tm att→L/C and 2 C-mice (donor). E. Phylotypes of mice shown in (D) (Clustering distance CD = 0.05) were sorted according to their family taxon (family level; x-axis) and average clustering was performed on Euclidean distances calculated between abundance profiles for each mouse and every time-point sampled. Red color indicates high abundance (Log2), yellow color low abundance. In order to verify microbiota-transfer, we assessed microbiota composition using high-throughput 16S rRNA gene sequence analysis (Materials and Methods). S. tm att→L/C mice displayed significantly higher diversity than S. tm att→L mice as well as LCM mice at day 2 and 40 after S. tm att immunization (Fig. 7D). The rarefaction curves indicated that S. tm att→L/C mice had acquired a microbiota of the similar complexity as the C donor mice. This was confirmed by assessing the richness (actual diversity) of the samples by calculating the Shannon index (H) and species evenness (E) as well as the Chao1 diversity estimate (Table S4). Accordingly, the taxonomy assignment confirmed that the number of bacterial taxa in the stool increased significantly in S. tm att→L/C mice. All S. tm att→L mice carried high loads of Enterobacteriaceae (i.e. Salmonella spp., E. coli spp.; red colors) in their stools. In contrast, no Enterobacteriaceae were detected in the stools of 3 (out of 6) S. tm att→L/C mice and the remaining 3 animals carried low levels of this family (yellow colors, Fig. 7E). In addition, the microbiota composition was similar between all S. tm att→L/C animals as demonstrated by hierarchical cluster analysis of eubacterial family profiles (Fig. 7E). This indicated that pathogen displacement occurs in a reproducible, stereotypic fashion and may not result from random transfer of only few members of the conventional microbiota. Most importantly, these data demonstrate that members of the conventional microbiota can upon transfer lead to the termination of sustained pathogen shedding in L-mice. It remained unclear whether pathogen clearance was mediated directly by the microbiota or by microbiota-induced mucosal responses. Recently, it has been shown that parts of the microbiota (i.e. segmented filamentous bacteria) induce mucosal TH-17 cell responses that can protect from pathogen infection [2]. However, we did not observe differences in IL-17A or IFN gamma-producing CD4 T-cells in the MLN of S. tm att→L and S. tm att→L/C animals by day 83 (Fig. 8). Furthermore, we tested if total MLN cells obtained from S. tm att→L/C animals would, upon transfer into S. tm att→L (d.40) induce clearance of intestinal S. tm att. However, this was not the case (Fig.8). This was in line with the notion that the microbiota may directly mediate pathogen clearance. 10.1371/journal.ppat.1001097.g008 Figure 8 Adoptive transfer of MLN cells isolated form mice that cleared S. tm att from the intestine into S. tm att→L mice (day 40 p.i.) does not induce S. tm att clearance. L-mice were immunized with S. tm att. At day 40 p.i., animals with a conventional microbiota (C-mice) were added to one group (black circles; n = 5; S. tm att→L/C). The other group remained under hygienic isolation (open circles; n = 5; S. tm att→L). A. All mice were sacrificed at day 80 p.i. and S. tm att levels in the cecal content were defined. B. Total MLN cells from the two groups of mice were isolated and analyzed with respect to the expression of intracellular cytokines IL-17A (left) and IFNγ (right) in CD4+ T-cells at day 80 post S. tm att infection (% cytokine-expressing CD4+ T-cells). C. 3×107 MLN cells of ‘donor’ mice S. tm att→L and S. tm att→L/C described above were adoptively transferred into another cohort of S. tm att→L mice (day 40 p.i.: = ‘aceptors’). Fecal S. tm att shedding was followed in the ‘acceptors’ until day 80 post S. tm att infection ( = day 40 post adoptive transfer, left). S. tm att levels in the MLN of ‘acceptors’ at day 80 post S. tm att infection ( = day 40 post adoptive transfer, left). Discussion In this study, we have defined the contributions of sIgA and the microbiota in protecting the host from NTS infection. During the first encounter with the pathogen, the microbiota mediates at least two different protective functions, colonization resistance and pathogen clearance. The former is well established and prohibits the growth of diverse incoming pathogens, thus preventing colonization right on [3], [5], [20], [44]. Here, we identified pathogen clearance as a second protective function attributable to the microbiota. Pathogen clearance eliminates the pathogen from the gut lumen after an episode of acute infection, i.e. after Salmonella diarrhea. This differs from colonization resistance as the pathogen starts out at high density and the normal microbiota must re-grow from a state of depletion and disturbed composition (i.e. caused by the pathogen and the inflammatory response). Compared to the microbiota, an adaptive sIgA response mounted during the later stages of an acute infection contributes little to clearing the pathogen from the gut lumen. However, pathogen-specific sIgA protects from mucosal inflammation if the same pathogen is encountered for a second time. Thus, the microbiota and sIgA have complementary functions which jointly protect against enteropathogenic bacteria during the initial infection and subsequent exposure. How does the microbiota mediate pathogen clearance? The lack of suitable assay systems has hampered addressing this question in the past. Clearly, S. tm clearance starts out in a situation where the pathogen has grown up in the gut lumen, inflicted disease and thereby slashed microbiota density, composition and/or function [45]. This situation is gradually reversed, involving the decrease of luminal pathogen loads as well as microbiota re-growth. Finally, normal microbiota composition, density and function are restored. Conceivably, some of the mechanisms conferring colonization resistance, i.e. bacteriocin production, inhibitory metabolites, oxygen depletion, receptor blocking, stimulating mucin- or antimicrobial peptide release, stabilization of the mucosal barrier, improvement of gut motility and/or nutrient limitation [5], might also contribute to different phases of pathogen clearance. Also, microbiota-mediated stimulation of the mucosal cellular immune system may be involved [2], [46] even though TH-17 mediated responses do not seem to contribute significantly to S. tm clearance, as indicated by adoptive transfer experiments (Fig. 8). The mechanisms mediating clearance and the relative importance of the microbiota, sIgA and other mucosal immune responses may differ between different pathogens or even between different strains of a given pathogen. Identifying the commensal species (or consortia) involved and the molecular mechanisms mediating pathogen clearance will be an important task for future research. Does sIgA contribute to pathogen clearance? Pathogen specific sIgA is produced by wild type animals during the phase of pathogen clearance. O-antigen-specific sIgA led to aggregation of luminal pathogens, prevented access to the enterocyte surface and reduced net pathogen growth as indicated by a reduced competitive index. Surprisingly, this had little effect on S. tm clearance. Wild type mice and IgA−/− littermates displayed equivalent rates of pathogen clearance. Moreover, sIgA did not reduce pathogen loads in the stool, at least in the L-mice in the absence of a complex gut flora. Thus, for pathogen clearance at the end of a primary enteric S. tm infection, a pathogen-specific sIgA response is neither necessary nor sufficient. Instead, sIgA protected from mucosal inflammation upon re-infection with the same pathogen. The LPS O-antigen was the key protective antigen of this adaptive immune response. Protection was attributable to pathogen-aggregation in the gut lumen, reduced net pathogen growth and pathogen-exclusion from the epithelial surface, thus inhibiting pathogen invasion into the gut tissue. This required sIgA transport into the gut lumen, as immunized pIgR−/− mice, which fail to transport sIgA across the intestinal epithelium, have equivalent pathogen loads in the gut mucosa as non-immunized littermates or wild type animals. Thus, pathogen specific antibodies do not seem to contribute much to protection, once S. tm has breached the mucosal barrier. This is in line with earlier work on the roles of antibody responses in systemic S. tm infection [17], [47]. In conclusion, our experiments show that O-antigen-specific sIgA responses protect against Salmonella-mediated gut inflammation upon re-infection. It is interesting to consider the protective function of sIgA and the microbiota from an evolutionary perspective. The intestines of most animals are colonized by bacterial communities [43]. It seems safe to assume that microbiota have an evolutionary ancient function in protecting from infection. We speculate that this pertains to colonization resistance and to pathogen clearance and that both have evolved to provide protection against a broad range of pathogens. The elaborate adaptive immune system of modern mammals, including sIgA responses, evolved much later. It evolved in the presence of the protective functions provided by the microbiota, i.e. colonization resistance and pathogen clearance. The high efficiency of this microbiota-mediated protection may explain why sIgA responses have not evolved to affect the first round of infection with a given pathogen. This was simply not necessary. In contrast, evolving the sIgA response to protect in the case of repeated exposure to the same pathogen may have represented a strong benefit which cannot be accomplished by the microbiota. This evolutionary history may explain why the sIgA response contributes little during the primary infection. Anyhow, in modern mammals the microbiota and sIgA have quite different protective functions which complement each other during the initial- and subsequent encounters with a given pathogen. An ‘unfavorable’ microbiota composition, e.g. in L-mice, can result in long term shedding. Asymptomatic NTS excretion is also observed in humans recovering from acute diarrhea. This period of asymptomatic excretion normally lasts for two to eight weeks, but may last for more than a year in a few patients. This poses a risk of transmission. In analogy to the long term shedding by L-mice, we propose that these individuals might lack some unidentified component of the normal microbiota. In L-mice, the pathogen is cleared upon transferring microbiota from a healthy donor. This may have implications for managing human long term asymptomatic excretors. Traditionally, patients are advised to adhere to strict personal hygiene and might even be isolated in order to reduce the risk of transmission. However, at the same time this deprives the patients from exposure to conventional microbiota from healthy individuals which might enhance pathogen clearance. So far, we do not know the species of the microbiota, the cellular interactions, and molecular mechanisms explaining pathogen clearance. However, the experimental systems presented in our study may provide the tools to address these important issues. Our findings provide a basis for future research on optimal management of ‘asymptomatic excretors’, NTS vaccine development and microbiota-directed therapy for acute diarrheal NTS infections. Materials and Methods Animals Specified pathogen-free (SPF) wild type C57BL/6 mice, JH −/− [48] and IgA−/− [49], pIgR−/− [50] and TCRβ−/−δ−/− mice [51] (7–10 weeks old; all C57BL/6 background) were bred at the Rodent center HCI (RCHCI) under barrier conditions in individually ventilated cages (Ehret). IgA−/−, IgA+/− and IgA+/+ littermates were generated by crossing IgA−/− with C57BL/6 mice and breeding IgA+/−×IgA+/− animals. L-mice were generated by colonizing germfree C57BL/6 mice with the Altered Schaedler flora (ASF). Mice, housed in a bubble isolator, were inoculated at eight weeks of age by intra-gastric and intra-rectal administration of 107–108 cfu of ASF bacteria on consecutive days (www.taconic.com/library). Later, L-mice (C57BL/6 background) were maintained under barrier conditions in IVCs with autoclaved chow and autoclaved, acidified water. Mice with complex microbiota were never housed together with these in the same room to prevent contamination with additional commensal bacteria. Ethics statement All animal experiments were approved (license 201/2004 and 201/2007 Kantonales Veterinäramt Zürich) and performed according to local guidelines (TschV, Zurich) and the Swiss animal protection law (TschG). Infection experiments Salmonella infections were performed in individually ventilated cages at the RCHCI, Zurich, as previously described [52]. In brief, wild type C57BL/6 mice, JH −/−, IgA−/−, pIgR−/− and TCRβ−/−δ−/− mice were pretreated with 20mg of streptomycin (sm) by gavage. 24h later, the mice were inoculated with 5×107 CFU of S. tm att, PBS (mock) or as indicated. ‘Challenge infections’ were performed 40 days later (or as indicated). Mice were treated with ampicillin (20mg; by gavage) and 24h later infected with a dose of 200 CFU of the respective ampicillin-resistant (pM973) challenge strain. Samples of cecal tissue were cryo-embedded, and inflammation was quantified on cryosections (5 µm, cross-sectional) stained with hematoxylin and eosin (H&E). Pathogen-colonization was assessed as described, below. Histology H&E-stained cecum cryosections were scored as described, evaluating submucosal edema, PMN infiltration, goblet cells and epithelial damage yielding a total score of 0–13 points [53]. Analysis of Salmonella loads in cecal content, mesenteric lymph nodes, and spleen Mesenteric lymph nodes (MLN), spleen and liver were removed aseptically and homogenized in cold PBS (0.5% tergitol, 0.5% BSA). The cecum content was suspended in 500ìl cold PBS and bacterial loads were determined by plating on MacConkey agar plates (50 ìg ml−1 streptomycin) as described [21]. Colonization levels of the challenge strain (carrying pM973 with an antibiotic marker) and immunization strain (S. tm att; kmR) were determined by selective plating (100ìg ml−1 ampicillin or 30 ìg ml−1 kanamycin; levels of challenge strain: ampR-kmR Salmonella). For co-infection experiments shown in Fig. 2C, competitive indices were calculated according to the formula CI = ratio S. tm:S. en cecal content/ratio S. tm:S. en inoculum. Adoptive transfer of MLN cells MLN were harvested from S. tm att→L (total of 5 mice) and S. tm att→L/C (total of 5 mice) at day 80 post S. tm att infection. Single cell suspensions were prepared using 100µm cell strainers and 40µg/ml DNAse (Roche). S. tm att→L mice (at day 40 post S. tm att infection) were injected intravenously with 3×107 cells (pooled for each of the two groups) in 200µl PBS. Fecal S. tm att shedding was monitored for another 40 days. Intracellular cytokine staining Single-MLN-cell suspensions were prepared as described above. For intracellular staining of IFN-γ and IL-17A, 1×107 nucleated MLN cells were cultured for 3 h in 1 ml of RPMI 1680 supplemented with 10% heat-inactivated FCS and stimulated with PMA (5pg/ml) /ionomycin (500pg/ml). After adding 20 µg/ml brefeldin A, the cells were incubated for another 3h at 37°C. Cells were harvested and washed in ice-cold FACS buffer (PBS, 2% heat-inactivated FCS, 5 mM EDTA, and 0.02% sodium azide). Cells were resuspended in FACS buffer and stained on the surface with fluorescently-labeled antibodies for 30 min on ice. For intracellular staining of IL-17A and IFN-γ, cells were washed once and fixed/permeabilized for 10 min at room temperature using 500 µl of FIX/perm solution (FACSLyse; BD Biosciences; diluted to 2× concentration in distilled water and 0.05% Tween 20). Cells were washed once and stained with directly conjugated Abs against IFN-γ-APC (BD) and IL-17A-PE (Biolegend). Cells were then washed again and resuspended in PBS. Data were collected on a LSRII flow cytometer (BD Biosciences) and analyzed using FlowJo software (Tree Star). Gut wash The intestine was flushed with 2 ml of a washing buffer containing PBS, 0.05M EDTA pH8.0 and 66µM PMSF. Intestinal wash was briefly vortexed and centrifuged at 4°C, 30 min, 40.000 rpm (Eppendorf centrifuge). Aliquots of supernatants were stored at −80°C. Immunofluorescence microscopy Bacteria harboring pM973 (GFP expression after tissue entry) in the lamina propria and epithelium were enumerated by fluorescence microscopy as described [22] using cryo-sections of PFA-fixed cecal tissue stained with Armenian hamster anti-CD54 (clone 3E2; stains lamina propria) antibody (Becton Dickinson), Cy3-conjugated goat anti–Armenian hamster Ig (Jackson ImmunoResearch Laboratories), DAPI (stains DNA; Sigma-Aldrich), and Alexa647-conjugated phalloidin (stains polymerized actin; Fluoprobes). We evaluated three 20 µm thick sections of the cecum per mouse and plotted for each mouse the average of the three values. For detecting S. tm and S. en pM979 in the gut lumen in situ, cecal tissues were recovered and treated as described recently [54]. Briefly, the tissues were fixed in paraformaldehyde (4% in PBS, pH 7.4 over night, 4°C), washed with PBS, equilibrated in PBS (20% sucrose, 0.1% NaN3 over night, 4°C), embedded in O.C.T. (Sakura, Torrance, CA), snap-frozen in liquid nitrogen and stored at −80°C. Cryosections (7ìm) were air-dried for 2 h at room temperature, fixed in 4% paraformaldehyde (5 min), washed and blocked in 10% (w/v) normal goat serum in PBS for 1h. S. tm was detected by staining for 1h with a polyclonal rabbit á-Salmonella-O-antigen group B serum (factors 1, 4, 5 and 12, Difco; 1∶500 in PBS, 10% (w/v) goat serum) and Cy3-conjugated secondary goat-α-rabbit antibody. S. en pM979 expresses gfp under the control of a constitutive promoter and bacteria were detected in the green channel. F-Actin (epithelial brush border) was visualized by staining with Alexa-647-conjugated phalloidin, as indicated (Molecular Probes). Sections were mounted with Vectashield hard set (Vector laboratories) and sealed with nail polish. Images were recorded with a microscope (Axiovert 200; Carl Zeiss, Inc.), an Ultraview confocal head (PerkinElmer), and a krypton argon laser (643-RYB-A01; Melles Griot). Infrared, red, and green fluorescence was recorded confocally, and blue fluorescence was recorded by epifluorescence microscopy. Immunohistochemistry Frozen consecutive sections of spleen, liver, cecum, colon and small intestine (7ìm thick) were briefly fixed (10 min) in acetone and blocked for 30 min with phosphate buffered saline (PBS) containing 0.5% bovine serum albumin (BSA). Sections were then incubated with the primary antibody for 1 h at room temperature. Primary antibodies included: B220/CD45R (Pharmingen 553084; 1∶200) , CD4 (clone YTS191; 1∶200) and CD8 (clone YTS169; 1∶200) for T-cells (kindly provided by Rolf Zinkernagel; 1∶50), Ly-6G (Gr-1) for neutrophils (clone RB6-8C5; 1∶600), F4/80 for macrophages (Serotec MCAP 497; 1∶50), CD11c for dendritic cells (BD Biosciences 553800; 1∶100) and IgA (rat-anti-mouse IgA; Pharmingen 556969; clone C10-3; 1∶4000). Secondary antibodies and detection chromagens were applied and visualized using standard methods (see also [55]). Statistical analysis Statistical analysis was performed using the exact Mann-Whitney U test (Prism 4.0c). A P value of <0.05 (two tailed) was considered to be statistically significant. In mouse experiments, values were set to the minimal detectable value (10 cfu for cecum; 10 CFU for MLNs; 20 CFU for the spleen) for samples harboring “no bacteria.” Two figures (Fig. 1A and 4F) were generated using the statistical software package R. To assess the distribution of Salmonella loads in mice during the 60 day infection experiments, median and quantiles (corresponding to 0.05, 0.25, 0.75 and 0.95 probabilities) were plotted for each day or group of days. We performed a linear regression on medians and both 0.05 and 0.95 quantiles, weighted by the number of data points sampled for each day. The OTUs abundance heatmap represents the mouse normalized OTU abundances (log2) clustered by average linkage clustering on Euclidean distances. This was generated using the function ‘heatmap.2’ from the ‘gplots’ R library. Immunoblot analysis The equivalent of 1 OD600 units/ml (where OD600 is the optical density at 600nm) of liquid o.n. cultures of S. tm ΔO, Lactobacillus reuteri RR, Enterococcus faecalis, S. en wt, E.coli, S. tm wt, S. tm wt proteinase K treated (Gibco/Life Technologies; 0.4 mg/ml; 1h 57°C) or S. tm M933 was pelleted by centrifugation at 14,000×g for 2 min, and the supernatant was discarded. Cells were resuspended in Laemmli sample buffer (0.065 M Tris-HCl [pH 6.8], 2% [wt/vol] sodium dodecyl sulfate [SDS], 5% [vol/vol] β-mercaptoethanol, 10% [vol/vol] glycerol, 0.05% [wt/vol] bromophenol blue) and lysed for 5 min at 95°C. Equal amounts of the different strains and purified S. tm flagellin FliC were loaded on a 12% SDS-polyacrylamide gel and proteins wer separated by electrophoresis. Immunoblots were stained with mouse serum (diluted 1∶200 in PBS) or intestinal lavages (diluted 1∶20 in PBS) from naïve or immunized mice, goat-α-mouse-IgA HRP (Southern Biotech), goat-anti-mouse-IgG HRP (Bethyl Laboratories) and developed using an ECL kit (Amersham). The same protocol was used for the analysis of the human patient serum (dilution 1∶20 in PBS), where goat-anti-human-IgA-HRP (2050-05, NEB) and goat-anti-human-IgG-HRP (2040-05, NEB) were used as secondary antibodies. Bacterial FACS Analysis was performed as described in [30]. 3ml LB cultures were inoculated from single colonies of plated bacteria and cultured overnight at 37°C without shaking. 1ml of culture was gently pelleted for 4min at 7,000 rpm in an Eppendorf minifuge and washed 3× with sterile-filtered PBS (1% BSA, 0.05% sodium azide) before resuspending to yield a final density of 107 bacteria per ml. Mouse serum was diluted 1∶20 in PBS (1% BSA, 0.05% sodium azide) and heat-inactivated at 60°C for 30min. The serum solution was then spun at 13,000 rpm in an Eppendorf minifuge for 10min to remove any bacteria-sized contaminants and the supernatant was used to perform serial dilutions (1∶20, 1∶60, 1∶180). 25µl serum solution and 25µl bacterial suspension were mixed and incubated at 4°C for 1h. Bacteria were washed twice before resuspending in monoclonal FITC-anti-mouse IgA (559354; BD Pharmingen), PE-anti-mouse total IgG (715-116-151; Jackson Immunoresearch Europe) and APC-anti-mouse IgM (550676; BD Pharmingen). After a further hour of incubation bacteria were washed once with PBS (1% BSA, 0.05% sodium azide) and then resuspended in PBS (2% PFA) for acquisition on a FACSCalibur using FSC and SSC parameters in logarithmic mode. Data were analysed using FlowJo software (Treestar, USA). Analysis of specific IgA in intestinal lavages was achieved using an identical protocol, using a dilution of 1∶2, 1∶6 and 1∶18 of gut wash. Microarray-Analysis The cecum tissue was excised (3 biological replicates per group), washed in cold PBS, placed in 300µl RLT-buffer (RNeasy Mini Kit, Qiagen; 1% β-Mercaptoethanol) and snap-frozen in liquid nitrogen. Total RNA was extracted with the Nucleospin RNA II kit (Macherey Nagel, Germany) and prepared for hybridization as recommended by the manufacturer (Applied biosystems, USA). Briefly, 2µg of total RNA and a T7-oligo(dT) primer were used for reverse transcription. The double-stranded cDNA was purified and converted to DIG labeled-cRNA by in vitro transcription using DIG-UTP (Roche, Germany). The cRNA was purified, fragmented and hybridized on ABI Mouse Genome Survey v2.0 microarrays for 16h. The microarray was washed and incubated with anti-DIG antibodies conjugated to alkaline phosphatase and a chemiluminescent substrate. The microarrays were scanned with the Applied Biosystems 1700 chemiluminescent microarray analyzer [56]. Normalization was achieved using the NeONORM method [57]. Significance of log2 fold changes (log2Q) were determined based on a double-log normal distribution hypothesis of signal intensities using mixture ANOVA methodology [56]. A change in the gene expression profiles was considered as significant if p<0.001. Heat maps were created according to standard methods [56]. Gene Ontology (GO) annotations were analyzed using the Panther Protein Classification System (http://www.pantherdb.org). Microarray data were deposited in the publicly available database: http://mace.ihes.fr with accession number: 2947924142. Bacterial DNA extraction and 16SrRNA gene specific PCR Total DNA was extracted from cecal contents using a QIAmp DNA stool mini kit (Qiagen). Bacterial lysis was enhanced using 0.1mm glass beads in buffer ASF and a Tissuelyzer device (5 minutes, 30Hz; Qiagen). V5-V6 regions of bacterial 16S rRNA were amplified using primers B-V5 (5′ GCCTTGCCAGCCCGCTCAG ATT AGA TAC CCY GGT AGT CC 3′) and A-V6-TAGC (5′GCCTCCCTCGCGCCATCAG [TAGC] ACGAGCTGACGACARCCATG 3′). The brackets contain one of the 20 different 4-mer tag identifiers [TAGC, TCGA, TCGC, TAGA, TGCA, ATCG, AGCT, AGCG, ATCT, ACGT, GATC, GCTA, GCTC, GATA, GTCA, CAGT, CTGA, CAGA, CTGT, CGTA]. Cycling condition were as follows: 95°C, 10min; 22 cycles of (94°C, 30s; 57°C, 30s; 72°C, 30s); 72°C, 8min; 4°C, ∞; Reaction conditions (50µl) were as follows: 50ng template DNA; 50 mM KCl, 10 mM Tris-HCl pH 8.3, 1.5 mM Mg2+, 0.2mM dNTPs; 40pmol of each primer, 5U of Taq DNA polymerase (Mastertaq; Eppendorf). PCR products of different reactions were pooled, ethanol-precipitated and fragments of ∼300bp were purified by gel electrophoresis, excised and recovered using a gel-extraction kit (Machery-Nagel). Amplicon sequencing of the PCR products was performed using a 454 FLX instrument (70×70 Picotitre plate) according to the protocol recommended by the supplier (www.454.com). PCR to detect ASF bacteria in the feces was done as described in [42]. Quality filtering and reads sorting We applied quality control to 454 reads in order to avoid artificial inflation of ecosystem diversity estimates [58]. Reads containing the consensus sequence (‘ACGAGCTGACGACA[AG]CCATG’) of the V6 reverse primer were filtered with respect to their length (200nt≤length≤300nt). Quality filtering was then applied to include only sequences containing one of the exact 4nt tag sequences and displaying at maximum one ambiguous nucleotide ‘N’. The latter criterion has been reported as a good indicator of sequence quality for a single read [41]. We identified 6,754 reads with an incorrect primer sequence, 1,155 reads shorter than 200nt, 8 reads longer than 300nt and 119 reads containing more than one ‘N’. After filtering, 140,237 reads remained (out of an initial total of 149,786 raw reads) and were processed as described below for OTU definition and chimera filtering. To estimate the reliability of sample discrimination using our primer-tagging approach, we assessed the number of reads observed to have an illegitimate 4-mer tag (i.e., different from our set of 20 tags). The sequencing plate produced a total of 141,784 quality-filtered reads from which 1,547 contained an incorrect tag (1.09%). Given that 256 distinct 4-mer tags are possible and that we used only 20 of these, the majority of sequencing or primer errors in this region are detectable. Correcting for the small fraction of undetectable errors (20/256) and division by four yields an error rate of 0.296% per single nucleotide - at the position of the tag in the primer (this includes errors during primer synthesis as well as sequencing). Since most errors are actually visible as errors, the rate of unintentional ‘miscall’ of sample identity is 0.092%. Definition of OTUs To reduce computational time and complexity, we built OTUs using the complete filtered dataset covering all non-redundant reads from the 20 samples. Exactly identical sequences were represented by one representative only; after OTU computation, redundant sequences were taken into account for OTU abundance analysis. For subsequent taxonomy classification, we included additional quality-filtered 16S rRNA reference sequences, selected from the Greengenes database (http://greengenes.lbl.gov/Download/Sequence_Data/Greengenes_format/greengenes16SrRNAgenes.txt.gz, release 01-28-2009 [59]). This reference database is based on full-length non-chimerical sequences with a minimum length of 1100nt (in order to fully cover the V6 region of all entries). No archaeal sequences were included in the analysis. The alignment of non-redundant reads from all mice with the reference database was performed using the secondary-structure aware Infernal aligner (http://infernal.janelia.org/, release 1.0, [60]) and based on the 16S rRNA bacterial covariance model of the RDP database (http://rdp.cme.msu.edu/; [61]). Before defining OTUs, we first removed reference sequences for which the alignment was not successful (Infernal bit-score<0). The alignment was then processed to include an equivalent amount of information from every read. To do so, we identified the consensus reverse primer sequence of the V6 region within the aligned sequence of Escherichia coli K12, as a reference. The full alignment was then trimmed from the start position (defined by the E. coli V6 reverse primer) and ended after 200nt. This also ensured a further limitation of the effect of pyrosequencing errors by trimming the 3′ end of each read, a region which is more sequencing-error prone (the trimmed and aligned reads length ranged from 152 to 231nt) [58]. Using this alignment, OTUs were built by hierarchical cluster analysis at various distances (0.01, 0.03, 0.05 and 0.10) using the ‘complete linkage clustering’ tool of the RDP pyrosequencing pipeline http://pyro.cme.msu.edu/; [61]). Taxonomy assignment In a first step, taxonomy was inferred for all reads using the stand-alone version of the RDP classifier (http://sourceforge.net/projects/rdp-classifier, revision 2.0, [62]). Taxon-level predictions were considered reliable when supported by a minimum bootstrap value of 80%. In order to predict taxonomy for each OTU, we either used any reference sequences present within a cluster, or the taxonomy of the reads present in the cluster, as predicted by the RDP classifier. To increase the resolution of the prediction, we privileged any reference sequences over the reads. For each OTU, taxonomy was inferred by a simple majority vote: if more than half of the reference sequences (or reads) present within a cluster agreed on a taxon, the OTU was annotated according to this taxon. In case of conflicts, we assigned a consensus taxon to a higher phylogenetic level for which the majority vote condition was met. Chimera estimation Deep pyrosequencing on the 454 platform has revealed extensive microbial diversity that was previously undetected with culture-dependent methods [63]. Nevertheless, sequencing data generated from pools of PCR products have to be interpreted carefully; limitations and biases of the PCR technique have to be taken into account. This can lead to over-estimations of microbial diversity as has been recently reported [58], [64]. Moreover, during amplification, chimerical sequences can be generated. On such short sequences, recombination points (recombination can occur from an incompletely extended primer or by template-switching; [65]) are extremely difficult to detect. Recently, a new tool to filter noise and remove chimera in 454 pyrosequencing data has been published [64]. There, the authors suggest that because of sequencing errors, diversity estimates may be at least an order of magnitude too high. To our best knowledge, at the time of our analysis, there were no available tools to detect chimera within libraries of short 454 reads. Therefore, in order to detect chimeras we decided to compare taxonomies assigned to N-terminal and C-terminal read fragments using BLASTn. In order to ensure a reasonable alignment length and a relatively high identity to the matching reference sequences, we only analyzed reads for which both fragments had a minimum identity of 95% and a minimum bit-score of 150 (these cutoffs were selected heuristically). A given read was deemed chimeric when the taxonomies of the best hits of each half were clearly not congruent (i.e., differing at the phylum level). Our simple chimera detection method resulted in a slightly higher rate of detected chimera compared to the method of Quince et al., 2009 (∼4.5% compare to ∼3% in their example), suggesting that our approach is at least of comparable stringency [64]. Supporting Information Figure S1 ‘Bacterial FACS’ assay for detection of a Salmonella-surface-specific immunoglobulins (Ig). A. Experimental design for detection of bacteria specific antibodies via FACS analysis. Antibodies directed against the surface of Salmonella typhimurium in murine serum or intestinal lavage were detected by bacterial FACS (Materials and Methods for details) as described previously [30]. Overnight cultures of S. tm att (approx. 106 cfu) were first incubated with decreasing dilutions (1∶180, 1∶60, 1∶20) of serum from a naïve (upper row) or a mouse at day 21 p.i. with S. tm att (lower row). Bacteria-bound IgG was then detected using an anti-mouse-IgG-PE antibody (or α-mouse-IgA-FITC, not shown). Left panel: Bacterial gate determined in FSC/SSC (forward-/sideward scatter). Other panels: Dot plot (FSC/FL2) showing IgG-PE+ bacteria. From these data, we calculated the Salmonella-specific Ig (relative fluorescence units) (Materials and Methods; Supplemental Fig1C and Fig.2C, Fig.6B): S. tm-specific Ig = (Mean fluorescence intensity of Ig+ bacteria § ) * (% of positively stained bacteria of all bacteria*). The percentage of positively stained bacteria of all bacteria at the respective dilution is shown in each FACS plot. The mean fluorescence intensity of the Ig+ bacteria is shown below each FACS plot. B. Antibodies specifically detect S. tm but not E. coli. S. tm att or E. coli were incubated with increasing dilutions (1∶2; 1∶6; 1∶18) of an intestinal lavage obtained from a mouse at day 21 p.i. with S. tm att. Bacteria-bound IgA was detected by anti-mouse-IgA-FITC. Upper panel: Dot plot (FSC/FL1) showing IgA-FITC+ S. tm att. Lower panel: Dot plot (FSC/FL1) showing IgA-FITC+ E.coli. C. Plotting Salmonella-specific antibodies. Specific Ig (rfu) against S. tmatt (S. tmatt coated; experiment) or E. coli (E. coli coated; neg. control) for decreasing dilutions (indicated as black slope) of serum (1∶180, 1∶60, 1∶20) or gut wash (1∶18, 1∶6, 1∶2) at day 21 post S. tmatt immunization were calculated as described in Supplemental Fig.1A and plotted for indicated mice. (0.20 MB TIF) Click here for additional data file. Figure S2 Immunohistological analysis of S. tm att-immunized mice is in line with an adaptive mucosal immune response. A. Immunohistological analysis of naïve C57BL/6 mice (conventional SPF gut flora; C-mice) and of mice at day 40 p.i. with S. tm att. Quantitative data of Ly6G+, IgA+, CD4+ and CD8+ cells in cecal tissue of naïve C57BL/6 and day 40 S. tm att-immunized mice were obtained by counting 15 randomly selected 40× high power fields (hpf) from 3 mice per group. Y-axis: average cell number per 40× hpf. B. Immunohistochemical stainings for Ly6G, IgA, CD4 and CD8 markers were prepared as indicated in Materials and Methods. Scale bar = 50µm. (6.86 MB TIF) Click here for additional data file. Figure S3 S. tm att immunized mice mount an O-antigen specific IgA response. Serum from the same naïve and S. tm att infected mice (day 40 p.i.) as shown in Fig. 2D was analyzed by immunoblot against different bacterial lysates (S. tm ΔO; L. reuteri; E. faecalis; S. en wt; E. coli; S. tm att; S. tm att digested with proteinase K; S. tm M933 [no flagella, no functional TTSS]; flagellin FliC). IgA was detected with a goat-anti-mouse-IgA-HRP secondary antibody. The experiment is representative for 6 different animals. (0.68 MB TIF) Click here for additional data file. Figure S4 Serum of a S. tm infected human patient (‘asymptomatic excretor’) shows a S. tm-specific serum Ig-response. The human patient had S. typhimurium-positive stool cultures for at least two months. The specificity of the Ig-response in the serum was tested by immunoblot against different bacterial lysates (S. tm ΔO; L. reuteri; E. faecalis; S. en wt; E. coli; S. tm att; S. tm att digested with proteinase K; S. tm M933 [no flagella, no functional TTSS]; flagellin FliC). Bound human Igs were detected using anti-human-IgA-HRP and anti-human-IgG-HRP secondary antibodies. The patient serum revealed S. tm- (but not S. en)-O-antigen-specific IgA- and IgG-responses. We speculate that the E. coli-O-antigen-specific antibodies might be attributable to a previous exposure to pathogenic E. coli spp.. However, this has not been analyzed. (1.00 MB TIF) Click here for additional data file. Figure S5 An acute inflammatory response in the gut might be required for the induction of S. tm-specific mucosal IgA. Non-sm-treated S. tm att immunized, sm-treated S. tm avir-immunized as well as S. tm att i.v. immunized mice do not mount a Salmonella-specific adaptive IgA response and are not protected against colitis upon challenge with S. tm wt. A. The first group of C57BL/6 mice (n = 5; black symbols) was pretreated with sm and immunized with S. tm avir (5×107 cfu by gavage). The second group of C57BL/6 mice (n = 5; blue symbols) was not pretreated with sm and immunized with S. tm att (5×107 cfu by gavage). The third group of C57BL/6 mice (n = 5; green symbols) was not pretreated with sm and immunized with S. tm att (5×105 cfu i.v.). At day 40 p.i. mice were treated with ampicillin and orally challenged with S. tm wt (200 cfu by gavage). Mice were sacrificed at day 2 post challenge and Salmonella loads in the cecal content (left panel) and the MLN (middle panel) were determined. Cecal pathology was evaluated (right panel). B. Salmonella-specific sIgA response. Ig-specific antibody responses against different bacterial lysates (S. tm ΔO; L. reuteri; E. faecalis; S. en wt; E. coli; S. tm att; S. tm att digested with proteinase K; S. tm M933 [no flagella, no functional TTSS]; flagellin FliC) were tested by immunoblot analysis. Immunoblots were incubated with serum or gut wash of sm-treated S. tm avir immunized mice, non-sm-treated S. tm att-immunized or S. tm att i.v. immunized mice that were orally challenged with S. tm wt at day 40 p.i.. Specific antibodies were detected with an anti-mouse-IgA-HRP conjugate. (1.27 MB TIF) Click here for additional data file. Figure S6 Antibody responses of S. tm att immunized JH −/− IgA−/− , pIgR−/− and TCRβ−/−δ−/− mice. The specific Ig-response of the mice shown in Table 1 (day 40 post immunization with S. tm att) was analyzed by Western blot using different bacterial lysates (S. tm ΔO; L. reuteri; E. faecalis; S. en wt; E. coli; S. tm att; S. tm att digested with proteinase K; S. tm M933 [no flagella, no functional TTSS]; flagellin FliC). Serum or gut wash of d 40 S. tm att-immunized knockout mice was tested and specific antibodies were detected with anti-mouse-IgA-HRP or anti-mouse-IgG-HRP conjugates. Panels show specificity of Ig in serum and gut wash of the indicated knock-out mice. Slight amounts of S. tm specific sIgA were detected in the gut wash of pIgR−/− mice. We speculate that this is attributable to the 10-fold increased serum IgA levels and consequent leakage in to the gut lumen as described before [18], [66] and our own data (not shown). (3.32 MB TIF) Click here for additional data file. Figure S7 Serovar specificity of the anti-LPS antibody response. S. tm att and S. en att immunized C57BL/6 mice mount a serovar-specific serum IgA response by day 40 p.i.. Sera of Salmonella-or mock-immunized mice were analyzed by immunoblot using anti-mouse-IgA-HRP antibody against lysates of S. tm att and S.en, respectively. Left panel: Serum of a S. tm att-immunized C57BL/6 mouse. Middle panel: Serum of a S. en att-immunized C57BL/6 mouse. Right panel: Serum of a mock immunized C57BL/6 mouse. (0.32 MB TIF) Click here for additional data file. Figure S8 L-mice mount a LPS-O-antigen-specific sIgA response by day 40 post S. tm att infection. The specific Ig-response of the mice shown in Fig. 4A (day 40 post immunization with S. tm att) was analyzed by Western blot using different bacterial lysates (S. tm ΔO; L. reuteri; E. faecalis; S. en wt; E. coli; S. tm att; S. tm att digested with proteinase K; S. tm M933 [no flagella, no functional TTSS]; flagellin FliC). Serum or gut wash was tested and specific antibodies were detected with anti-mouse-IgA-HRP or anti-mouse-IgG-HRP conjugates. (0.58 MB TIF) Click here for additional data file. Figure S9 Protein association network of genes significantly upregulated in the cecal mucosa of S. tm att→L mice. Gene expression profiles of the cecal mucosa of naïve L and S. tm att→L mice (day 40) were determined using mouse gene expression microarrays (see Materials and Methods; Supplemental Table 3). Genes significantly up-regulated in the cecal mucosa of S. tm att→L mice at day 40 p.i. (compared to naïve L-mice) were determined (significance of log2 fold changes p<0.001) and protein functional interactions visualized using STRING version 8.2 [67]. Colors denote the results of unsupervised clustering of the interaction network, the resulting clusters are indicative of proteins with related biological function and are annotated by GO categories assigned to genes detected as significantly up-regulated. For some gene groups, biological functions were added to ease understanding (i.e. T-cell-mediated immunity, immunity and defense, MHCI, MHCII, B-cell and antibody mediated immunity, cytokines and chemokines). Parameters used in STRING (Active Prediction Methods (connecting lines): Text mining (yellow), Neighborhood (green), Gene Fusion (red), Co-occurrence (blue), Co-expression (black), Experiments (purple), Databases (turquoise) ; confidence score: 0.15; Network clustering (KMeans: 10). (0.42 MB TIF) Click here for additional data file. Table S1 Bacterial strains and plasmids used in this study. (0.36 MB PDF) Click here for additional data file. Table S2 Mice used in this study. (0.19 MB PDF) Click here for additional data file. Table S3 Pathways significantly over-represented among the differentially expressed genes in the cecal mucosa of L-mice at day 40 p.i. with S. tm att. (0.47 MB PDF) Click here for additional data file. Table S4 Quantitative microbiota analysis by 454 amplicon sequencing. (0.39 MB PDF) Click here for additional data file.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Defecation frequency and timing, and stool form in the general population: a prospective study.

              Because the range of bowel habits and stool types in the community is unknown we questioned 838 men and 1059 women, comprising 72.2% of a random stratified sample of the East Bristol population. Most of them kept records of three consecutive defecations, including stool form on a validated six point scale ranging from hard, round lumps to mushy. Questionnaire responses agreed moderately well with recorded data. Although the most common bowel habit was once daily this was a minority practice in both sexes; a regular 24 hour cycle was apparent in only 40% of men and 33% of women. Another 7% of men and 4% of women seemed to have a regular twice or thrice daily bowel habit. Thus most people had irregular bowels. A third of women defecated less often than daily and 1% once a week or less. Stools at the constipated end of the scale were passed more often by women than men. In women of child bearing age bowel habit and the spectrum of stool types were shifted towards constipation and irregularity compared with older women and three cases of severe slow transit constipation were discovered in young women. Otherwise age had little effect on bowel habit or stool type. Normal stool types, defined as those least likely to evoke symptoms, accounted for only 56% of all stools in women and 61% in men. Most defecations occurred in the early morning and earlier in men than in women. We conclude that conventionally normal bowel function is enjoyed by less than half the population and that, in this aspect of human physiology, younger women are especially disadvantaged.
                Bookmark

                Author and article information

                Journal
                Intest Res
                Intest Res
                IR
                Intestinal Research
                Korean Association for the Study of Intestinal Diseases
                1598-9100
                2288-1956
                July 2019
                7 February 2019
                : 17
                : 3
                : 419-426
                Affiliations
                [1 ]Department of Internal Medicine, Kosin University College of Medicine, Busan, Korea
                [2 ]Cell Biotech, Co., Ltd., Gimpo, Korea
                Author notes
                Correspondence to Jae Hyun Kim, Department of Internal Medicine, Kosin University College of Medicine, 262 Gamcheon-ro, Seo-gu, Busan 49267, Korea. Tel: +82-51-990-5061, Fax: +82-51-990-5055, E-mail: kjh8517@ 123456daum.net
                Co-Correspondence to Sanghyun Lim, Cell Biotech, Co., Ltd., 50 Aegibongro 409beon-gil, Wolgot-myeon, Gimpo 10003, Korea. Tel: +82-31-987-6205, Fax: +82-31-987-6209, E-mail: shlim@ 123456cellbiotech.com
                [*]

                These authors contributed equally to this study.

                Author information
                http://orcid.org/0000-0003-1238-3961
                http://orcid.org/0000-0002-4183-3792
                http://orcid.org/0000-0001-6657-7361
                http://orcid.org/0000-0003-3422-3766
                http://orcid.org/0000-0003-3217-5115
                http://orcid.org/0000-0002-4272-8003
                Article
                ir-2018-00149
                10.5217/ir.2018.00149
                6667361
                30704159
                95e0938f-3f81-4107-9e4b-dfb1507a741c
                © Copyright 2019. Korean Association for the Study of Intestinal Diseases. All rights reserved.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 2 November 2018
                : 20 December 2018
                : 24 December 2018
                Categories
                Original Article
                Functional Bowel Disorders

                feces,gastrointestinal microbiome,composition,distribution

                Comments

                Comment on this article