INTRODUCTION Metagenomic studies on gut microbiota burst onto the scientific scene during the last decade, due to the advent of next-generation sequencing techniques. In a very short period of time, microbiologists moved from the study of single, isolated, cultivable microorganisms, specifically, those able to grow under standard laboratory conditions, to the investigation of very complex microbial communities, mainly composed of uncultivable bacteria (1, 2). The first metagenomics reports enabled an overview of the complexity of our gut microbial communities (3, 4). Further studies focused on establishing the correlation between the human gut microbiome, the collective genomes of all microbes inhabiting the gut (5), and different physiological states, including those having an influence on health. Currently, we know that the gut microbiota might affect food and drug metabolism (6), influences human behavior (7), shifts during the course of pregnancy (8), displays age-associated changes (9 – 12), and possesses distinctive features depending on geographical location (12, 13), among other features. It is also becoming clear that there is a strong link between dietary patterns and the gut microbial profile (14, 15). Furthermore, some links have been established between some disorders (for example, obesity and metabolic syndrome) and an imbalance in the gut microbial ecology, also called dysbiosis (16 – 18). Remarkably, intestinal dysbiosis has also been associated with autoimmune diseases, such as rheumatoid arthritis, type 1 diabetes, and inflammatory bowel disease (IBD) (19 – 21). Systemic lupus erythematous (SLE) is a prototypical autoimmune disease in humans that is characterized by the presence of hyperactive immune cells and aberrant antibody responses to nuclear and cytoplasmic antigens. Genetic, immunological, hormonal, and environmental factors contribute to disease susceptibility (22), and its prevalence varies greatly depending on the population under study, although a prevalence of 2 to 5 cases per 10,000 inhabitants is reportedly considered normal (23). Among the environmental factors, growing evidence suggests that molecular mimicry as a result of viral infection may contribute to the development of lupus (24). Also, some reports have highlighted intestinal infections that may ameliorate SLE symptoms (25), and a marked difference in the specificity of antibodies to bacterial DNA in healthy people and SLE patients has been indicated (26). In fact, there is early evidence of a different abundance of cultivable intestinal bacteria in SLE (27). Remarkably, it has been suggested that novel SLE biomarkers can be potentially found in the human microbiota (28). However, a study of the potential dysbiosis associated with SLE had not been tackled until now. Therefore, in this report we took advantage of next-generation sequencing techniques to explore the potential interplay of the human microbiome and SLE. We have proven, for the first time, that there is a gut microbial dysbiosis associated with SLE. RESULTS AND DISCUSSION Despite all the scientific knowledge generated in the last few years, and although few studies published so far support the dysbiosis theory as a key factor promoting chronic inflammation in autoimmune diseases (29 – 32), there is no scientific work that has taken advantage of next-generation sequencing techniques to explore the potential interplay of the human microbiome and SLE, the prototypical autoimmune disease in humans. Therefore, we designed our work with the aim of answering if there is an SLE-associated intestinal dysbiosis and, if so, which microbial population groups are related to the dysbiosis. We defined the SLE population group by considering that there is a census of about 300 SLE patients in Asturias (from a total population of about 1,000,000 inhabitants). Thus, we were able to obtain a group of SLE patients from a well-defined geographical location to compare them with a similar group of healthy controls (HC), considering factors such as sex, age, medication (absence of antibiotic, steroid, and immunological treatments during the last 6 months), medical history (presenting a wide variety of clinical SLE manifestations), duration of the disease (2 to 24 years), and absence of flares of disease activity at the time of sampling (systemic lupus erythematosus disease activity index [SLEDAI] score of ≤8 at the time of sample collection). The group of SLE patients included individuals with a large variety of symptoms (Table 1), allowing us to establish correlations between the microbial profile and SLE, which are very likely independent of a specific pattern of symptoms. This variability in the phenotype of the disease is an intrinsic characteristic of SLE (22). We also selected patients with no active disease at the time of sampling, because the clinical manifestations of the disease in this population group are not biased by the pharmacological treatment necessary to treat SLE individuals during disease relapse. Furthermore, mean dietary intakes of energy, macronutrients, micronutrients, fiber, and phyto-compounds were recorded, both from patients with SLE and healthy subjects, and we found that there was no significant difference between the 2 groups (Table 2). Also, no significant difference was found between the 2 groups regarding lifestyle-related factors (smoking, alcohol consumption, physical activity, and use of vitamin and mineral supplements [data not shown]). This reduced the possibility that our analysis was affected by factors shown to have an influence on the gut microbial profile, such as age (9), diet (15), or phenolic compound intake (33, 34). TABLE 1 Demographic, clinical, and immunological features of SLE patients Subject no. Age (yrs) Disease duration(yrs) Anti-dsDNA titer (U/ml) a Complement C3 (g/liter) a Complement C4 (g/liter) a Clinical and immunological features b SLE1 43 2 0.3 0.93 0.2 MR, PH, HD, ANA SLE2 68 3 0.7 0.96 0.18 MR, DL, PH, AR SLE4 35 4 7.7 1.53 0.22 PH, OU, RD, ANA SLE5 50 5 18 1.67 0.44 PH, OU, HD, ANA, anti-SSa SLE6 35 3 48 0.81 0.13 MR, OU, AR, HD, ANA, anti-dsDNA, anti-SSa SLE7 70 3 27 1.74 0.37 PH, OU, HD, ANA, anti-dsDNA SLE11 54 24 99.1 1.43 0.28 MR, DL, PH, AR, HD, ANA, anti-dsDNA, anti-Sm SLE12 58 6 13 0.84 0.22 DL, PH, AR, HD, ANA, anti-SSa, RF SLE13 40 6 0.6 0.83 0.25 MR, OU, ANA SLE14 40 12 4 0.92 0.18 AR, SE, RD, ANA, anti-SSb SLE15 51 24 104 0.83 0.14 MR, DL, PH, AR, SE, HD, ANA, anti-dsDNA, anti-SSa, anti-SSb, anti-Sm, anti-RNP, anti-CLP SLE16 54 24 45 1.76 0.3 PH, AR, ANA, anti-dsDNA, anti-SSa SLE17 46 13 19 0.8 0.11 MR, DL, PH, OU, AR, ANA, anti-dsDNA, anti-SSa, RF SLE18 43 12 4.1 1.04 0.16 DL, PH, OU, AR, SE, HD, ANA, anti-SSa, RF SLE19 34 4 0 1.19 0.25 MR, PH, OU, ANA SLE20 51 7 5.8 0.67 0.14 PH, OU, ANA, anti-SSa SLE21 59 11 1.2 1.16 0.17 PH, ANA, anti-dsDNA, anti-SSa, anti-CLP SLE22 64 11 4.4 1.17 0.25 MR, PH, AR, ANA, anti-dsDNA, anti-SSa SLE24 46 14 0.4 1.08 0.4 MR, PH, HD, ANA, anti-SSa, RF SLE26 46 20 38 0.89 0.18 MR, PH, OU, RD, HD, ANA, anti-dsDNA. a At the time of sampling. b Cumulatively registered. Abbreviations: ANA, antinuclear antibodies; anti-RNP, antiribonucleoprotein antibodies; anti-Sm, anti-Smith antigen antibodies; anti-CLP, anticardiolipin antibodies; RF, rheumatoid factor; AR, arthritis; DL, discoid lesions; HD, hematological disorder; MR, malar rash; OU, oral ulcers; PH, photosensitivity; RD, renal disorder; SE, serositis. TABLE 2 General characteristics and mean dietary intake of energy, macronutrients, fiber forms, and phyto-compounds in patients with SLE and healthy controls Characteristic SLE patients (n = 20) Healthy controls (n = 20) Female sex (%) 100 100 Age (yrs) 49.2 ± 10.7 b 46.9 ± 8.6 BMI (kg/m2) 26.1 ± 5.3 25.2 ± 4.2 Energy (kcal/day) 2,173.1 ± 722.4 1,875.9 ± 332.8 Lipid (g/day) a 84.5 ± 41.0 85.4 ± 20.5 MUFA (g/day) a 35.3 ± 19.7 35.7 ± 7.6 PUFA (g/day) a 17.2 ± 9.7 17.5 ± 9.4 SFA (g/day) a 24.9 ± 14.1 25.0 ± 6.0 Protein (g/day) a 104.9 ± 27.6 100.6 ± 20.9 Carbohydrates (g/day) a 205.0 ± 75.6 203.5 ± 47.0 Dietary fiber (g/day) a 24.9 ± 10.4 25.3 ± 9.1 Insoluble fiber (g/day) a 16.0 ± 8.6 16.6 ± 7.5 Soluble fiber (g/day) a 2.9 ± 1.5 2.8 ± 1.1 Total isoflavones (mg/day) a 2.4 ± 2.4 2.5 ± 2.7 Total phenolics (mg/day) a 833.2 ± 527.3 916.3 ± 437.8 a Model was adjusted for energy and BMI. PUFA, polyunsaturated fatty acids; MUFA, monounsaturated fatty acids; SFA, saturated fatty acids. b Values are means ± SD. Our work is based on 16S rRNA gene-based data for fecal microbiota and the bioinformatic analysis of the results. In a previous work (35), we optimized protocols to study the human fecal microbial population by using an Ion Torrent PGM sequencing platform. This methodology was applied in the current study, and we obtained an average of 592,305 high-quality reads per fecal sample (see Table S1 in the supplemental material). Rarefaction curves obtained by plotting the Shannon, Chao1, and phylogenetic diversity indexes against the number of sequences (see Fig. S1 in the supplemental material) showed that a large part of the diversity of the samples was detected. The microbiota composition at the phylum and family levels was obtained (Fig. 1; see also Fig. S2 in the supplemental material). Remarkably, even considering the broad heterogeneity of the clinical manifestations of SLE in the individuals under study (Table 1), we obtained a particular type of microbiota for the SLE group. In this regard, the presence of anti–double-stranded DNA (dsDNA) antibodies and other clinical data were organized in a metadata file for all the microbiota profiles. A principal component analysis (PCoA) was performed with both metadata/microbiota profiles, using the variability of the 16S rRNA gene profiling at different taxonomic levels. Sample classification according to the metadata revealed no specific clustering of the samples or correlations with the different clinical features or anti-dsDNA antibodies (data not shown). FIG 1 Aggregate microbiota composition in fecal samples from control (HC) and lupus-affected (SLE) subjects at the phylum level (a) and family level (b). In panel b, only taxonomic groups representing >0.5% are shown. In silico analysis of the sequences highlighted the key findings of our work. A high-quality filtering approach was used in order to process the Ion Torrent-generated sequencing data (see Table S1 in the supplemental material); a total of 293,436 unfiltered operational taxonomic units (OTUs) were identified by using uclust for de novo OTU picking. Based on each of five alpha-diversity measures (Chao1, PD whole tree, observed species, Shannon, and Simpson indexes), patients and controls were not significantly different (data not shown). Notably, one of the main results was the identification of a clear dysbiosis between the two study groups which was characterized by a higher relative abundance of Bacteroidetes in the SLE group. Overall, we detected a significant decrease in the Firmicutes/Bacteroidetes ratio in SLE individuals: the microbiota of SLE patients, compared with controls, had an almost-2.5-fold-decreased ratio (Fig. 2A) (P 25, and with truncation of a sequence at the first base if a low-quality rolling 10-bp window was found. Presence of homopolymers of >7 bp and sequences with mismatched primers were omitted. In order to calculate downstream diversity measures (alpha and beta diversity indices, Unifrac analysis), 16S rRNA OTUs were defined at ≥97% sequence homology by using uclust (55). Chimeric sequences were removed using ChimeraSlayer (56). Furthermore, OTUs that included less than 10 sequences were filtered using QIIME (54). All reads were classified to the lowest possible taxonomic rank by using QIIME and a reference data set from the Ribosomal Database Project (57). The sequence data features of all the samples are included in Table S1 in the supplemental material. Different alpha diversity indexes were calculated using QIIME and information from the OTU tables using the alpha_diversity.py script. The different diversity metrics were set by passing the option –s to the script. The following indexes were calculated for every sample and compared between groups by using a two-sided Student’s t test: Chao1, PD whole tree, observed species, Shannon, and Simpson. Analysis by qPCR. Quantification of total fecal bacteria, Firmicutes, and Bacteroidetes by qPCR was performed by using previously described primers and conditions (58 – 60). Analyses were done in duplicate in a 7500 Fast real-time PCR system (Applied Biosystems) using Sybr green PCR master mix (Applied Biosystems). Standard curves were made with pure cultures, grown under anaerobic conditions at 37°C, of Escherichia coli LMG 2092 in Gifu anaerobic medium (GAM; Nissui Pharmaceutical Co., Tokyo, Japan), Faecalibacterium prausnitzii DMSZ 17677 in RCM formula (Oxoid Ltd., Basingstoke, Hampshire, United Kingdom), and Bacteroides thetaiotaomicron DSMZ 2079 in GAM. Functional inference. The functionality of the different metagenomes, grouped by disease status (healthy control versus SLE), was predicted using the software PICRUSt 0.9.1 (http://picrust.github.io), which has been explained in detail elsewhere (48). In short, this software allows the prediction of functional pathways from the 16S rRNA reads. First, a collection of closed-reference OTUs was obtained from the filtered reads by using QIIME v 1.7.0 (54) and by querying the data against the IMG/GG reference collection (GreenGenes database, May 2013 version; http://greengenes.lbl.gov). Reverse-strand matching was enabled during the query, and OTUs were picked at a 97% identity. A BIOM-formatted table (biological observation matrix [61]) was obtained with the pick_closed_reference_otus.py script. This table, containing the relative abundances of the different reference OTUs in all the metagenomes, was normalized based on the predicted 16S rRNA copy number by using the script normalize_by_copy_number.py. Final functional predictions, inferred from the metagenomes, were created with the script predict_metagenomes.py. When necessary, tab-delimited tables were obtained with the script convert_biom.py. Analysis of predicted metagenomes. PICRUSt and QIIME provide a number of scripts that can be useful for analyzing both 16S rRNA gene relative abundances and the predicted metabolic data. Predicted metagenomic contents were collapsed at KEGG pathway level 3 (http://www.genome.jp/kegg/pathway.html) with the categorize_by_function.py script, and the data were analyzed statistically by using STAMP v 2.0.0 (62). STAMP allows data filtering and the application of different statistical tests and corrections, including PCoA. It also generates different graphics, including box plots, error plots, and scatter plots. Data of the KEGG pathway distributions were plotted by using the script summarize_taxa_through_plots.py. Associations of different taxonomic categories to SLE were statistically analyzed with the script otu_category_significance.py. Statistical analyses. Statistical analysis was performed using IBM-SPSS version 19.0 (IBM SPSS, Inc., Chicago, IL). For descriptive purposes, in Table 2 the mean values are presented as means ± SD on untransformed variables. Differences between SLE patients and controls were compared by using a multivariate linear model, including energy intake and BMI as covariates. Individuals were ordered according to their sequence data composition by principal component analysis using the taxonomic data at the phylum and family levels. Patterns were extracted using all the variations from the taxonomic data via an indirect method as a model and SLE as metadata. To analyze the associations between inferred metabolic pathways and SLE, metabolic pathways with very low abundance levels ( 1% are shown in the keys. Download Figure S2, DOC file, 0.1 MB Figure S3 (A) Scatter plot at the phylum level in which the relative abundances (means) of Firmicutes and Bacteroidetes in healthy controls (x axis) and SLE patients (y axis) are represented. (B) The normalized abundances of the classes Bacteroidia and Clostridia and the orders Bacteroidales and Clostridiales differed significantly between the lupus (SLE) and the healthy control (HC) 16S rRNA sequence data, as represented in the error plots. Tables with relative abundances and sequence metadata were elaborated using QIIME and analyzed using STAMP v 2.0.0. Two-sided Welch’s tests were run on every pair of means. The axes represent mean proportions of the taxa in HC (x axis) or SLE (y axis) groups. The vertical and horizontal lines in each taxa represent data spread (2nd and 98th percentiles) in the HC group (horizontal lines) and in the SLE group (vertical lines). The statistical fit of the points to the gray dashed y = x line is represented by the coefficient of determination in the upper left part of the plot (R2). The y = x line represents the boundary delimiting taxa that are enriched in the HC group (blue dots) or in the SLE group (yellow dots). Download Figure S3, DOC file, 0.3 MB Figure S4 Scatter plot at the family level in which the relative abundances (means) of the bacterial families in healthy controls (x axis) and lupus patients (y axis) are represented. Families showing the greatest relative abundances and divergence from the diagonal are labeled with arrows. Download Figure S4, DOC file, 0.2 MB Figure S5 Summaries of the communities present in the different samples analyzed in this study. Plots were obtained at four taxonomic levels: phylum (A), class (B), order (C), and family (D), and the most abundant taxons are labeled with arrows. Data were obtained using QIME v 1.7.0 with the scripts pick_de_novo_otus.py and summarize_taxa_through_plots.py. Download Figure S5, DOC file, 0.5 MB Figure S6 KEGG pathways (hierarchical level 3) for which normalized abundances differed between SLE patients and healthy controls. Functional metagenomes were predicted using PICRUSt scripts, and only pathways reaching statistically significance are shown. On the y axis, the KEGG pathways are indicated. Mean proportions for each metadata group (healthy controls versus SLE) are shown together with the confidence intervals of the data. Variations due to every single metagenome on mean proportions are shown by horizontal error bars. Download Figure S6, DOCX file, 0.2 MB Table S1 Sequence data features Table S1, PDF file, 0.04 MB. Table S2 Statistically significant differences between healthy controls and SLE patients at the phylum and family levels Table S2, PDF file, 0.4 MB.