Introduction It is estimated that one third of the world's population is infected with Mycobacterium tuberculosis (M. tuberculosis), although the majority will never develop active disease. The factors that govern the development of tuberculosis disease are complex and incompletely understood. Various factors have been clearly associated with increased susceptibility to tuberculosis. HIV infection is by far the most important; it increases the lifetime risk of sub-clinical infection converting to active disease from 1 in 10 to 1 in 3  and is strongly associated with disseminated disease. Defining the contribution of host genetic polymorphisms to disease susceptibility has been more difficult. Studies have suggested polymorphisms in several genes are associated with the development of pulmonary tuberculosis. Some of the genes with polymorphisms that have been validated in multiple studies and may have an effect on gene function include solute carrier family 11, member 1 (SLC11A1, formerly NRAMP1) –, interferon gamma ,, TIRAP/MAL , P2XA7 ,, and CCL2 (or MCP-1), –. Others have shown the less common extra-pulmonary manifestations of tuberculosis may have a different host genetic susceptibility profile and have implicated various polymorphism in components of the innate host response to infection  , ,. We have recently reported associations between the development of TBM and single nucleotide polymorphisms (SNP) in the Toll-interleukin-1 receptor domain containing adaptor protein (TIRAP) and Toll-like receptor-2 (TLR-2) genes ,. However, tuberculosis disease results from the interactions between host and bacteria and there have been no studies examining the influence and relationship of both host and bacterial genotype variation on clinical disease phenotype. M. tuberculosis exhibits a clonal population structure , and therefore was regarded until recently as an organism with little relevant genetic variation . However, studies examining M. tuberculosis isolates from wider geographic distributions using whole genome scanning approaches have revealed a cladal phylogeographic distribution with significant variation between major lineages, each of which is associated with specific geographic regions , (Figure 1). The degree to which this genetic variation influences disease phenotype has been difficult to study. In vitro and in vivo models of infection have shown different genotypes of M. tuberculosis induce different patterns of host immune response –, but the relevance of these findings to human disease remains uncertain. Epidemiological studies have found some genotypes may be associated with different disease phenotypes. For example, several studies have suggested an association between mycobacterial plc gene polymorphism and disseminated extra-pulmonary disease –, but these studies have been small, retrospective, or unable to determine if differences are due to host genetic susceptibility or bacterial genetic virulence determinants. 10.1371/journal.ppat.1000034.g001 Figure 1 Mycobacterium tuberculosis lineages defined by large sequence polymorphism (LSP) analysis. The circles represent Region of difference (RD region) deleted in each lineage. Only East Asian, Indo-Oceanic and Euro-American lineages were identified in Vietnam. There has been much interest in the Beijing genotype of M. tuberculosis, which is highly prevalent in Asia and the states of the former USSR and has been responsible for outbreaks of multi-drug resistant tuberculosis in the USA ,. Animal models of infection with this genotype have suggested it leads to a hypervirulent phenotype compared with other common strains of M. tuberculosis . This behaviour has been attributed to an intact polyketide synthase (pks 15/1) gene and the production of a phenolic glycolipid (PGL) . PGL synthesis appears to attenuate the early host immune response to infection and is associated with reduced production of inflammatory cytokines (30). The ability of Beijing strains to elude the host innate immune response may explain why a recent study has found this genotype is associated with haematogenously disseminated disease . Animal infection models suggest haematogenous dissemination of infection occurs before the onset of T-cell mediated immunity  and supports the hypothesis that the ability of different strains of M. tuberculosis to produce different clinical phenotypes varies dependent upon their interaction with the host innate immune response. The study described here examined the relationship between polymorphisms in genes responsible for host innate immunity, bacterial genotype, and the development of pulmonary or meningeal tuberculosis. TBM represents the most severe form of haematogenously disseminated tuberculosis causing death or severe disability in more than half of sufferers . We demonstrate that bacterial genotype does influence disease phenotype and interactions between bacterial and host genotype further influence disease expression. Results Association between bacterial genotype and disease phenotype Spoligotyping, RFLP, and MIRU typing To investigate whether different strains of M. tuberculosis are associated with disseminated disease, we examined isolates from HIV-negative adult patients in Vietnam who either had meningeal disease (n = 187) or localized pulmonary TB (n = 237). Isolates of M. tuberculosis were collected from the CSF of patients with meningitis or the sputum of those with pulmonary TB. The median age of TBM patients was 32 years (range 15–78 years) and of pulmonary patients 36 (range 15–89) (Table 1). We then genotyped each strain by 3 standard methods: spoligotyping, RFLP, and MIRU typing. Three pulmonary isolates showed evidence of mixed culture by more than one method on repeated occasions (dual bands on LSP typing, dual peaks on MIRU, secondary banding on RFLP, for example) and were therefore excluded from further analysis. It is not known if these cases represent mixed infections or laboratory contamination but it is likely that in a sample of this size some patients would be infected with multiple strains. 234 pulmonary isolates were therefore included in all further analyses. 10.1371/journal.ppat.1000034.t001 Table 1 Demographic data for cases of TBM and pulmonary tuberculosis recruited to the study. TBM (n = 187) Pulmonary TB (n = 237) male female Male female Age group (years) 15–25 20 33 32 28 26–35 25 31 35 19 36–45 20 13 34 17 46–55 13 6 18 10 56–65 6 6 11 8 65+ 5 9 10 15 Total 89 98 140 97 Address of participants a Urban 24 20 31 21 Sub-urban 8 5 14 9 Rural (HCMC surrounds) 4 13 8 17 Rural south-East 23 22 48 22 Rural south West 30 38 39 28 Total 89 98 140 97 a Defined as the main place of residence on entry to the study. Urban addresses were those within the central districts of Ho Chi Minh City (HCMC); sub-urban addresses were those within the outer districts of HCMC; rural (HCMC surrounds) addresses were in the immediate surrounding rural districts of HCMC; the other rural addresses were defined by whether they were south east or south west of HCMC. Table 2 summarises how the methods clustered the isolates and their respective ability to discriminate between strains. Overall, 348/421 (82.7%) of isolates clustered by spoligotyping, of which 159/421 (37.8%) were ST1 or the ‘Beijing’ genotype (including variants lacking additional spacers 37–43) and 74/421 (17.6%) belonged to the Vietnam genotype, ST319 . By RFLP, the single largest cluster, the Hanoi genotype , was formed by single copy isolates, n = 119/421 (28.3%). MIRU typing clustered 57.7% (n = 243/421) of isolates. The 3 largest clusters were composed of MIRU 233325173533 (n = 28); MIRU 364225223533 (n = 20), MIRU 223325173533 (n = 15). There was no significant difference (P>0.05) between the proportions clustering in the pulmonary and meningeal tuberculosis groups by any of these three methods and no significant associations were found between any cluster and the two disease phenotypes. 10.1371/journal.ppat.1000034.t002 Table 2 Spoligotype, IS6110 RFLP and MIRU typing for all M. tuberculosis isolates in the study. Typing technique All isolates clustering (n = 421) Major clusters Median cluster size Hunter-Gaston Discrimination index a Pulmonary isolates clustering (n = 234) TBM isolates clustering (n = 187) Pulmonary TBM All Spoligotyping 348 (82.7%) ST1 (Beijing) (38%)*, 3 0.842 0.798 0.826 179 (76.5%) 144 (77%) ST319 (18%)b † RFLP 238 (56.5%) Ha Noi genotypec †(28.3%) 2 0.932 0.908 0.917 121 (51.7) 94 (50.3%) zero copy isolates (5%)d † MIRU 243 (57.7%) 233325173533 (6.6%)* 2 0.990 0.986 0.988 112 (47.9%) 99 (52.9%) 223325173533 (3.5%)* 364225223533 (4.7%)† a where N = the total number of strains in the sample population, s = the total number of types described and nj = the total number of strains belonging to the j th type . b ST319, also known as the Vietnam genotype . c The Ha Noi genotype has a single IS6110 copy and is prevalent throughout Vietnam . d M. tuberculosis isolates with no IS6110 insertion elements are relatively common in South-East Asia and have been reported in several studies of Vietnamese strains ,. * Isolates of East-Asian Genotype in the LSP typing system of Gagneux et al. . † Isolates of the Indo-Oceanic genotype in the LSP typing scheme of Gagneux et al. . LSP typing and the pks 15/17 bp deletion We next examined whether M. tuberculosis clades defined by large-sequence polymorphisms (LSPs) were associated with the clinical disease phenotype. The Indo-oceanic lineage, also known as East-African Indian (EAI) , or ancestral lineage , with RD239 deleted, represented 104/234 (44.4%) pulmonary isolates and 88/187 (47.1%) of the meningeal isolates (Table 3). The East Asian or ‘Beijing’ lineage (RD105 deleted) represented 87/234 (37.1%) of pulmonary isolates and 81/187 (43.3%) meningeal isolates. There was no significant association between either of these lineages and disease phenotype. However, we found a significant association between the Euro-American lineage and pulmonary rather than meningeal tuberculosis (13% (13/234) v.s 5.9% (7/187), Crude odds ratio for causing TBM 0.40, 95% confidence intervals 0.19–0.80, P = 0.009) (Table 3). We sequenced the pks gene codons 54 to 154 to confirm that all isolates in the Euro-American lineage were wild-type, identical to the H37Rv sequence. In addition, we sequenced the pks 15/1 gene from 12 isolates randomly selected from the RD105 and RD239 deleted clades and demonstrated all contained the identical 7 bp insertion described in HN878 ,. As expected, all RD105 or RD239 deleted isolates were subsequently shown to have the pks 7 bp insertion by MAS-PCR screening. 10.1371/journal.ppat.1000034.t003 Table 3 LSP lineages of M. tuberculosis isolates causing pulmonary and meningeal tuberculosis. Group All isolates (%) Pulmonary tuberculosis (%) TBM (%) χ2 P-value OR [95% CI]b East Asian (RD105 deleted) 168 (39.9) 87 (37.2) 81 (43.3) 1.631 0.20 1.29 [0.87–1.91] Indo-Oceanic (RD239 deleted) 192 (45.6) 104 (44.4) 88 (47.1) 0.286 0.593 1.11 [0.76–1.63] Euro-American (pks 15/1 Δ7 bp) 43 (10.2) 32 (13.7) 11 (5.9) 6.88 0.009 0.40 [0.19–0.81] Undefined a 18 (4.3) 11 (4.7) 7 (3.7) 0.232 0.629 0.79 [0.30–2.08] Total 421 (100) 234 (100) 187 (100) a Undefined isolates failed to generate a product on repeated PCR for one of the two RD regions despite generating product for other PCRs; it is likely these isolates carried additional deletions or mutations in the primer region. b Odds ratio was calculated comparing the meningeal and pulmonary proportions for each lineage. To confirm the association was not an artifact of demographic differences between the populations we performed multivariate logistic regression with genotype, disease phenotype, age, sex and the participant address (classified into 5 areas) entered into the model. Age and sex influence susceptibility to extrapulmonary tuberculosis , certain genotypes of M. tuberculosis are associated with young age in Vietnam  and analysis by residential district eliminated any potential bias in urban/rural populations of M. tuberculosis. By this analysis the Euro-American isolates were still strongly associated with pulmonary rather than meningeal disease (OR for TBM = 0.40, 95% C.I. 0.20–0.83 P = 0.013). To provide further support for the biological significance of this finding we investigated whether outcome from TBM was influenced by bacterial lineage. No deaths occurred among those infected with fully drug susceptible Euro-American isolates (n = 0/8), whereas 22.6% (27/119) of patients with susceptible isolates of Indo-Oceanic and East-Asian lineages had died by 9 months (Fisher's exact test, P = 0.201). Relationship between host and bacterial genotypes and disease phenotype The polymorphisms found in the TIRAP and TLR-2 genes and their associations with disease phenotype have been reported previously ,. In brief, we found previously that the TIRAP SNP C558T and the TLR-2 SNP T597C were associated with susceptibility to meningeal rather than pulmonary tuberculosis and this was reconfirmed in the current dataset. Therefore, we examined whether these polymorphisms were associated with infection with any particular bacterial genotype and whether the relationship influenced disease phenotype. Host genotype was available on 314 patients; TIRAP 558 genotype was defined in 313 (145 TBM, 168 pulmonary) and TLR2 597 in 306 (141 TBM, 165 pulmonary). The polymorphism frequencies and pathogen genotypes are shown in Table 4. All SNPs were in Hardy Weinberg equilibrium (HWE) in cord-blood control individuals (P≥0.05). 10.1371/journal.ppat.1000034.t004 Table 4 TLR2 T597C SNP allele and bacterial genotype frequencies: comparison with host genotype distribution in the cord blood control group. Group, lineage Allele Genotype Genotype comparison Allelic comparison T (frequency) C (frequency) TT (frequency) TC (frequency) CC (frequency) χ2 P OR (95% C.I)a χ2 P Cord blood controls 564 (0.748) 190 (0.252) 205 (0.544) 154 (0.408) 18 (0.048) 1 All isolates 428 (0.699) 184 (0.301) 153 (0.500) 122 (0.399) 31 (0.101) 7.412 0.025 1.28[1.01–1.62] 4.023 0.045 Indo-Oceanic 206 (0.725) 78 (0.275) 76 (0.535) 54 (0.380) 12 (0.085) 2.630 0.268 1.12 [0.83–1.53] 0.553 0.457 Euro-American 44 (0.710) 18 (0.290) 16 (0.516) 12 (0.387) 3 (0.097) 1.410 0.494 1.21 [0.69–2.15] 0.443 0.505 All Indo-Oceanic+Euro-American 271 (0.728) 101 (0.272) 100 (0.538) 71 (0.382) 15 (0.081) 2.532 0.282 1.11 [0.84–1.47] 0.495 0.481 East-Asian/Beijing 157 (0.654) 83 (0.346) 53 (0.442) 51 (0.425) 16 (0.133) 11.635 0.003 1.57 [1.15–2.15 8.048 0.004 TBM only All isolates 187 (0.663) 95 (0.337) 66 (0.468) 55 (0.390) 20 (0.142) 13.596 0.001 1.51 [1.12–2.03] 7.417 0.006 Indo-Oceanic+Euro-American 114 (0.704) 48 (0.296) 42 (0.519) 30 (0.370) 9 (0.111) 4.861 0.088 1.25 [0.86–1.82] 1.361 0.243 East-Asian/Beijing 73 (0.608) 47 (0.392) 24 (0.400) 25 (0.417) 11 (0.183) 16.390 0.0003 1.91 [1.28–2.86] 10.219 0.001 Pulmonary isolates All isolates 241 (0.730) 89 (0.270) 87 (0.527) 67 (0.406) 11 (0.067) 0.828 0.661 1.10 [0.81–1.47] 0.376 0.539 Indo-Oceanic+Euro-American 157 (0.748) 53 (0.252) 58 (0.552) 41 (0.390) 6 (0.057) 0.223 0.895 1.00 [0.71–1.43] 0.0001 0.991 East-Asian/Beijing 84 (0.700) 36 (0.300) 29 (0.483) 26 (0.433) 5 (0.083) 1.676 0.433 1.27 [0.82–1.47] 1.244 0.264 a OR was calculated comparing each group to the genotype/allele distribution in the cord blood controls. We analyzed the distribution of alleles and genotypes of the TB groups in comparison with the cord-blood controls (Table 4). TIRAP C558T was associated with susceptibility to TBM as previously reported OR = 2.96 [95% C.I. 1.71–5.11], however, there was no stronger association between TIRAP C558T and TB caused by any unique M. tuberculosis lineage (data not shown). As previously reported , the TLR2 T597C polymorphism was associated with all cases of tuberculosis (control vs. all isolates; OR = 1.28 [95% C.I. 1.01–1.62], P = 0.045). However, the allelic association was strongest for TB cases caused by the Beijing genotype isolates (control vs. East Asian/Beijing; OR = 1.57 [95% C.I. 1.45–2.15], P = 0.004). There was no association between the TLR2 597C polymorphism and tuberculosis caused by the Indo-Oceanic (P = 0.457) and Euro-American isolates (P = 0.505). We next examined whether clinical disease phenotype, pulmonary or meningeal disease, influenced the association between TLR2 T597C and bacterial genotype. There was no allelic association between TLR2 T597C and pulmonary TB caused by non-Beijing isolates (control vs. pulmonary non-Beijing: OR = 1.00 [95% CI 0.71–1.43] P = 0.991) or for Beijing isolates (control vs. pulmonary East-Asian/Beijing: OR = 1.27 [95% C.I. 0.82–1.47], P = 0.264) (Table 4). There was an overall association of TLR2 T597C with meningeal disease (OR = 1.51 [95% C.I. 1.12–2.03] P = 0.006) but this was not significant for meningeal disease caused by non-Beijing isolates (control vs. TBM non-Beijing OR = 1.25, [95% C.I. 0.86–1.82], P = 0.243). The strongest allelic association was between TLR2 T597C and TBM caused by Beijing genotype isolates (control vs. TBM East Asian/Beijing; OR = 1.91 [95% C.I. = 1.28–2.86], P = 0.001). On genotypic analysis this association was also highly significant (χ2 = 16.39, P = 0.0003) (Table 4). We previously used a likelihood ratio test with Bayesian Information Criterion values to determine that the association between TLR2 T597C genotypes and TB showed best fit with a dominant (comparing 597TT/TC vs. 597CC) rather than a recessive (comparing 597TT vs 597TC/CC) model . When we analyzed the association of TB caused by the Beijing lineage and TLR2 T597C using a dominant model for all types of clinical TB, we found a highly significant association (Table 5) (control vs. all East Asian/Beijing isolates: OR = 3.07 [95% C.I. 1.51–6.23], P = 0.001]. By comparison, there was no significant association between TLR2 T597C and TB aused by non-Beijing strains (control vs. all non-Beijing isolates: OR 1.75 (95% CI 0.86–3.56, P = 0.118). The association between TLR2 T597C and the Beijing strains was strongest for patients with meningeal TB (control vs. TBM East-Asian Beijing OR = 4.48 [95% C.I. 2.00–10.04], P 15 years old with a negative HIV test. All patients were followed for 9 months after the start of treatment; disability was assessed in survivors by the modified Rankin score . Adult patients with uncomplicated pulmonary tuberculosis were recruited between September 2003 and December 2004 at 5 district tuberculosis units (DTUs) from Ho Chi Minh City and the surrounding districts, chosen to represent the geographic distribution of isolates among TBM patients in order to avoid an urban/rural bias in one sample set. Cases were defined by the culture of M. tuberculosis from sputum, a chest X-ray appearance consistent with active tuberculosis without evidence of miliary or extra-pulmonary tuberculosis, and no clinical evidence of extra-pulmonary disease. As far as possible, patients were prospectively matched to TBM patients by age (+/−5 years) and district of residence, defined in five groups as: urban, sub-urban, rural (surrounding HCMC), rural south-East or rural South-West. Matched patients were recruited from a DTU within each of these districts. Gender matching was attempted but not achieved due to a larger number of men with pulmonary TB attending the DTUs. The control group comprised of 389 DNA samples extracted from the umbilical cord blood of newborn babies born at Hung Vuong Hospital, Ho Chi Minh City, in 2003. All samples came from unrelated individuals who were ethnic Vietnamese Kinh, as assessed by questionnaire. Written informed consent was obtained from each patient or an accompanying relative if the patient could not provide consent. All protocols were approved by ethical review committees at the HTD, PNT Hospital for Tuberculosis and Lung Disease, Hung Vuong Hospital and Health Services of Ho Chi Minh City in Vietnam. Ethical approval was also granted by Oxfordshire Clinical Research Ethics Committee UK, Oxford Tropical Research Ethics Committee UK, The University of Washington USA and the Western Institutional Review Board USA. Host genotyping Host genotyping and identification of TLR2 and TIRAP SNPs have been reported in detail previously ,. Briefly, polymorphisms in both genes were identified by sequencing a randomly selected sub-group of patients with TBM. All subjects were then genotyped for the designated SNPs by an allele-specific primer extension assay (MassARRAY™, Sequenom, San Diego, USA). M. tuberculosis genotyping All M. tuberculosis isolates were genotyped by four established methods: IS6110 restriction fragment length polymorphisms (RFLP) , spacer oligo-nucleotide typing (spoligotyping) , 12 allele mycobacterial interspersed repetitive unit (MIRU) typing , and large sequence polymorphisms (LSP) defined by deligotyping . RFLP has limited discrimination in low-copy number isolates ( 0.055 to remove) to identify variables associated with disease phenotype on multivariate analysis. The variables examined in the model were LSP genotype, site of TB, age, sex and residential district. For analysis of host polymorphisms, allelic and genotypic frequencies were compared between the groups using a Chi square test. We also analyzed the data with recessive and dominant models as previously described . P values of ≤0.05 were considered statistically significant.