      Is Open Access

      Metabomatching: Using genetic association to identify metabolites in proton NMR spectroscopy

      research-article


          Abstract

          A metabolome-wide genome-wide association study (mGWAS) aims to discover the effects of genetic variants on metabolome phenotypes. Most mGWASes use as phenotypes concentrations of limited sets of metabolites that can be identified and quantified from spectral information. In contrast, in an untargeted mGWAS both identification and quantification are forgone and, instead, all measured metabolome features are tested for association with genetic variants. While the untargeted approach does not discard data that may have eluded identification, the interpretation of associated features remains a challenge. To address this issue, we developed metabomatching to identify the metabolites underlying significant associations observed in untargeted mGWASes on proton NMR metabolome data. Metabomatching capitalizes on genetic spiking, the concept that because metabolome features associated with a genetic variant tend to correspond to the peaks of the NMR spectrum of the underlying metabolite, genetic association can allow for identification. Applied to the untargeted mGWASes in the SHIP and CoLaus cohorts and using 180 reference NMR spectra of the urine metabolome database, metabomatching successfully identified the underlying metabolite in 14 of 19, and 8 of 9 associations, respectively. The accuracy and efficiency of our method make it a strong contender for facilitating or complementing metabolomics analyses in large cohorts, where the availability of genetic, or other data, enables our approach, but targeted quantification is limited.
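To make the untargeted step concrete, the sketch below tests each binned metabolome feature for association with a single genetic variant by ordinary least squares. It is a minimal illustration under stated assumptions, not the pipeline used in the study: the additive 0/1/2 genotype coding, the toy data, and the names `association_t`, `genotypes`, and `features` are hypothetical, and a real analysis would also adjust for covariates.

```python
import math
from typing import Dict, List


def association_t(genotypes: List[float], feature: List[float]) -> float:
    """Return the t-statistic of the slope from regressing one feature on genotype."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_f = sum(feature) / n
    # Sums of squares and cross-products for simple linear regression.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (f - mean_f) for g, f in zip(genotypes, feature))
    beta = sxy / sxx
    intercept = mean_f - beta * mean_g
    # Residual sum of squares gives the standard error of the slope.
    rss = sum((f - intercept - beta * g) ** 2 for g, f in zip(genotypes, feature))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta / se


# Hypothetical toy data: 10 individuals, genotype coded as minor-allele count.
genotypes = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
features: Dict[str, List[float]] = {
    # Feature intensity rises with allele count (associated).
    "bin_1.30ppm": [1.0, 1.2, 2.1, 1.9, 3.0, 3.2, 0.9, 2.0, 2.9, 2.1],
    # Feature intensity is unrelated to genotype (not associated).
    "bin_2.00ppm": [1.0, 2.0, 1.5, 0.8, 1.2, 1.9, 1.6, 1.1, 1.4, 1.7],
}

# One statistic per feature: every feature is tested, none is identified first.
stats = {name: association_t(genotypes, values) for name, values in features.items()}
```

Collecting one such statistic per feature yields an association profile along the chemical shift axis, which is the input the identification step works from.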

          Author summary

          Metabolome-wide genome-wide association studies aim to discover how genetic variation affects metabolome traits. Such studies typically follow an acquire-identify-associate procedure: metabolome data are acquired experimentally, metabolites are identified in the experimental data and their concentrations quantified, and the metabolite concentrations are tested for association with genetic variants. The untargeted approach follows instead an acquire-associate-identify procedure: the experimental data are binned into metabolome features, and the features tested directly for genetic association. When the metabolome is measured by proton NMR spectroscopy, genetically associated features tend to correspond to peaks in the NMR spectrum of the underlying metabolites. This inherent property of the untargeted approach acts as a genetic spiking which informs on the identities of involved metabolites. Metabomatching is a method that uses genetic spiking information to identify the metabolite candidates, listed in a spectral database, most likely to underlie observed feature associations. Here, we present the method and its software, and evaluate its performance.
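The idea of genetic spiking can be sketched as a ranking problem: score each database metabolite by how strongly the features near its reference peaks are associated with the variant. The code below is a simplified illustration only and is not the scoring implemented in the metabomatching software. The mean squared z-score, the matching window `delta`, the function names, and the toy reference peaks are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def match_score(shifts: List[float], z_scores: List[float],
                peaks: List[float], delta: float) -> float:
    """Mean squared z-score over features within `delta` ppm of any reference peak."""
    matched = [z * z for shift, z in zip(shifts, z_scores)
               if any(abs(shift - p) <= delta for p in peaks)]
    # A candidate with no feature near its peaks gets a neutral score of 0.
    return sum(matched) / len(matched) if matched else 0.0


def rank_candidates(shifts: List[float], z_scores: List[float],
                    references: Dict[str, List[float]],
                    delta: float = 0.03) -> List[Tuple[str, float]]:
    """Rank reference metabolites from best to worst match."""
    scored = [(name, match_score(shifts, z_scores, peaks, delta))
              for name, peaks in references.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical association profile: feature positions (ppm) and their z-scores.
shifts = [1.30, 1.35, 2.00, 2.05, 3.00, 3.05, 4.10]
z_scores = [5.5, 0.3, -0.2, 0.4, 0.1, -0.5, 6.1]

# Hypothetical reference peak lists standing in for a spectral database.
references = {
    "metabolite_A": [1.31, 4.11],
    "metabolite_B": [2.01, 3.04],
}

ranked = rank_candidates(shifts, z_scores, references)
```

In this toy profile the two strongly associated features sit at the peak positions of `metabolite_A`, so it ranks first. Averaging rather than summing keeps candidates with many peaks from being favored merely for covering more features, which is one of several reasonable design choices.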


                Author and article information

                Contributors
                Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles: Software, Validation, Writing – review & editing
                Roles: Data curation, Investigation, Writing – review & editing
                Role: Formal analysis
                Role: Resources
                Roles: Funding acquisition, Resources
                Roles: Funding acquisition, Resources
                Roles: Supervision, Writing – review & editing
                Roles: Conceptualization, Funding acquisition, Investigation, Supervision, Writing – review & editing
                Roles: Conceptualization, Funding acquisition, Project administration, Supervision, Writing – review & editing
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 1 December 2017
                Volume: 13
                Issue: 12
                Article number: e1005839
                Affiliations
                [1] Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
                [2] Swiss Institute of Bioinformatics, Lausanne, Switzerland
                [3] Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
                [4] Institute of Clinical Chemistry and Laboratory Medicine, University Medicine Greifswald, Greifswald, Germany
                [5] German Centre for Cardiovascular Research (DZHK), Partner site, Greifswald, Germany
                [6] Department of Medicine, Internal Medicine, Lausanne University Hospital (CHUV), Lausanne, Switzerland
                [7] German Center for Diabetes Research, Neuherberg, Germany
                [8] Institute of Social and Preventive Medicine, Lausanne University Hospital (CHUV), Lausanne, Switzerland
                [9] Department of Integrative Biomedical Sciences, University of Cape Town, Cape Town, South Africa
                Centre for Research and Technology-Hellas, GREECE
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0002-6713-2214
                http://orcid.org/0000-0003-2495-4020
                http://orcid.org/0000-0002-9216-8825
                http://orcid.org/0000-0002-0765-896X
                http://orcid.org/0000-0003-4193-788X
                http://orcid.org/0000-0002-6785-9034
                Article
                Manuscript number: PCOMPBIOL-D-17-00169
                DOI: 10.1371/journal.pcbi.1005839
                PMCID: PMC5711027
                PMID: 29194434
                baa833e2-570d-430b-81b2-6255f98479f1
                © 2017 Rueedi et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                : 30 January 2017
                : 23 October 2017
                Page count
                Figures: 3, Tables: 2, Pages: 17
                Funding
                Funded by: funder-id http://dx.doi.org/10.13039/501100001711, Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung;
                Award ID: 31003A-143914
                Funded by: Swiss Institute of Bioinformatics
                Funded by: SystemsX.ch
                Award ID: 51RTP0-151019
                Funded by: funder-id http://dx.doi.org/10.13039/501100006387, Fondation Leenaards;
                Funded by: funder-id http://dx.doi.org/10.13039/501100007601, Horizon 2020;
                Award ID: 654241
                Funded by: funder-id http://dx.doi.org/10.13039/501100001711, Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung;
                Award ID: 310030-152724
                This work was supported by the Leenaards Foundation (to ZK), the European Commission’s Horizon 2020 program via the PhenoMeNal project (654241 to SB), the Swiss Institute of Bioinformatics (to SB, to ZK), the Swiss National Science Foundation (31003A-143914 to ZK, 310030-152724 to SB) and SystemsX.ch (51RTP0-151019 to ZK). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences
                Biochemistry
                Metabolism
                Metabolites
                Biology and Life Sciences
                Biochemistry
                Metabolism
                Metabolomics
                Research and analysis methods
                Spectrum analysis techniques
                NMR spectroscopy
                Biology and Life Sciences
                Anatomy
                Body Fluids
                Urine
                Medicine and Health Sciences
                Anatomy
                Body Fluids
                Urine
                Biology and Life Sciences
                Physiology
                Body Fluids
                Urine
                Medicine and Health Sciences
                Physiology
                Body Fluids
                Urine
                Research and analysis methods
                Spectrum analysis techniques
                NMR spectroscopy
                Proton NMR spectroscopy
                Biology and Life Sciences
                Computational Biology
                Genome Analysis
                Genome-Wide Association Studies
                Biology and Life Sciences
                Genetics
                Genomics
                Genome Analysis
                Genome-Wide Association Studies
                Biology and Life Sciences
                Genetics
                Human Genetics
                Genome-Wide Association Studies
                Biology and Life Sciences
                Genetics
                Molecular Genetics
                Biology and Life Sciences
                Molecular Biology
                Molecular Genetics
                Physical Sciences
                Physics
                Nuclear Physics
                Nucleons
                Protons
                Custom metadata
                CoLaus association summary statistics are available for download from [https://zenodo.org/record/1039306]. SHIP association summary statistics are available for download from [https://zenodo.org/record/1040795].

                Quantitative & Systems biology
