
      An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study


          Abstract

          Aims/hypothesis

          People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well such variables predict insulin initiation or requirement and whether newly identified markers have added predictive value.

          Methods

          In two prospective cohort studies as part of IMI-RHAPSODY, we investigated whether clinical variables and three types of molecular markers (metabolites, lipids, proteins) can predict time to insulin requirement using different machine learning approaches (lasso, ridge, GRridge, random forest). Clinical variables included age, sex, HbA1c, HDL-cholesterol and C-peptide. Models were run with unpenalised clinical variables (i.e. always included in the model without weights) or penalised clinical variables, or without clinical variables. Model development was performed in one cohort and the model was applied in a second cohort. Model performance was evaluated using Harrell's C statistic.
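
          As an illustration of the performance measure named above, the following is a minimal sketch of Harrell's C statistic for right-censored time-to-event data. It is not the authors' code: the function name `harrell_c`, the toy data and the choice to skip pairs with tied observed times are assumptions made for this example only.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the individual with the
    higher predicted risk has the shorter observed time to event.

    times       -- observed follow-up time per individual
    events      -- 1 if the event (e.g. insulin requirement) occurred, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption for this sketch: pairs with tied times are skipped
        # Order the pair so that `a` is the individual with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four individuals, the third one censored.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
print(harrell_c(times, events, risk))  # 4 of 5 comparable pairs concordant -> 0.8
```

          A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the C statistics reported in the Results.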

          Results

          Of the 585 individuals from the Hoorn Diabetes Care System (DCS) cohort, 69 required insulin during follow-up (1.0–11.4 years); of the 571 individuals in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) cohort, 175 required insulin during follow-up (0.3–11.8 years). Overall, the clinical variables and proteins were selected in the different models most often, followed by the metabolites. The most frequently selected clinical variables were HbA1c (18 of the 36 models, 50%), age (15 models, 41.2%) and C-peptide (15 models, 41.2%). Base models (age, sex, BMI, HbA1c) including only clinical variables performed moderately in both the DCS discovery cohort (C statistic 0.71 [95% CI 0.64, 0.79]) and the GoDARTS replication cohort (C 0.71 [95% CI 0.69, 0.75]). A more extensive model including HDL-cholesterol and C-peptide performed better in both cohorts (DCS, C 0.74 [95% CI 0.67, 0.81]; GoDARTS, C 0.73 [95% CI 0.69, 0.77]). Two proteins, lactadherin and proto-oncogene tyrosine-protein kinase receptor, were most consistently selected and slightly improved model performance.

          Conclusions/interpretation

          Using machine learning approaches, we show that insulin requirement risk can be predicted moderately well by predominantly clinical variables. Inclusion of molecular markers improves the prognostic performance beyond that of clinical variables by up to 5%. Such prognostic models could be useful for identifying people with diabetes at high risk of progressing quickly to treatment intensification.

          Data availability

          Summary statistics of lipidomic, proteomic and metabolomic data are available from a Shiny dashboard at https://rhapdata-app.vital-it.ch.

          Supplementary Information

          The online version contains peer-reviewed but unedited supplementary material available at 10.1007/s00125-024-06105-8.


                Author and article information

                Contributors
                j.beulens@amsterdamumc.nl
                Journal
                Diabetologia
                Springer Berlin Heidelberg (Berlin/Heidelberg)
                0012-186X
                1432-0428
                19 February 2024
                2024; 67(5): 885-894
                Affiliations
                [1] Department of Epidemiology and Data Science, Amsterdam UMC, Vrije Universiteit, Amsterdam, the Netherlands (https://ror.org/05grdyy37)
                [2] Amsterdam Public Health, Amsterdam, the Netherlands (GRID grid.16872.3a; ISNI 0000 0004 0435 165X)
                [3] Amsterdam Cardiovascular Sciences, Amsterdam, the Netherlands
                [4] Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands (https://ror.org/05xvt9f17)
                [5] Population Health & Genomics, School of Medicine, University of Dundee, Dundee, UK (https://ror.org/03h2bxq36)
                [6] Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands (https://ror.org/02e2c7k09)
                [7] Vital-IT Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland (https://ror.org/002n09z45)
                [8] Department of General Practice, Amsterdam UMC, Vrije Universiteit, Amsterdam, the Netherlands (https://ror.org/05grdyy37)
                [9] CRCHUM, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada (https://ror.org/0161xgx34)
                [10] Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, London, UK (https://ror.org/041kmwe10)
                [11] Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Republic of Singapore (https://ror.org/02e7b5302)
                [12] Department of Biomedical Data Sciences, Section of Molecular Epidemiology, Leiden University Medical Center, Leiden, the Netherlands (https://ror.org/05xvt9f17)
                [13] Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands (https://ror.org/0575yy874)
                Author information
                http://orcid.org/0000-0003-0961-9152
                http://orcid.org/0000-0001-9659-2044
                http://orcid.org/0000-0003-4902-1040
                http://orcid.org/0000-0002-4311-8725
                http://orcid.org/0000-0002-5907-7219
                http://orcid.org/0000-0001-6360-0343
                http://orcid.org/0000-0003-3152-5670
                http://orcid.org/0000-0001-9237-8585
                http://orcid.org/0000-0003-4401-2938
                http://orcid.org/0000-0003-4780-8472
                http://orcid.org/0000-0002-4181-0937
                Article
                6105
                10.1007/s00125-024-06105-8
                10954972
                38374450
                50d3d2bc-276a-4a59-beac-dc5702fa93ee
                © The Author(s) 2024

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 3 July 2023
                Accepted: 5 January 2024
                Funding
                Funded by: IMI-RHAPSODY
                Award ID: 115881
                Categories
                Article
                Custom metadata
                © Springer-Verlag GmbH Germany, part of Springer Nature 2024

                Endocrinology & Diabetes
                machine learning, prediction model, progression, type 2 diabetes
