To predict end‐stage renal disease (ESRD) in patients with type 2 diabetes by using machine‐learning models with multiple baseline demographic and clinical characteristics.
In total, 11 789 patients with type 2 diabetes and nephropathy from three clinical trials, RENAAL (n = 1513), IDNT (n = 1715) and ALTITUDE (n = 8561), were included in this study. Eighteen baseline demographic and clinical characteristics were used as predictors to train machine‐learning models to predict the renal endpoint, defined as doubling of serum creatinine and/or ESRD and referred to here as ESRD. We used the area under the receiver operating characteristic curve (AUC) to assess the prediction performance of the models and compared it with that of traditional Cox proportional hazards regression and kidney failure risk equation models.
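As an illustration of the performance metric only, the AUC can be read as the probability that a randomly chosen patient who reached the endpoint receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the trial data or from the study's own code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    labels: 1 if the patient reached the endpoint, else 0.
    scores: predicted risk; a higher score means higher predicted risk.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy example: four patients, two of whom reached the endpoint.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of patients, so it serves here only to make the definition concrete; a rank‐based implementation would be the usual choice for cohorts of this size.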
The feedforward neural network model predicted ESRD with an AUC of 0.82 (0.76‐0.87), 0.81 (0.75‐0.86) and 0.84 (0.79‐0.90) in the RENAAL, IDNT and ALTITUDE trials, respectively. The model identified urinary albumin to creatinine ratio, serum albumin, uric acid and serum creatinine as important predictors and achieved state‐of‐the‐art performance for predicting long‐term ESRD.
Despite large inter‐patient variability, non‐linear machine‐learning models can be used to predict long‐term ESRD in patients with type 2 diabetes and nephropathy using baseline demographic and clinical characteristics. The proposed method has the potential to produce accurate, automated models that predict multiple outcomes and identify high‐risk patients who could benefit from therapy in clinical practice.