HIV infection remains a global public health challenge, with an estimated 42.3 million cumulative deaths to date. Given the heterogeneity among people living with HIV and AIDS, there is a critical need to develop robust prognostic models to predict survival and guide individualized clinical management.
We aimed to develop and externally validate a predictive model for the survival of people living with HIV and AIDS following the initiation of highly active antiretroviral therapy (HAART) in China.
We conducted a retrospective cohort study using data from the HIV and AIDS epidemic surveillance system of the National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention. The training set and the external validation set comprised people living with HIV and AIDS from the cities of Nanjing and Nantong, respectively. The prediction model was developed using a random survival forest (RSF), and its performance was compared with that of a Cox model using the integrated area under the curve (iAUC), concordance index (C index), calibration curves, integrated Brier score (iBS), and decision curve analysis.
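For readers unfamiliar with the C index, the following is a minimal sketch of Harrell's concordance index for right-censored data, written in plain Python. It is an illustration only and not the code used in this study; the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time for each subject
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the subject with
        # the shorter time actually had the event (simplification: tied
        # times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.7, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(toy_c)  # 4 of the 5 comparable pairs are ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C index values in this abstract should be read.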
A total of 8960 patients were eligible for this study: 5261 (58.72%) in the training set (mean age 32.39, SD 13.30 years; n=4891, 92.97% male patients) and 3699 (41.28%) in the external validation set (mean age 43.31, SD 14.18 years; n=3086, 83.43% male patients). The RSF model was built on the top 7 variables ranked by variable importance: hemoglobin, age at HAART initiation, infection route, white blood cell count, education level, blood glucose, and CD4 count before HAART. The RSF model performed well, with an iBS of 0.129, a C index of 0.896 (95% CI 0.885-0.906), and an iAUC of 0.917 (95% CI 0.906-0.929) in the internal validation set, and an iBS of 0.113, a C index of 0.756 (95% CI 0.730-0.782), and an iAUC of 0.750 (95% CI 0.724-0.776) in the external validation set. A benchmark Cox model fitted with the same 7 variables yielded an iBS of 0.172 and 0.115, a C index of 0.829 (95% CI 0.815-0.842) and 0.742 (95% CI 0.714-0.770), and an iAUC of 0.871 (95% CI 0.856-0.885) and 0.740 (95% CI 0.711-0.768) for the internal and external validation sets, respectively.
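To help interpret the iBS values, the sketch below computes a Brier score at a single time point and then averages it over a time grid. It deliberately assumes that no subject is censored. A real iBS for survival data additionally reweights observations to account for censoring, so this is an illustration of the idea and not the study's computation. The function names and the toy data are hypothetical.

```python
def brier_score_at(event_times, surv_probs, t):
    """Brier score at time t, ASSUMING NO CENSORING.

    event_times -- observed time of death for each subject
    surv_probs  -- predicted probability that each subject survives beyond t
    """
    total = 0.0
    for death_time, predicted in zip(event_times, surv_probs):
        alive = 1.0 if death_time > t else 0.0   # true status at time t
        total += (alive - predicted) ** 2
    return total / len(event_times)


def integrated_brier_score(event_times, surv_matrix, grid):
    """Trapezoidal average of the Brier score over an increasing time grid.

    surv_matrix[i][k] is the predicted survival probability of subject i
    at time grid[k]. The grid needs at least two time points.
    """
    scores = [
        brier_score_at(event_times, [row[k] for row in surv_matrix], t)
        for k, t in enumerate(grid)
    ]
    area = sum(
        (grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2.0
        for k in range(len(grid) - 1)
    )
    return area / (grid[-1] - grid[0])


# Hypothetical toy data: three subjects dying at times 1, 3, and 5.
toy_deaths = [1.0, 3.0, 5.0]
toy_grid = [2.0, 4.0]
perfect = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]   # always exactly right
coin_flip = [[0.5, 0.5]] * 3                     # uninformative predictions

ibs_perfect = integrated_brier_score(toy_deaths, perfect, toy_grid)
ibs_coin = integrated_brier_score(toy_deaths, coin_flip, toy_grid)
```

Lower values are better: perfect predictions score 0, and predicting a survival probability of 0.5 for everyone at every time point scores 0.25.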
A machine learning–based RSF model demonstrated promising potential for providing personalized and accurate survival predictions and effective prognostic stratification for people living with HIV and AIDS following HAART in China. The RSF model performed slightly better than the Cox model on every reported metric, with a smaller margin in external validation (C index 0.756 vs 0.742) than in internal validation (C index 0.896 vs 0.829). A web-based application of the RSF model provides a practical tool for risk assessment and clinical decision-making.