A model for predicting death in adult patients within 10 years
https://doi.org/10.21045/2782-1676-2025-5-2-4-16
Abstract
Introduction. Identifying risk factors and predicting mortality from various causes are important problems in medicine. From a preventive perspective, it is crucial to identify patients at high risk of death, since early detection and treatment of diseases effectively increase life expectancy.
The purpose of the study: to develop a universal model for predicting death in adult patients within 10 years and, in a large contemporary cohort, to compare the predictive ability of a machine learning model (decision trees) with that of Cox regression.
Materials and methods. The data source for the study was the database of the Webiomed predictive analytics platform. The study included 1,129,268 records of 201,985 patients aged 18 years and older. Of the 177 predictive features investigated, 12 were selected for modelling through a multi-stage selection process. Two survival analysis algorithms, CoxPHFitter and RandomSurvivalForest (RSF), were used for modelling. The models were used to estimate the probability of death within 1, 3, 5 and 10 years.
Results. Both models performed well in predicting death; the best result was obtained by the RSF model. Metrics of the best model for predicting death within 10 years, with 95% CI: AUC 0.921 (0.914–0.929), accuracy 0.849 (0.84–0.858), sensitivity 0.813 (0.795–0.83), specificity 0.871 (0.859–0.882), concordance index 0.867 (0.861–0.874), positive predictive value 0.791 (0.776–0.806), negative predictive value 0.886 (0.876–0.895).
Conclusion. Machine learning models predict mortality outcomes well, demonstrating high discrimination and classification accuracy. Their use may help identify high-risk patients and inform decisions aimed at preventing death.
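For readers unfamiliar with the metrics reported in the abstract, the sketch below shows how a concordance index and the confusion-matrix metrics (sensitivity, specificity, PPV, NPV, accuracy) are conventionally computed. It is not the authors' code: the function names and all data are hypothetical toy values, pairs with tied follow-up times are skipped for simplicity, and the study itself relied on library implementations (CoxPHFitter, RandomSurvivalForest).

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the patient with the
    shorter observed survival also has the higher predicted risk.
    Simplification (assumption): pairs with equal follow-up times are skipped."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter follow-up ended in death;
        # a censored patient may have outlived the other one.
        if not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = died within the horizon)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical toy cohort: follow-up in years, event flag (1 = death observed,
# 0 = censored) and a model's predicted risk score.
times = [2, 4, 5, 7, 10, 10]
events = [1, 1, 0, 1, 0, 0]
risks = [0.9, 0.6, 0.7, 0.4, 0.2, 0.1]
c_index = concordance_index(times, events, risks)

# Hypothetical toy labels for a fixed horizon, after thresholding the risk.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)

print(f"C-index: {c_index:.3f}")
print({name: round(value, 3) for name, value in metrics.items()})
```

The concordance index uses the full time-to-event information, including censored patients, whereas the confusion-matrix metrics depend on a fixed prediction horizon and a chosen risk threshold; this is why the abstract reports both kinds of measure.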
About the Authors
A. N. Kaftanov
Russian Federation
Alexey N. Kaftanov – PhD in Medical sciences, data analyst
Petrozavodsk
A. E. Andreychenko
Russian Federation
Anna E. Andreychenko – PhD in Physics and Mathematics sciences, lead research scientist
St. Petersburg
A. D. Ermak
Russian Federation
Andrey D. Ermak – data analyst
Petrozavodsk
D. V. Gavrilov
Russian Federation
Denis V. Gavrilov – Head of the medical department
Petrozavodsk
A. V. Gusev
Russian Federation
Aleksandr V. Gusev – PhD in Engineering sciences, artificial intelligence expert
Moscow
R. E. Novitskiy
Russian Federation
Roman E. Novitskiy – General manager
Petrozavodsk
For citations:
Kaftanov A.N., Andreychenko A.E., Ermak A.D., Gavrilov D.V., Gusev A.V., Novitskiy R.E. A model for predicting death in adult patients within 10 years. Public Health. 2025;5(2):4-16. (In Russ.) https://doi.org/10.21045/2782-1676-2025-5-2-4-16