Comparison of machine-learning algorithms efficiency to build a predictive model for mortality risk in COVID-19 hospitalized patients

authors:

Mostafa Shanbehzadeh 1, Ali Valinejadi 2, Ramin Afrah 1, Hadi KazemiArpanahi 3,*, Azam Orooji 4, Mohammadreza Kaffashian 5

Dept. of Biomedical Engineering, Faculty of Advanced Technologies, Isfahan University of Medical Sciences, Isfahan, Iran
Dept. of Health Information Technology, School of Allied Medical Sciences, Semnan University of Medical Sciences, Semnan, Iran
Dept. of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
Dept. of Advanced Technologies, Faculty of Medicine, North Khorasan University of Medical Science (NKUMS), Bojnourd, Iran
Dept. of Physiology, Faculty of Medicine, Ilam University of Medical Sciences, Ilam, Iran

how to cite: Shanbehzadeh M, Valinejadi A, Afrah R, KazemiArpanahi H, Orooji A, et al. Comparison of machine-learning algorithms efficiency to build a predictive model for mortality risk in COVID-19 hospitalized patients. koomesh. 2022;24(1):e154100. https://doi.org/10.5812/koomesh-154100.

Abstract

Introduction: The rapid worldwide outbreak of SARS-CoV-2 has posed serious and unprecedented challenges to healthcare systems in predicting disease behavior and outcomes. To address these challenges, this study aimed to develop and validate several predictive models using selected machine-learning (ML) algorithms to stratify mortality risk in hospitalized COVID-19 patients and to choose the best-performing algorithm.

Materials and Methods: Data on 1224 hospitalized patients with a laboratory-confirmed COVID-19 diagnosis were extracted from the Ilam COVID-19 registry (Ilam CoV reg) database. The most important clinical parameters for COVID-19 mortality were then identified and used as inputs to the selected ML algorithms, including K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF). Finally, the performance of the developed models was compared using several confusion-matrix-based evaluation criteria, and the most appropriate predictive model was determined.

Results: A total of 17 parameters were identified as the most influential clinical variables in COVID-19 mortality. Across the evaluation criteria, the KNN algorithm, with a precision of 94.21%, accuracy of 93.74%, recall of 100%, F-measure of 93.2%, and ROC of 92.23%, performed better than the other developed models.

Conclusion: KNN achieves a reasonable level of accuracy and certainty in predicting mortality in patients with COVID-19. It could help identify high-risk patients and inform appropriate interventions by clinicians.
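The confusion-matrix criteria named in the abstract (precision, accuracy, recall, F-measure) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label encoding (1 = died, 0 = survived), and the eight example labels are all hypothetical and chosen only to show how such criteria are typically computed from a model's predictions.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    tn: int  # predicted negative, actually negative
    fn: int  # predicted negative, actually positive


def confusion_matrix(y_true, y_pred, positive=1):
    """Count the four cells of a binary confusion matrix."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fp, tn, fn)


def metrics(cm):
    """Derive the standard criteria; a zero denominator yields 0.0."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    predicted_pos = cm.tp + cm.fp
    actual_pos = cm.tp + cm.fn
    precision = cm.tp / predicted_pos if predicted_pos else 0.0
    recall = cm.tp / actual_pos if actual_pos else 0.0
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical labels for illustration only (1 = died, 0 = survived).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
for name, value in metrics(cm).items():
    print(f"{name}: {value:.2%}")
```

To compare several algorithms in this way, the same functions would be applied to each model's predictions on a common held-out set, so that the criteria are computed on identical patients.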