J Crit Care Excell

Investigation on Ensemble Models for Mortality Prediction in Intensive Care Unit Patients

Author(s):
Sina Moosavi Kashani 1, Hana Nazarpourfard 1, Sanaz Zargar Balaye Jame 2, Nader Markazi Moghaddam 3, Mohammad Fathi 3, 4,*
1Department of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran
2Department of Health Management and Economics, Faculty of Medicine, AjA University of Medical Sciences, Tehran, Iran
3Critical Care Quality Improvement Research Center, Shahid Modarres Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
4Department of Anesthesiology, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Journal of Critical Care Excellence: Vol. 1, issue 1; e138141
Published online: Mar 31, 2024
Article type: Research Article
Received: Mar 25, 2022
Accepted: Mar 27, 2024
How to Cite: Moosavi Kashani S, Nazarpourfard H, Zargar Balaye Jame S, Markazi Moghaddam N, Fathi M. Investigation on Ensemble Models for Mortality Prediction in Intensive Care Unit Patients. J Crit Care Excell. 2024;1(1):e138141. https://doi.org/10.5812/jcce-138141.

Abstract

Background:

The intensive care unit (ICU) is a crucial component of the hospital. Allocating resources according to the needs of patients in the ICU is vital for the quality of care. Predicting mortality in this unit can assist nurses and doctors in allocating optimal resources for patients.

Objectives:

The present study aims to compare the performance of bagging and boosting methods in predicting the mortality of patients admitted to the ICU using demographic, clinical, and laboratory information.

Methods:

Starting in February 2020, we conducted a study analyzing the demographic, clinical, and laboratory characteristics of 2,055 adult patients admitted to the ICU of a selected hospital over one year. We employed Random Forest (RF), LightGBM (LGBM), and XGBoost (XG) models to compare their accuracy in predicting outcomes. To ensure data integrity, we utilized the interquartile range (IQR) to identify and remove outliers and excluded rows with missing values. Our study also highlighted the significance of various patient characteristics on mortality rates and utilized logistic regression to calculate odds ratios with a 95% confidence interval.

Results:

The study indicated that the accuracy of the RF model is 0.91, while LGBM and XG both achieved an accuracy of 0.93. We also compared them using the receiver operating characteristic (ROC) curve, with RF (area = 0.91), LGBM (area = 0.94), and XG (area = 0.94). It can be concluded that LGBM and XG had almost the same performance.

Conclusions:

Based on the accuracy of traditional scoring methods in past studies, we found that machine learning methods have higher accuracy. In this study, the performance of ensemble models was reported to be better than individual models used in previous studies. Furthermore, when comparing ensemble methods (bagging and boosting), boosting techniques (LGBM, XG) demonstrated similar performance and were superior to the bagging strategy (RF).

1. Background

Intensive care units (ICUs) house patients in poor health who often have at least one life-threatening condition (1). The ICU provides specialized equipment and medical and nursing care, resulting in high healthcare expenditure within this unit (2, 3). Allocating resources according to the needs of patients in the ICU is essential for the quality of care (4). The prediction of mortality in the ICU has been a critical issue in medicine for decades, as it is used to prioritize patients and make critical decisions (5, 6). Most researchers predict mortality using either severity-of-illness scoring systems designed for risk estimation 24 hours after ICU admission or data-mining algorithms (7). The three major predictive scoring systems used to predict mortality in general ICU patients are the Acute Physiologic and Chronic Health Evaluation (APACHE) scoring system, the Simplified Acute Physiologic Score (SAPS), and the Mortality Prediction Model (MPM0) (8). A study comparing traditional scoring models for mortality prediction showed that the APACHE II/III scoring systems performed better than the other systems (9). Overall, previous studies have indicated that machine learning models are more accurate than traditional scoring models and that clinicians should select models that have been more extensively validated (10). Several studies have shown that ensemble models such as Random Forest (RF) and Gradient Boosting achieve higher accuracy in mortality prediction (11, 12). One proposed algorithm, based on stacked generalization (also called the stacking ensemble model), presented a heterogeneous ensemble classifier for ICU mortality prediction (13). A machine learning model was also developed to predict mortality in patients admitted to the ICU for acute gastrointestinal (GI) bleeding, a condition with a 2% - 10% mortality risk (14). A predictive model for patients with sepsis in the ICU can help physicians make optimal clinical decisions, thereby reducing the mortality rate (15). Some previous studies have demonstrated that deep-learning models can identify novel temporal data patterns predictive of ICU mortality and achieve higher accuracy in identifying patients at high risk of death (16, 17). Therefore, predicting mortality in the ICU is very important, and using machine learning models is associated with better performance.

2. Objectives

The present study aims to use ensemble models to reduce the prediction error of mortality in the ICU. We intend to compare the performance of bagging and boosting methods and predict the mortality of patients admitted to the ICU using demographic, clinical, and laboratory information.

3. Methods

3.1. Data Collection and Design

From February 2020, the demographic, clinical, and laboratory characteristics of 2,055 adult patients admitted to the ICU in one of the selected hospitals were recorded for one year (Table 1). Data were initially entered into a paper form and then into the spreadsheet of SPSS software. This study was conducted by the Critical Care Quality Improvement Research Center, Shahid Modarres Hospital. The present study was approved by medical science review boards (IR.SBMU.RETECH.REC.1402.350).

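Table 1. Patients’ Features a

| Features | Outcome: Expired (n = 865) | Outcome: Discharged (n = 1190) | P-Value |
| --- | --- | --- | --- |
| Age (y) | 61.6 ± 14.3 | 50.1 ± 14.7 | < 0.001 b |
| Receiving AB (d) | 8.6 ± 6.1 | 7.3 ± 4.5 | < 0.001 b |
| Before ICU (d) | 3.3 ± 3.1 | 1.6 ± 2.5 | < 0.001 b |
| T (min) | 36.9 ± 0.3 | 36.9 ± 0.3 | 0.96 |
| T (max) | 38.7 ± 0.7 | 38.6 ± 0.6 | 0.11 |
| BP (min) | 99.3 ± 18.5 | 92.4 ± 19.2 | < 0.001 b |
| BP (max) | 126.7 ± 21.4 | 118.6 ± 21.8 | < 0.001 b |
| PR (min) | 73.6 ± 14.0 | 72.4 ± 14.0 | 0.054 |
| PR (max) | 102.3 ± 12.4 | 101.6 ± 12.6 | 0.18 |
| RR (min) | 19.0 ± 5.3 | 19.6 ± 5.4 | 0.019 b |
| RR (max) | 28.3 ± 6.4 | 28.8 ± 6.5 | 0.079 |
| pH | 7.3 ± 0.1 | 7.3 ± 0.1 | 0.071 |
| PaO2 | 69.1 ± 22.7 | 69.2 ± 22.8 | 0.94 |
| PaCO2 | 39.8 ± 11.3 | 39.8 ± 11.4 | 0.935 |
| Na (min) | 129.4 ± 3.0 | 129.6 ± 2.9 | 0.293 |
| Na (max) | 138.8 ± 3.8 | 139.4 ± 3.3 | 0.001 b |
| BG (min) | 120.9 ± 45.5 | 93.4 ± 21.4 | < 0.001 b |
| BG (max) | 214.1 ± 86.0 | 161.6 ± 49.7 | < 0.001 b |
| Cr (min) | 1.0 ± 0.2 | 0.9 ± 0.2 | < 0.001 b |
| Cr (max) | 1.5 ± 0.7 | 1.3 ± 0.4 | < 0.001 b |
| BUN (min) | 31.0 ± 9.7 | 27.1 ± 8.4 | < 0.001 b |
| BUN (max) | 55.9 ± 28.3 | 47.7 ± 15.8 | < 0.001 b |
| UA (vol) | 2181.8 ± 740.7 | 2366.8 ± 577.8 | < 0.001 b |
| Alb | 3.3 ± 0.54 | 3.3 ± 0.55 | 0.88 |
| Bili | 1.7 ± 1.1 | 1.8 ± 1.4 | 0.067 |
| Hct (min) | 32.0 ± 4.8 | 30.3 ± 5.1 | < 0.001 b |
| Hct (max) | 40.8 ± 4.8 | 39.3 ± 5.0 | < 0.001 b |
| WBC | 9120.1 ± 3319.5 | 8735.6 ± 3008.4 | 0.007 b |
| GCS | 10.7 ± 2.5 | 10.7 ± 2.4 | 0.965 |
| FiO2 | 45.6 ± 19.5 | 47.7 ± 20.6 | 0.017 b |
| Gender | | | 0.842 |
| Female | 416 | 567 | |
| Male | 449 | 623 | |
| Nosocomial | | | < 0.001 b |
| Positive | 201 | 106 | |
| Negative | 664 | 1084 | |
| Surgery | | | < 0.001 b |
| Positive | 340 | 794 | |
| Negative | 526 | 396 | |
| Emergency surgery | | | < 0.001 b |
| Positive | 191 | 432 | |
| Negative | 674 | 758 | |
| Diabetes | | | < 0.001 b |
| Positive | 446 | 185 | |
| Negative | 419 | 1005 | |
| Chronic kidney disease | | | < 0.001 b |
| Positive | 108 | 33 | |
| Negative | 757 | 1157 | |
| Liver failure | | | 0.012 b |
| Positive | 12 | 37 | |
| Negative | 853 | 1153 | |
| Intubation | | | 0.06 |
| Positive | 355 | 538 | |
| Negative | 510 | 652 | |
| HIV | | | 0.316 |
| Positive | 1 | 4 | |
| Negative | 864 | 1186 | |
| Lymphoma | | | 0.429 |
| Positive | 14 | 25 | |
| Negative | 851 | 1165 | |
| Metastasis | | | < 0.001 b |
| Positive | 59 | 143 | |
| Negative | 806 | 1047 | |
| Leukemia | | | 0.036 b |
| Positive | 0 | 6 | |
| Negative | 865 | 1184 | |
| Immunosuppression | | | < 0.001 b |
| Positive | 175 | 120 | |
| Negative | 690 | 1070 | |
| Readmission | | | < 0.001 b |
| Positive | 329 | 222 | |
| Negative | 536 | 968 | |
| Myocardial infarction | | | < 0.001 b |
| Positive | 292 | 99 | |
| Negative | 573 | 1091 | |
| Central venous catheter line | | | < 0.001 b |
| Positive | 601 | 545 | |
| Negative | 264 | 645 | |
| Tracheostomy | | | < 0.001 b |
| Positive | 154 | 37 | |
| Negative | 711 | 1153 | |
| Nasogastric tube | | | < 0.001 b |
| Positive | 856 | 1053 | |
| Negative | 9 | 137 | |
| Packed cell | | | 0.288 |
| Positive | 216 | 322 | |
| Negative | 649 | 868 | |
| Chronic obstructive pulmonary disease | | | < 0.001 b |
| Positive | 221 | 61 | |
| Negative | 644 | 1129 | |
| Anesthetic | | | < 0.001 b |
| Positive | 754 | 872 | |
| Negative | 111 | 318 | |
| Total parenteral nutrition | | | < 0.001 b |
| Positive | 221 | 43 | |
| Negative | 644 | 1147 | |
| Alcohol | | | 0.113 |
| Positive | 41 | 40 | |
| Negative | 824 | 1150 | |
| Site | | | < 0.001 b |
| Blood | 16 | 4 | |
| Wound | 9 | 7 | |
| Urine | 65 | 40 | |
| Sputum | 111 | 55 | |
| Not infected | 664 | 1084 | |
| Pathogen | | | < 0.001 b |
| Candida | 6 | 0 | |
| Escherichia coli | 39 | 32 | |
| Acinetobacter | 44 | 30 | |
| Staphylococcus aureus | 47 | 17 | |
| Pseudomonas | 13 | 4 | |
| Klebsiella | 52 | 23 | |
| Not infected | 664 | 1084 | |
| Ward | | | < 0.001 b |
| Surgery | 142 | 409 | |
| Internal | 328 | 123 | |
| Emergency | 395 | 658 | |
| The main AB used | | | < 0.001 b |
| AB1 | - | - | |
| AB2 | - | - | |
| Reason for admission | | | < 0.001 b |
| Others | 238 | 299 | |
| Respiratory | 287 | 99 | |
| Other surgeries | 140 | 296 | |
| Trauma surgery | 92 | 337 | |
| Brain surgery | 108 | 159 | |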

Abbreviations: AB, antibiotics; ICU, intensive care unit; T, temperature; BP, blood pressure; PR (min), minimum pulse rate; PR (max), maximum pulse rate; RR (min), minimum respiration rate; RR (max), maximum respiration rate; Na (min), minimum blood sodium; Na (max), maximum blood sodium; BG, blood glucose; Cr, blood creatinine; BUN, blood urea nitrogen; UA (vol), urine volume; Alb, albumin; Bili, bilirubin; Hct (min), minimum blood hematocrit level; Hct (max), maximum blood hematocrit level; WBC, white blood cell count; GCS, Glasgow Coma Scale; FiO2, percentage of inspiratory oxygen; HIV, human immunodeficiency virus.

a Values are expressed as No. or mean ± SD.

b Statistically significant.

3.2. Data Preprocessing

The data collected from 2,055 patients had no missing or duplicate values. The target variable has two values: Class 0 refers to discharged, and class 1 refers to expired. We utilized the interquartile range (IQR) to identify and remove outliers and excluded rows with missing values. Removing the outliers changed the class frequencies of the target variable, which necessitated oversampling, because class imbalance is a serious problem in classification. The SMOTE algorithm generates synthetic sample points for the minority class, reducing the imbalance (18). The data were split into training and testing sets (80% and 20%, respectively). We used a label encoder for binary columns and one-hot encoding for columns with more than two values.
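The IQR-based outlier removal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the conventional 1.5 × IQR fences are assumed (the multiplier is not reported in the text), and the function names, column names, and toy rows are hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is the common convention and an assumption here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, columns, k=1.5):
    """Keep only rows whose value lies inside the fences for every column."""
    bounds = {c: iqr_bounds([r[c] for r in rows], k) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]


# Toy rows (hypothetical values, not patient data).
rows = [
    {"age": 45, "bun_max": 40},
    {"age": 52, "bun_max": 48},
    {"age": 60, "bun_max": 55},
    {"age": 58, "bun_max": 50},
    {"age": 63, "bun_max": 47},
    {"age": 49, "bun_max": 300},  # extreme value, expected to be dropped
]
cleaned = remove_outliers(rows, ["age", "bun_max"])
print(len(rows), "->", len(cleaned))
```

Because rows are dropped whenever any column falls outside its fences, the class balance of the remaining data can shift, which is the situation the oversampling step is meant to address.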

3.3. Models

The ensemble learning structure combines two or more classifiers instead of relying on an individual classifier, aiming to increase prediction accuracy. Beyond higher accuracy, ensembles aim to reduce bias and variance, since individual classifiers can suffer from high bias, high variance, or both (19). The popular ensemble techniques are bagging, boosting, and stacking (20):

- Bagging involves fitting many decision trees on different samples of the same dataset and averaging the predictions.

- Boosting involves adding ensemble members sequentially that correct the predictions made by prior models and output a weighted average of the predictions.

- Stacking involves fitting many different model types on the same data and using another model to learn how to best combine the predictions.

An RF algorithm is a widely used supervised machine learning algorithm for classification and regression problems. It is a classifier that builds several decision trees on various subsets of the given dataset and averages their outputs to improve predictive accuracy, which corresponds to the definition of bagging. A previous study has shown that the RF classifier has a higher classification rate than single classifiers and takes less training time than decision trees and support vector machines (21). Light GBM (LGBM) is a high-performance gradient-boosting framework that uses a tree-based learning algorithm. The LGBM splits the tree leaf-wise with the best fit, whereas other boosting algorithms like XGBoost (XG) split the tree depth-wise or level-wise rather than leaf-wise. In other words, LGBM grows trees vertically, while other algorithms grow trees horizontally. Previous studies have concluded that LGBM can significantly outperform XG in terms of computational speed, memory consumption, and accuracy (22, 23). To develop the models, we employed the default parameter settings of the RF, XG, and LGBM libraries, ensuring a standard approach to model training and evaluation.
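The contrast between bagging and boosting can be illustrated with a short sketch. It is not the study's pipeline: it runs on synthetic data from Scikit-learn's `make_classification` rather than the ICU dataset, and it uses Scikit-learn's `GradientBoostingClassifier` as a stand-in for the boosting family so that the example depends on a single library. The estimator counts (100 and 150) follow the values reported in the Results section.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study's ICU dataset).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=0
)

# 80% / 20% train-test split, as described in the preprocessing section.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    # Bagging: independent trees on bootstrap samples, predictions averaged.
    "bagging (RandomForest)": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
    # Boosting: trees added sequentially, each correcting earlier errors.
    # Stand-in for LightGBM / XGBoost, whose scikit-learn style wrappers
    # (LGBMClassifier, XGBClassifier) follow the same fit/predict pattern.
    "boosting (GradientBoosting)": GradientBoostingClassifier(
        n_estimators=150, random_state=0
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

The accuracies printed here say nothing about the ICU data; the sketch only shows how the two ensemble families are trained and compared on the same split.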

3.4. Feature Selection and Modeling

Feature selection is a necessary stage of data analysis for selecting a small set of relevant features. The RF classifier is a useful basis for wrapper algorithms that address the all-relevant feature selection problem because it provides a variable importance measure (24). We used RF feature selection to avoid overfitting the model (Figure 1). Based on expert opinion, we removed the features whose importance was less than 0.00237 and then proceeded to build the models. Additionally, we used a logistic regression model to report the odds ratio of each predictor with a 95% confidence interval, which makes the results easier for doctors to interpret.
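Importance-based filtering of this kind might look like the sketch below. Only the cutoff of 0.00237 comes from the text; the synthetic data, the feature names, and the use of Scikit-learn's impurity-based `feature_importances_` are assumptions, since the exact importance measure is not specified.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

IMPORTANCE_THRESHOLD = 0.00237  # cutoff reported in the text

# Synthetic stand-in data with hypothetical feature names.
X, y = make_classification(
    n_samples=500, n_features=30, n_informative=5, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_  # impurity-based, sums to 1

# Drop features whose importance is below the threshold.
keep = importances >= IMPORTANCE_THRESHOLD
selected = [name for name, kept in zip(feature_names, keep) if kept]
X_selected = X[:, keep]

# Rank the retained features from most to least important.
ranked = sorted(zip(importances[keep], selected), reverse=True)
for importance, name in ranked[:5]:
    print(f"{name}: {importance:.4f}")
```

Because the importances sum to 1 across features, a fixed cutoff like this removes relatively more features as the number of candidate features grows.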

Figure 1. The importance of features for mortality prediction with Random Forest (RF) feature selection

3.5. Software

In this study, we used SPSS version 22 for statistical analysis, and the machine learning models were implemented in Python with the Scikit-learn, XGBoost, and LightGBM libraries. The hardware was an Intel i5 2.53 GHz CPU with 8 GB of installed memory.

4. Results

4.1. Participants

Of the 2,055 patients, 983 were women and 1,072 were men, with a mean (SD) age of 55.93 (15.7) and 54.14 (15.4) years, respectively. In total, 865 patients died, and 1,190 were discharged. Figure 1 shows that the number of days of hospitalization before entering the ICU had the most substantial impact on the construction of the models. Table 1 shows that this variable differed significantly between the two groups (expired and discharged): Among patients who died, the difference between the first and third quartiles was five days, with a mean (SD) of 3.3 (3.1) days; in the discharged group, it was two days, with a mean (SD) of 1.6 (2.5) days. Table 1 shows the characteristics of patients in the ICU in two groups: Death and survival. Statistical tests were performed for each factor, and each of the following showed a significant difference between the groups: Age, number of days receiving antibiotics (AB), blood pressure (BP), minimum respiration rate [RR (min)], maximum blood sodium [Na (max)], blood sugar (BG), blood creatinine (Cr), blood urea nitrogen (BUN), urine volume [UA (vol)], blood hematocrit level (Hct), white blood cell count (WBC), percentage of inspiratory oxygen (FiO2), hospital infection, surgery, diabetes, chronic kidney disease, liver failure, metastasis, immunodeficiency, readmission, heart attack, chronic obstructive pulmonary disease, leukemia, tracheostomy, and reason for ICU admission.

4.2. Models Validation

We developed three mortality ensemble models: Model 1: Light GBM, model 2: XGBoost, and model 3: Random forest. After adjusting the hyperparameters, we considered 100 estimators for RF and 150 estimators for LGBM and XG. The research indicated that the accuracy of the RF model is 0.91, while LGBM and XG both achieved an accuracy of 0.93. Other evaluation criteria are reported in Table 2. We also compared them using the receiver operating characteristic (ROC) curve, with RF (area = 0.91), LGBM (area = 0.94), and XG (area = 0.94), leading to the conclusion that LGBM and XG had almost the same performance (Figure 2).

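Table 2. Evaluation Indicators

| Models | Accuracy | F-Score | Recall | Precision | Specificity |
| --- | --- | --- | --- | --- | --- |
| LGBM | 0.937 | 0.937 | 0.919 | 0.955 | 0.956 |
| XG | 0.937 | 0.936 | 0.923 | 0.950 | 0.951 |
| RF | 0.911 | 0.912 | 0.880 | 0.945 | 0.944 |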

Abbreviations: LGBM, LightGBM; XG, XGBoost; RF, Random Forest.
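For readers less familiar with the indicators in Table 2, the sketch below shows how each is derived from a binary confusion matrix, taking class 1 (expired) as the positive class. The confusion-matrix counts are hypothetical, because the study does not report them; the function names are illustrative. The final lines check that the F-scores reported for LGBM and XG are consistent with their reported precision and recall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the Table 2 indicators from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "f_score": f_score,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
    }


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Consistency check against the precision and recall reported in Table 2.
lgbm_f = round(f_score(0.955, 0.919), 3)
xg_f = round(f_score(0.950, 0.923), 3)
print(lgbm_f, xg_f)
```

The F-score is the harmonic mean of precision and recall, so it sits between the two and closer to the smaller one.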

Figure 2. The ROC curve
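The area under the ROC curve can be read as the probability that a randomly chosen expired patient receives a higher predicted risk than a randomly chosen discharged patient, with ties counted as one half. The sketch below computes it directly from that definition; the labels and scores are toy values, and the function name is illustrative rather than taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (expired), 0 for the negative class.
    scores: predicted risk; higher means more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example (not study data).
auc = roc_auc(labels=[0, 0, 1, 1], scores=[0.1, 0.4, 0.35, 0.8])
print(auc)
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; library implementations use a sorting-based computation instead.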

5. Discussion

Based on past studies conducted in the field of mortality in the ICU and the differences between ensemble models and individual models, this study aimed to compare the performance of ensemble models, particularly the bagging and boosting methods, to improve the prediction of mortality in the ICU. The study demonstrated that the performance of boosting methods is superior to bagging. One of the attractions of using ensemble models is the stacking method, as different results can be obtained by combining different classifiers. This method can be used for future studies and offers innovation. In this study, in addition to highlighting the importance of each patient’s characteristics in mortality, we used logistic regression to report the odds ratio criterion with a confidence level of 95%. The odds ratio is a statistical measure of the association between binary variables across two different groups, where one group is referred to as the independent group, while the other is the dependent group (25). This criterion is widely used in the medical community and is suitable for the interpretation of predictors (Table 3).

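Table 3. The Odds Ratio for Predictors of Mortality

| Predictors | P-Value (0.05) | Odds Ratio | 95% CI Lower | 95% CI Upper |
| --- | --- | --- | --- | --- |
| Age | 0.000 | 0.963 | 0.951 | 0.975 |
| Brain surgery | 0.304 | 1.355 | 0.759 | 2.421 |
| Trauma surgery | 0.046 | 1.870 | 1.012 | 3.456 |
| Other surgeries | 0.842 | 1.041 | 0.704 | 1.537 |
| Respiratory | 0.000 | 0.394 | 0.264 | 0.590 |
| AB (d) | 0.000 | 1.101 | 1.064 | 1.140 |
| GCS | 0.001 | 1.185 | 1.073 | 1.308 |
| Nosocomial infection | 0.030 | 1.537 | 1.044 | 2.264 |
| Emergency surgery | 0.001 | 2.501 | 1.467 | 4.263 |
| Diabetes | 0.000 | 5.492 | 3.506 | 8.604 |
| Intubation | 0.008 | 0.414 | 0.215 | 0.796 |
| Metastasis | 0.000 | 0.224 | 0.128 | 0.394 |
| Immunosuppression | 0.000 | 2.916 | 1.915 | 4.441 |
| MI | 0.010 | 1.679 | 1.130 | 2.494 |
| CVLine | 0.000 | 1.671 | 1.256 | 2.224 |
| Tracheostomy | 0.000 | 9.987 | 5.406 | 18.450 |
| COPD | 0.000 | 4.159 | 2.760 | 6.268 |
| Anesthetic | 0.000 | 4.124 | 2.909 | 5.847 |
| TPN | 0.000 | 4.357 | 2.660 | 7.139 |
| Gender (male) | 0.872 | 1.027 | 0.744 | 1.417 |
| BP (max) | 0.016 | 0.987 | 0.976 | 0.997 |
| Before ICU (d) | 0.000 | 0.879 | 0.837 | 0.923 |
| FiO2 | 0.368 | 1.005 | 0.994 | 1.018 |
| Bili | 0.001 | 1.194 | 1.074 | 1.328 |
| Readmission | 0.007 | 1.615 | 1.142 | 2.284 |
| Hct (max) | 0.047 | 0.950 | 0.903 | 0.999 |
| T (min) | 0.306 | 1.236 | 0.824 | 1.853 |
| Alb | 0.272 | 0.878 | 0.696 | 1.108 |
| BUN (min) | 0.016 | 0.974 | 0.954 | 0.995 |
| BG (min) | 0.768 | 1.001 | 0.995 | 1.007 |
| Na (min) | 0.465 | 0.984 | 0.943 | 1.027 |
| Na (max) | 0.292 | 0.980 | 0.944 | 1.017 |
| PR (min) | 0.172 | 0.994 | 0.985 | 1.003 |
| Cr (min) | 0.154 | 1.950 | 0.779 | 4.880 |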

Abbreviations: AB, antibiotics; GCS, Glasgow Coma Scale; BP, blood pressure; ICU, intensive care unit; FiO2, percentage of inspiratory oxygen; Bili, bilirubin; Hct (max), maximum blood hematocrit level; T, temperature; Alb, albumin; BUN, blood urea nitrogen; BG, blood glucose; Na (min), minimum blood sodium; Na (max), maximum blood sodium; PR (min), minimum pulse rate; Cr, blood creatinine.
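To make the odds ratio concrete, the sketch below computes a crude (unadjusted) odds ratio and its 95% confidence interval from a 2 × 2 table, using the diabetes counts in Table 1 and the standard log-based (Woolf) interval. This is an illustration of the measure, not the study's method: Table 3 reports 5.492 for diabetes from the logistic regression, which accounts for the other predictors, so the crude and adjusted values need not agree. The function and parameter names are illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a log-based (Woolf) confidence interval.

    a: exposed, event        b: unexposed, event
    c: exposed, no event     d: unexposed, no event
    z = 1.96 corresponds to a 95% confidence level.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Diabetes counts from Table 1:
# expired: 446 positive, 419 negative; discharged: 185 positive, 1005 negative.
crude_or, lower, upper = odds_ratio_ci(a=446, b=419, c=185, d=1005)
print(f"crude OR = {crude_or:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 indicates an association that is statistically significant at the 5% level, which is how the P-value and CI columns of Table 3 relate to each other.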

This study identified which characteristics of patients in the ICU have a significant relationship with mortality. Patients whose reason for referral was trauma surgery had a lower mortality risk, whereas patients with respiratory problems were at higher risk of mortality. Factors such as age, high blood pressure, blood urea nitrogen, the number of days receiving antibiotics, readmission to the ICU, and the number of days of hospital stay before entering the ICU were directly related to increased mortality risk. This study also showed that although intubated patients were less prone to mortality, those with a tracheostomy were more prone to it. Nosocomial infection is another factor influencing the death rate in the ICU and has a direct relationship with mortality. The GCS criterion has an inverse relationship with mortality; these relationships are clinically acceptable. Our sample size was sufficient only to detect large associations as statistically significant. The purpose of developing predictive models in machine learning is to aid decision-making, and the more accurate a model’s performance, the more reliable it is. This study sought to improve the prediction of mortality in the ICU by using ensemble models.

5.1. Conclusions

Based on the accuracy of traditional scoring methods in past studies, we found that machine learning methods have higher accuracy. In this study, the performance of ensemble models was reported to be better than individual models used in previous studies. Furthermore, when comparing ensemble methods (bagging and boosting), boosting techniques (LGBM, XG) demonstrated similar performance and were superior to the bagging strategy (RF).

References

1. Kose I, Zincircioglu C, Ozturk YK, Cakmak M, Guldogan EA, Demir HF, et al. Factors Affecting Anxiety and Depression Symptoms in Relatives of Intensive Care Unit Patients. J Intensive Care Med. 2016;31(9):611-7. [PubMed ID: 26168801]. https://doi.org/10.1177/0885066615595791.
2. Dziegielewski C, Talarico R, Imsirovic H, Qureshi D, Choudhri Y, Tanuseputro P, et al. Characteristics and resource utilization of high-cost users in the intensive care unit: a population-based cohort study. BMC Health Serv Res. 2021;21(1):1312. [PubMed ID: 34872546]. [PubMed Central ID: PMC8647444]. https://doi.org/10.1186/s12913-021-07318-y.
3. McGuire A, McConnell PC. Resource allocation in ICU: ethical considerations. Curr Opin Anaesthesiol. 2019;32(2):190-4. [PubMed ID: 30817394]. https://doi.org/10.1097/ACO.0000000000000688.
4. Neuraz A, Guerin C, Payet C, Polazzi S, Aubrun F, Dailler F, et al. Patient Mortality Is Associated With Staff Resources and Workload in the ICU: A Multicenter Observational Study. Crit Care Med. 2015;43(8):1587-94. [PubMed ID: 25867907]. https://doi.org/10.1097/CCM.0000000000001015.
5. Bhattacharya S, Rajan V, Shrivastava H. ICU Mortality Prediction: A Classification Algorithm for Imbalanced Datasets. Proceedings of the AAAI Conference on Artificial Intelligence. 2017;31(1). https://doi.org/10.1609/aaai.v31i1.10721.
6. Xu J, Zhang Y, Zhang P, Mahmood A, Li Y, Khatoon S. Data Mining on ICU Mortality Prediction Using Early Temporal Data: A Survey. International Journal of Information Technology & Decision Making. 2017;16(1):117-59. https://doi.org/10.1142/s0219622016300020.
7. Awad A, Bader-El-Den M, McNicholas J. Patient length of stay and mortality prediction: A survey. Health Serv Manage Res. 2017;30(2):105-20. [PubMed ID: 28539083]. https://doi.org/10.1177/0951484817696212.
8. Kelley MA, Manaker S, Finlay G. Predictive scoring systems in the intensive care unit. UpToDate. Available at: URL: http://www.uptodate.com/online/content/author.do. 2012.
9. Saleh A, Ahmed M, Sultan I, Abdel-lateif A. Comparison of the mortality prediction of different ICU scoring systems (APACHE II and III, SAPS II, and SOFA) in a single-center ICU subpopulation with acute respiratory distress syndrome. Egyptian Journal of Chest Diseases and Tuberculosis. 2015;64(4):843-8. https://doi.org/10.1016/j.ejcdt.2015.05.012.
10. Barboi C, Tzavelis A, Muhammad LN. Comparison of Severity of Illness Scores and Artificial Intelligence Models That Are Predictive of Intensive Care Unit Mortality: Meta-analysis and Review of the Literature. JMIR Med Inform. 2022;10(5):e35293. [PubMed ID: 35639445]. [PubMed Central ID: PMC9198821]. https://doi.org/10.2196/35293.
11. Asgari P, Miri MM, Asgari F. The comparison of selected machine learning techniques and correlation matrix in ICU mortality risk prediction. Informatics in Medicine Unlocked. 2022;31. https://doi.org/10.1016/j.imu.2022.100995.
12. Chiu CC, Wu CM, Chien TN, Kao LJ, Li C, Jiang HL. Applying an Improved Stacking Ensemble Model to Predict the Mortality of ICU Patients with Heart Failure. J Clin Med. 2022;11(21). [PubMed ID: 36362686]. [PubMed Central ID: PMC9659015]. https://doi.org/10.3390/jcm11216460.
13. El-Rashidy N, El-Sappagh S, Abuhmed T, Abdelrazek S, El-Bakry HM. Intensive Care Unit Mortality Prediction: An Improved Patient-Specific Stacking Ensemble Model. IEEE Access. 2020;8:133541-64. https://doi.org/10.1109/access.2020.3010556.
14. Deshmukh F, Merchant SS. Explainable Machine Learning Model for Predicting GI Bleed Mortality in the Intensive Care Unit. Am J Gastroenterol. 2020;115(10):1657-68. [PubMed ID: 32341266]. https://doi.org/10.14309/ajg.0000000000000632.
15. Kong G, Lin K, Hu Y. Using machine learning methods to predict in-hospital mortality of sepsis patients in the ICU. BMC Med Inform Decis Mak. 2020;20(1):251. [PubMed ID: 33008381]. [PubMed Central ID: PMC7531110]. https://doi.org/10.1186/s12911-020-01271-2.
16. Ge W, Huh J, Park YR, Lee J, Kim Y, Turchin A. An interpretable ICU mortality prediction model based on logistic regression and recurrent neural networks with LSTM units. AMIA Annual Symposium Proceedings. 2018. 460 p.
17. Ge W, Huh J, Park YR, Lee J, Kim Y, Zhou G, et al. Using deep learning with attention mechanism for identification of novel temporal data patterns for prediction of ICU mortality. Info Med Unlocked. 2022;29. https://doi.org/10.1016/j.imu.2022.100875.
18. Wang S, Dai Y, Shen J, Xuan J. Research on expansion and classification of imbalanced data based on SMOTE algorithm. Sci Rep. 2021;11(1):24039. [PubMed ID: 34912009]. [PubMed Central ID: PMC8674253]. https://doi.org/10.1038/s41598-021-03430-5.
19. Gupta V, Mehta A, Goel A, Dixit U, Pandey AC. Spam Detection Using Ensemble Learning. Harmony Search and Nature Inspired Optimization Algorithms. 2019. p. 661-8. https://doi.org/10.1007/978-981-13-0761-4_63.
20. Odegua R. An empirical study of ensemble techniques (bagging, boosting and stacking). Proc. Conf.: Deep Learn. IndabaXAt. 2019.
21. Parmar A, Katariya R, Patel V. A Review on Random Forest: An Ensemble Classifier. International Conference on Intelligent Data Communication Technologies and Internet of Things (ICICI) 2018. AG, Switzerland: Lecture Notes Data Engineering Communicat Technol; 2019. p. 758-63. https://doi.org/10.1007/978-3-030-03146-6_86.
22. Al Daoud E. Comparison between XGBoost, LightGBM and CatBoost using a home credit dataset. Int J Computer Info Engineering. 2019;13(1):6-10.
23. Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. Lightgbm: A highly efficient gradient boosting decision tree. Advances Neural Inf Process Systems. 2017;30.
24. Kursa MB, Rudnicki WR. The all relevant feature selection using random forest. arXiv preprint arXiv:1106.5112. 2011.
25. Bodur EK, Atsa’am DD. Filter Variable Selection Algorithm Using Risk Ratios for Dimensionality Reduction of Healthcare Data for Classification. Processes. 2019;7(4). https://doi.org/10.3390/pr7040222.
