1. Background
2. Objectives
3. Methods
3.1. Study Sample
| No. | Symbol | Definition | Unit |
|---|---|---|---|
| 1 | Age | Age | Year |
| 2 | Rank | Army rank | Individual |
| 3 | BMI | Body mass index | kg/m² |
| 4 | FPG | Fasting plasma glucose | mg/dL |
| 5 | TCL | Total cholesterol | mg/dL |
| 6 | LDL | Low-density lipoprotein | mg/dL |
| 7 | TG | Triglyceride | mg/dL |
3.2. Descriptive Study
3.3. Analytical Study
3.3.1. Statistical Analyses
3.3.2. Algorithm Evaluation
3.4. Approval Code
4. Results
4.1. Descriptive Results
Abbreviations: BMI, body mass index; FPG, fasting plasma glucose; LDL, low-density lipoprotein cholesterol; TCL, total cholesterol; TG, triglyceride.
a P-value obtained from the t-test
b Statistically non-significant
| Characteristic | T2DM: Yes b | T2DM: No b | Total |
|---|---|---|---|
| Age (y) c | |||
| 19 - 31 | 6 (0.7) | 897 (99.3) | 903 |
| 32 - 34 | 8 (1.2) | 674 (98.8) | 682 |
| 35 - 42 | 29 (3.3) | 839 (96.7) | 868 |
| 43 - 57 | 51 (7.4) | 640 (92.6) | 691 |
| Rank | |||
| Officer | 49 (3.5) | 1363 (96.5) | 1412 |
| Conscripts | 14 (1.2) | 1107 (98.8) | 1121 |
| Staff | 31 (5.1) | 580 (94.9) | 611 |
| BMI (kg/m²) | |||
| Normal: < 25 | 22 (1.6) | 1337 (98.4) | 1359 |
| Overweight: 25 - 30 | 51 (3.4) | 1437 (96.6) | 1488 |
| Obese: ≥ 30 | 21 (7.1) | 276 (92.9) | 297 |
| TCL (mg/dL) | |||
| Ideal: < 200 | 64 (2.6) | 2391 (97.4) | 2455 |
| Borderline: 200 - 239 | 21 (3.7) | 546 (96.3) | 567 |
| High: ≥ 240 | 9 (7.4) | 113 (92.6) | 122 |
| LDL (mg/dL) | |||
| Ideal: < 100 | 41 (2.7) | 1453 (97.3) | 1494 |
| Close to ideal: 100 - 129 | 32 (3.2) | 966 (96.8) | 998 |
| Borderline: 130 - 159 | 14 (2.6) | 524 (97.4) | 538 |
| High: ≥ 160 | 7 (6.1) | 107 (93.9) | 114 |
| TG (mg/dL) | |||
| Ideal: < 150 | 50 (2.4) | 2057 (97.6) | 2107 |
| Borderline: 150 - 199 | 16 (2.6) | 594 (97.4) | 610 |
| High: ≥ 200 | 28 (6.6) | 399 (93.4) | 427 |
a Subjects were identified as having T2DM if their fasting plasma glucose was greater than 125 mg/dL.
b Values are No. (%); the percentages are row percentages, i.e. the share of each sub-category with or without T2DM.
c Age groups were defined based on age quantiles.
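
The footnotes above describe three small computations: the T2DM case definition, the quantile-based age groups, and the row percentages in parentheses. The following is a minimal sketch of those steps, not the authors' code. The function names are hypothetical, and the cut points 31, 34, and 42 are read off the age-group boundaries printed in the table rather than recomputed from the raw data.

```python
from statistics import quantiles

FPG_CUTOFF_MG_DL = 125  # footnote a: T2DM if fasting plasma glucose > 125 mg/dL


def has_t2dm(fpg_mg_dl):
    """Apply the case definition from footnote a (strictly greater than 125)."""
    return fpg_mg_dl > FPG_CUTOFF_MG_DL


def quartile_cuts(ages):
    """Three cut points that split ages into four quantile groups (footnote c).

    The interpolation method is an assumption; the source does not state it.
    """
    return quantiles(ages, n=4, method="inclusive")


def age_group(age, cuts):
    """Index (0 to 3) of the age group, with each upper bound inclusive."""
    return sum(age > c for c in cuts)


def row_percent(count, row_total):
    """Row percentage as printed in parentheses in the table (footnote b)."""
    return round(100 * count / row_total, 1)


# Cut points implied by the table's groups 19-31, 32-34, 35-42, 43-57.
TABLE_CUTS = [31, 34, 42]

print(age_group(33, TABLE_CUTS))   # second age group (index 1)
print(row_percent(6, 903))         # T2DM share in the 19-31 group
```

Applying `row_percent` to the counts in the first age row gives the 0.7 shown in the table, which is a quick way to check any row against its total.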
4.2. Analytical Results
Comparison of the number of individuals in each category in the original training set and in the training set after applying the Synthetic Minority Over-sampling Technique (SMOTE): age, body mass index (BMI), total cholesterol (TCL), low-density lipoprotein cholesterol (LDL), and triglyceride (TG). Subjects were identified as having type 2 diabetes mellitus if their fasting plasma glucose level was greater than 125 mg/dL.
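
To illustrate what SMOTE does to the minority (T2DM) class, here is a minimal sketch of its core idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is a simplified, standard-library illustration, not the implementation used in the study. The function name, its parameters, and the example records are hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class records: (age, BMI, TG). Not data from the study.
example_minority = [
    (45, 31.0, 260.0),
    (50, 29.5, 210.0),
    (39, 33.2, 240.0),
    (55, 27.8, 190.0),
]
new_points = smote(example_minority, n_synthetic=6, k=2, seed=1)
```

Because every synthetic point is a convex combination of two real minority samples, it always stays within the range of the observed minority values. In practice the features would usually be scaled first so that no single variable dominates the distance.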
Abbreviations: OR, odds ratio; BMI, body mass index; LDL, low-density lipoprotein cholesterol; TCL, total cholesterol; TG, triglyceride.
a P-value obtained from multiple logistic regression analysis
b Statistically non-significant
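
As a reading aid for the odds ratios referred to in these footnotes, the sketch below shows two standard relations: an odds ratio is the exponential of a logistic regression coefficient, and a crude (unadjusted) odds ratio can be computed directly from a 2x2 table. The example uses the obese and normal BMI counts from the descriptive table. The result is a crude value, so it is not the adjusted odds ratio that the multiple logistic regression would report, and the function names are hypothetical.

```python
import math


def odds_ratio_from_coefficient(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def crude_odds_ratio(cases_exp, noncases_exp, cases_ref, noncases_ref):
    """Crude odds ratio with a 95% Wald (Woolf) confidence interval."""
    odds_ratio = (cases_exp * noncases_ref) / (noncases_exp * cases_ref)
    se_log = math.sqrt(
        1 / cases_exp + 1 / noncases_exp + 1 / cases_ref + 1 / noncases_ref
    )
    low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    high = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, low, high


# Obese (21 with T2DM, 276 without) versus normal BMI (22 with, 1337 without).
or_obese, ci_low, ci_high = crude_odds_ratio(21, 276, 22, 1337)
print(f"crude OR = {or_obese:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```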
The classification decision tree of demographic and biological risk factors for type 2 diabetes mellitus. Each node shows the class label, the probability of the fitted class (i.e., the correct classification rate at the node), and the percentage of observations that fall in the node. Subjects were identified as having type 2 diabetes mellitus if their fasting plasma glucose level was greater than 125 mg/dL. Abbreviations: BMI, body mass index; FPG, fasting plasma glucose; LDL, low-density lipoprotein cholesterol; TCL, total cholesterol.
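
The three quantities reported at each tree node can be computed from the class labels of the observations that reach the node. The sketch below assumes the fitted class is the majority class at the node. The function name and the example node are hypothetical and do not reproduce any node of the published tree.

```python
from collections import Counter


def node_summary(node_labels, n_total):
    """Label, fitted-class probability, and share of all observations."""
    counts = Counter(node_labels)
    label, n_label = counts.most_common(1)[0]  # majority class at the node
    return {
        "label": label,
        "probability": n_label / len(node_labels),
        "percent_of_observations": 100 * len(node_labels) / n_total,
    }


# Hypothetical node: 40 of 200 training observations reach it, 30 with T2DM.
example_node = ["T2DM"] * 30 + ["No T2DM"] * 10
summary = node_summary(example_node, n_total=200)
print(summary)
```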
The variable importance in the random forest. The upper left figure shows variable importance based on the mean decrease in accuracy, the lower left figure shows variable importance based on the mean decrease in the Gini index, and the right figure shows overall variable importance. BMI, body mass index; LDL, low-density lipoprotein cholesterol; TCL, total cholesterol; TG, triglyceride.
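
The Gini-based importance in the caption is built from the impurity decrease of individual splits; a random forest accumulates these decreases per variable over its trees. The sketch below computes the decrease for one split. The example is a single illustrative split of the full sample at TG ≥ 200 mg/dL, using the counts from the descriptive table. It is not a split taken from the fitted forest, and the function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - weighted


# Counts are (T2DM, no T2DM), taken from the descriptive table.
all_subjects = (94, 3050)
high_tg = (28, 399)        # TG >= 200 mg/dL
lower_tg = (66, 2651)      # TG < 200 mg/dL (ideal and borderline rows combined)

print(gini_decrease(all_subjects, high_tg, lower_tg))
```

A split that separates the classes perfectly removes all impurity, while a split that leaves the class mix unchanged in both children yields a decrease of zero.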



