Simple Prediction of Type 2 Diabetes Mellitus via Decision Tree Modeling


Mehrab Sayadi 1, Mohammad Javad Zibaeenezhad 2, Seyyed Mohammad Taghi Ayatollahi 1,*

1 Department of Biostatistics, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
2 Cardiovascular Research Center, Shiraz University of Medical Sciences, Shiraz, Iran

How to cite: Sayadi M, Zibaeenezhad MJ, Ayatollahi SMT. Simple Prediction of Type 2 Diabetes Mellitus via Decision Tree Modeling. Int Cardiovasc Res J. 2017;11(2):e10657.


Background: Type 2 Diabetes Mellitus (T2DM) is one of the most important risk factors for cardiovascular disorders and is a common clinical and public health problem. Early diagnosis can reduce the burden of the disease. The decision tree, an advanced data mining method, can be used as a reliable tool to predict T2DM.
Objectives: This study aimed to present a simple model for predicting T2DM using decision tree modeling.
Materials and Methods: This analytical model-based study used part of the cohort data from a database at the Healthy Heart House of Shiraz, Iran. The data comprised routine information, such as age, gender, Body Mass Index (BMI), family history of diabetes, and systolic and diastolic blood pressure, obtained from individuals referred for baseline data collection in the Shiraz cohort study from 2014 to 2015. Diabetes diagnosis was used as a binary outcome. The decision tree technique with the J48 algorithm was applied using the WEKA software (version 3.7.5, New Zealand). Additionally, the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) were used to check the goodness of fit.
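J48 is WEKA's implementation of the C4.5 algorithm, which selects splits by gain ratio. The following is a minimal sketch of that criterion for one numeric attribute, not the study's actual pipeline: the function names and the eight toy records are assumptions made for illustration, and pruning and C4.5's other refinements are omitted.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` (C4.5-style)."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    weights = [len(left) / n, len(right) / n]
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info

# Toy, made-up records (NOT data from the study):
# systolic blood pressure in mmHg and a binary diabetes label.
systolic = [110, 118, 125, 132, 140, 150, 160, 170]
diabetic = [0, 0, 0, 0, 1, 1, 1, 1]

# Try each observed value (except the largest) as a candidate threshold.
candidates = sorted(set(systolic))[:-1]
best = max(candidates, key=lambda t: gain_ratio(systolic, diabetic, t))
```

On these toy records the split `systolic <= 132` separates the two classes completely, so it is the threshold selected as `best`.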
Results: After data preparation, 11302 cases remained, aged 18 to 89 years with a mean age of 48.1 ± 11.4 years; 51.1% of the cases were male. In the tree structure, blood pressure and age occupied the positions where the most information was gained. Gender, however, was not important in our model and was placed on the final branch of the tree. Total precision and AUC were 87% and 89%, respectively, indicating that the model had good accuracy for distinguishing patients from normal individuals.
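For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen patient receives a higher predicted score than a randomly chosen normal individual. The sketch below computes it by direct pairwise comparison; the function name and the four scores are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up predicted probabilities of diabetes and true labels (NOT study data).
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)  # 3 of the 4 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 89% should be read.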
Conclusions: The results showed that T2DM could be predicted with a decision tree model without laboratory tests. Thus, this model can be used in pre-clinical and public health screening programs.



