Presentation of hidden knowledge from a local breast cancer dataset by the classification and regression trees

authors:

Hadi LotfnezhadAfshar 1, Lili Rahmatnejad 2, Bahlol Rahimi 1, Hamid Reza Khalkhali 3,*

1 Department of Health Information Management, Health Information Technology Department, School of Paramedicine, Urmia University of Medical Sciences, Urmia, Iran
2 Department of Midwifery, School of Nursing & Midwifery, Urmia University of Medical Sciences, Urmia, Iran
3 Department of Biostatistics, Patient Safety Research Center, School of Medicine, Urmia University of Medical Sciences, Urmia, Iran

How to cite: LotfnezhadAfshar H, Rahmatnejad L, Rahimi B, Khalkhali HR. Presentation of hidden knowledge from a local breast cancer dataset by the classification and regression trees. J Clin Res Paramed Sci. 2017;6(2):e81268.

Abstract

Introduction: Standard knowledge discovery methods such as decision trees have been studied in the context of breast cancer. Decision trees are popular because they can present previously undiscovered relationships in the data in forms such as visualizations and formulas. The current study applied an algorithm from this family that has not been used in previously published papers.
Methods: A dataset containing the records of 569 patients from 2007 to 2010 was used. Missing data were handled by multiple imputation (MI). IBM SPSS Statistics 21 was used to run MI and to develop the model. The developed model was evaluated on accuracy, sensitivity and specificity.
Results: The model produced a decision tree with seventeen nodes. A set of clinically meaningful if-then rules was derived from nine of these nodes. The rules showed that cancer stage was the most important variable for predicting the survival probability of breast cancer patients. The model's sensitivity, specificity and accuracy were 93.5%, 53% and 80.3%, respectively.
Conclusion: The model created in the current study, as the first model of survival probability in breast cancer, revealed practical, previously undiscovered rules from a relatively small dataset.
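
The three evaluation criteria named in the Methods (sensitivity, specificity and accuracy) can all be computed from the four cells of a binary confusion matrix. The sketch below illustrates those standard definitions only. The function name `evaluate` and the counts are hypothetical and are not taken from the study, and the abstract does not state which outcome was treated as the positive class.

```python
def evaluate(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as percentages.

    tp, fn: positive-class cases predicted correctly / incorrectly
    tn, fp: negative-class cases predicted correctly / incorrectly
    """
    sensitivity = 100.0 * tp / (tp + fn)   # share of true positives detected
    specificity = 100.0 * tn / (tn + fp)   # share of true negatives detected
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Illustrative counts only (assumption): NOT the study's confusion matrix.
sens, spec, acc = evaluate(tp=90, fn=10, tn=30, fp=20)
print(f"sensitivity={sens:.1f}% specificity={spec:.1f}% accuracy={acc:.1f}%")
```

A model can reach high sensitivity and acceptable accuracy while its specificity stays low when the positive class dominates the dataset, which is why the three figures are reported together.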
