Many studies have indicated that ROP is associated with low gestational age and birth weight; however, screening schedules have adopted different cut-off criteria (12-18). Variation in ROP screening guidelines and referral criteria may be related to other factors, including the availability of human and material resources, health infrastructure, antenatal, obstetric, and neonatal care programs, and physicians' knowledge about ROP (19).
In accordance with other investigations, we found that the mean gestational age and birth weight were lower in the ROP treated group than in the untreated group. The mean gestational age and birth weight in the treatment-requiring group were 29.34 ± 2.460 weeks and 1187.61 g, respectively. In another study from Iran, Karkhaneh et al. reported that the mean gestational age and birth weight in 953 premature infants with severe ROP were 28.8 ± 2.4 weeks and 1256 ± 389 g, respectively (20). Ahmadpour-Kacho et al. also reported that the mean gestational age and birth weight in 256 Iranian neonates diagnosed with ROP were 30.54 ± 2.28 weeks and 1403.47 ± 333.44 g, respectively. They showed that the occurrence of ROP in premature newborns could be predicted by the clinical risk index for babies (CRIB) scoring system; however, this index was not a reliable predictor of ROP severity or prognosis (21). Vyas et al. assessed survival rates and rates of > stage 3 ROP in different populations. Survival rates in infants with gestational age < 26 weeks, birth weight < 751 g, and CRIB > 10 were 47.5%, 41.2%, and 25.2%, respectively. The rate of severe ROP was 48.4%. They showed that the ROP group had higher mean birth weight (1403 g) and gestational age (30.54 weeks) but lower CRIB scores (22).
Pediatric screening guidelines based on plus disease in retinopathy of prematurity were implemented to prevent the severe form of ROP and childhood blindness (23). On the other hand, Chiang et al. showed only fair agreement in the diagnosis of plus disease among 22 ROP experts. In their study, all participants agreed on the same diagnosis for only 4 of 34 wide-angle retinal photographs (24). Other investigations confirmed these findings, reporting mean weighted kappa statistics that ranged from fair (0.21 - 0.40) to moderate (0.41 - 0.60) agreement in diagnosing treatment-requiring ROP in plus disease (24-30).
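The kappa bands quoted above can be made concrete with a minimal sketch. It assumes two graders labelling the same images and computes the unweighted Cohen's kappa, which is simpler than the weighted kappa reported in the cited studies. The function names `cohen_kappa` and `agreement_band` and the example ratings are hypothetical and are not taken from any of the cited work.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map kappa onto the two bands quoted in the text."""
    if 0.20 < kappa <= 0.40:
        return "fair"
    if 0.40 < kappa <= 0.60:
        return "moderate"
    return "outside the bands quoted here"


# Hypothetical plus-disease calls (1 = plus, 0 = no plus) on ten images.
grader_1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
grader_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(grader_1, grader_2)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

In this illustrative case the graders agree on 7 of 10 images, yet half of that agreement is expected by chance, which is why kappa falls in the fair band despite 70% raw agreement.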
Fortes et al. assessed the value of SNAPPE-II (score for neonatal acute physiology and perinatal extension) in predicting ROP, but they could not find a significant association between SNAPPE-II scores and the risk of ROP development (31). Fleck et al. found a correlation between international differences in ROP treatment rates within BOOST (benefits of oxygen saturation targeting) and international variation in ROP grading. They emphasized the need for better standardization of the diagnosis of treatment-requiring ROP, training in ROP grading, an international approach, and ROP image analysis software (26).
The results of the present study demonstrate that data mining techniques designed to suggest a model for treatment-requiring ROP could improve clinical outcomes. Two of the proposed diagnostic models, Random Forest and Naive Bayes, had a high sensitivity of 77.14%. Regarding specificity, the Naive Bayes model with 94.29% was the best among the four techniques. Therefore, the Naive Bayes model, with high sensitivity and specificity, may be suggested as a screening model for treatment-requiring ROP. Previous studies have also shown that Naive Bayes is suitable for a small dataset with high correlation between the task and other non-task attribute variables (32). Moreover, of all the data mining techniques, the Decision Tree model, with reasonable sensitivity and specificity (71.43% and 100%) and only two variables (oxygen therapy duration and units of blood infusion), could assess the trend of the screening process. Furthermore, the decision tree can be applied manually, without requiring a computer system. Consistent with our findings, Ray et al. implemented machine learning to predict the incidence of ROP. Their study addressed a three-class problem comprising No ROP, Regressed ROP, and Progressed ROP. Their results indicated that the Decision Tree model, with the highest accuracy (83.26%) and the fewest false negatives, could be used as a preferable ROP screening model (33). Other systems using machine learning models could diagnose pre-plus and plus disease based on ROP Imaging and Informatics data (34-37); however, no significant correlation was observed with regard to treatment-requiring ROP. We suggest that our models may be preferable because they consider multiple risk factors affecting the development of ROP and achieve higher sensitivity and specificity rates.
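For readers less familiar with the metrics compared above, the following minimal sketch shows how sensitivity and specificity are derived from confusion-matrix counts. The counts are illustrative assumptions chosen only because they reproduce percentages of the same size as those reported for Naive Bayes; they are not the confusion matrix of this study, and the function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Share of treatment-requiring cases that the model flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of non-treatment cases that the model correctly clears."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only (assumption): 35 positives and 35 negatives.
tp, fn = 27, 8   # treatment-requiring infants: detected vs missed
tn, fp = 33, 2   # untreated infants: cleared vs falsely flagged

print(f"sensitivity = {100 * sensitivity(tp, fn):.2f}%")
print(f"specificity = {100 * specificity(tn, fp):.2f}%")
```

In a screening setting a missed case (false negative) is usually the costlier error, so sensitivity is typically weighed first and specificity second when ranking candidate models.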
Our study had some limitations. Machine learning algorithms require a large number of samples for training (38, 39). However, collecting a large number of samples is not always feasible. This is especially true in the medical field, when dealing with data on a rare disease or when, for whatever reason, only limited samples are available (40, 41). In this study, it was not possible to select a larger sample size because of the low prevalence of ROP in Iran.
Other limitations of the present study were the exclusion of a few medical records with missing data on risk factors and ROP prognosis, variation in diagnostic methods, and the use of a single population. These limitations could affect the accuracy and predictive ability of the algorithms. Further studies with larger sample sizes are strongly suggested.
5.1. Conclusion
The results of the present study demonstrate that data mining techniques can be effectively implemented in ROP screening programs. Among the four techniques, the Naive Bayes model performed best with regard to sensitivity and specificity.