Survival Prediction in Patients with Colorectal Cancer Using Artificial Neural Network and Cox Regression


Samaneh Sabouri 1 , Habibollah Esmaily 2 , * , Soodabeh Shahidsales 3 , Mahdi Emadi 4

1 Epidemiology and Biostatistics Department, Mashhad University of Medical Sciences, Mashhad, Iran
2 Social Determinants of Health Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
3 Cancer Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
4 Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad, Iran

how to cite: Sabouri S, Esmaily H, Shahidsales S, Emadi M. Survival Prediction in Patients with Colorectal Cancer Using Artificial Neural Network and Cox Regression. Int J Cancer Manag. 2020;13(1):e81161.



Colorectal cancer (CRC) is the third most prevalent cancer worldwide and accounts for 10% of all cancer mortality. In Iran, CRC has the third and the fourth highest incidence rate among men and women, respectively. Survival analysis methods deal with data that measure the time until an event occurs. Artificial neural networks (ANN) and Cox regression are two methods used for survival analysis.


The current study was designed to determine factors related to the survival of CRC patients using ANN and Cox regression.


In this historical cohort study, information on patients who were diagnosed with CRC at Omid Hospital of Mashhad was collected. A total of 157 subjects were investigated from 2006 to 2011 and followed up until 2016. For the ANN, the data were divided into training and testing groups, and the best neural network architecture was determined based on the area under the ROC curve (AUC). A Cox regression model was also fitted, and the accuracy of the two models in survival prediction was compared using the AUC.


The mean ± standard deviation of age was 56.4 ± 14.6 years. The three-, five-, and seven-year survival rates of patients were 0.67, 0.62, and 0.58, respectively. On the test dataset, the AUC was estimated at 0.759 for the chosen ANN model and 0.544 for the Cox regression model.


In this study, ANN was an appropriate approach for predicting the survival of CRC patients and was superior to Cox regression. It is therefore recommended for predicting survival and for determining the influence of risk factors on patients’ survival.

1. Background

Colorectal cancer (CRC) is the third most prevalent cancer and the third leading cause of cancer death worldwide (1). According to GLOBOCAN, CRC accounted for 1,361,000 new cases and 694,000 deaths in 2012 (2). It is also predicted that there will be a 66% increase in the burden of CRC, with 2.2 million new cases and 1.1 million deaths by 2030 (3).

Most CRC cases occur in industrialized countries; however, the incidence rate is growing in less-developed regions due to the adoption of a Western lifestyle (4). The lowest and highest incidence rates are observed in Western Africa and Australia, respectively (2). Among Asian countries, Japan has the highest incidence rate, particularly among men, but its mortality is lower than that of Europe owing to a screening program in place since 1992. After Japan, Europe has the highest incidence and mortality rates (5). In Europe, Slovakia, Hungary, and the Czech Republic have the highest rates among men, while Norway, Denmark, and the Netherlands have the highest rates among women (6, 7). In Iran, CRC is the fifth and the third most common cancer among men and women, respectively (8).

The main risk factors for this disease are excessive consumption of red meat, alcohol intake, a sedentary lifestyle, tobacco smoking, overweight, a diet lacking fruit and vegetables, family history, and age over 50 (5, 9). Numerous studies have shown that smoking increases the risk of CRC by up to 30%, and the contribution of heredity is estimated at 7% - 10% (10-13). It has also been found that obese men and women are at a higher risk of colon and rectal cancer than others (14). In contrast to these factors, fruit and vegetable consumption plays a protective role against CRC because these foods are rich in antioxidants, fiber, folic acid, and vitamins. Fiber is protective because it shortens stool transit time and therefore reduces exposure to potential carcinogens (15). In addition, it is estimated that 66% - 75% of cases could be prevented by adopting a healthy lifestyle (16).

The first treatment of CRC depends on the tumor’s location and size and on the patient’s health (17). When the disease is diagnosed early, surgery is selected as the primary treatment, but it is not effective in metastatic cases (18). Since the 1990s, the 5-year survival rate of patients has improved due to detection of the disease in its initial stages, successful treatment in stages II and III, and a considerable reduction in mortality after surgery (5). The 5-year survival rate of CRC patients is approximately 50% - 60% and is higher in the initial stages (19, 20).

There are different statistical methods for analyzing survival data. ANN and traditional predictive tools have been used in different studies to predict patients’ survival and to determine related risk factors. Wang et al. showed that ANN performed well in predicting the survival of breast cancer patients (21). In a study carried out by Oermann et al. (22), the efficacy of ANN and logistic regression was compared for predicting 1-year survival of patients with brain metastasis, and the results indicated better performance for the ANN model. Furthermore, studies conducted on patients with CRC and gastric cancer introduced ANN as a powerful tool for survival prediction in comparison with the Cox regression model (23, 24). Numerous studies have addressed CRC survival, but they have differed in their statistical methods and results.

2. Objectives

In this paper, we applied ANN and Cox regression models to determine risk factors related to survival in CRC patients.

3. Methods

In this historical cohort study, data on patients who were diagnosed with CRC at Omid Hospital of Mashhad were collected. A total of 157 subjects were investigated from 2006 to 2011 and followed up until 2016. Demographic and clinical information was gathered from the patients’ medical records.

The information obtained for each patient included gender, age at diagnosis, BMI, family history, tobacco smoking, opium or drug use, tumor stage (I, II, III, and IV) (25), tumor grade (well-differentiated, moderately differentiated, and poorly differentiated), first treatment, and relapse. The survival time of each patient was calculated in years from the date of first diagnosis, and death from CRC was defined as the event; patients who survived were considered censored. Information from patients’ regular checkups was available in their medical records. In some cases, we made phone calls to determine the survival status (death/censored) of patients who had not visited the hospital for more than six months.

The Kaplan-Meier method and the log-rank test were used for preliminary analyses. Before fitting the Cox regression model, the proportional hazards assumption was tested using the log-minus-log plot. We then used the backward conditional method with an entry criterion of 0.10 and a removal criterion of 0.15.
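As an illustration of the Kaplan-Meier estimate named above, the following is a minimal Python sketch using only the standard library. It is not the authors' code (the study used SPSS and R); the function name `kaplan_meier` and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time of each subject (years in this study)
    events -- 1 if death from CRC was observed, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ends at t (death or censoring) leaves the risk set.
        at_risk -= sum(1 for u in times if u == t)
    return curve


# Hypothetical toy data, not taken from the study.
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0, 0]
curve = kaplan_meier(toy_times, toy_events)
```

Subjects censored at a given time are counted as still at risk for deaths occurring at that same time, which is the usual convention.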

In ANN modeling, we randomly divided the data into a training subset (70%) and a testing subset (30%). To avoid complexity, only one hidden layer was used; the ANN was therefore a 3-layer multilayer perceptron (MLP) with 11 nodes in the input layer, 5 to 15 nodes in the hidden layer, and 1 node in the output layer. The response was defined as a binary status variable; therefore, the logistic transfer function was applied to the output layer. A feedforward network was trained with weight decay values from 0.1 to 0.5. To determine important risk factors, the importance of each variable was calculated and ranked for the chosen ANN model. In addition, the concordance index and the area under the curve were calculated to compare the predictive power of the ANN and Cox models. SPSS software version 20.0 and R software version 2.14.0 were used for statistical analysis, and the significance level was 0.05.
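The architecture described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' R code: the logistic activation in the hidden layer, the cross-entropy loss, the penalty applied to bias terms, the initial weight range, and all function names are assumptions introduced here. Only the layer sizes, the logistic output, and the use of weight decay come from the text.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_mlp(n_inputs=11, n_hidden=8, seed=0):
    """Small random weights; index 0 of every weight vector is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def predict(model, x):
    """Forward pass: inputs -> one hidden layer -> logistic output in (0, 1)."""
    hidden, output = model
    h = [logistic(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
         for w in hidden]
    return logistic(output[0] + sum(wi * hi for wi, hi in zip(output[1:], h)))


def penalized_loss(model, X, y, decay):
    """Cross-entropy plus decay * (sum of squared weights), an assumed form."""
    hidden, output = model
    cross_entropy = 0.0
    for x, target in zip(X, y):
        p = predict(model, x)
        cross_entropy -= target * math.log(p) + (1 - target) * math.log(1 - p)
    squared_weights = (sum(w * w for row in hidden for w in row)
                       + sum(w * w for w in output))
    return cross_entropy + decay * squared_weights


def n_weights(model):
    hidden, output = model
    return sum(len(row) for row in hidden) + len(output)
```

With 11 inputs and 8 hidden nodes, `n_weights(init_mlp(11, 8))` counts 8 × 12 + 9 = 105 parameters, which is large relative to a training set drawn from 157 subjects; this is one reason a weight decay penalty is useful here.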

The protocol was approved by the ethics committee of Mashhad University of Medical Sciences (code number: 941205).

4. Results

The study included 91 (58%) men and 66 (42%) women. The mean ± standard deviation of age was 56.4 ± 14.6 years. According to the independent-samples t-test, there was a significant difference in mean age at diagnosis between men (60.1 ± 14.3) and women (55.2 ± 13.6) (P = 0.03). Patients’ survival status was followed for up to 10 years; 55 (35%) patients died and 102 (65%) were censored. Table 1 shows the characteristics of CRC patients by subgroup for the investigated variables. Most patients were diagnosed with CRC in stage II or III, and 73.2% of them were 50 years old or older. The first choice of treatment was surgery for 97 (61.8%) cases, and 24.2% of patients had a family history of cancer.
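The reported age comparison can be checked from the summary statistics alone. The sketch below assumes the pooled-variance form of the independent-samples t-test and, because the Python standard library has no t distribution, approximates the p-value with a normal distribution (reasonable with 155 degrees of freedom). Both choices are assumptions made here, not details given in the text.

```python
import math
from statistics import NormalDist


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Men: 60.1 ± 14.3 (n = 91); women: 55.2 ± 13.6 (n = 66), as reported above.
t_stat = pooled_t(60.1, 14.3, 91, 55.2, 13.6, 66)

# Two-sided p-value under a normal approximation to the t distribution.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
```

Under these assumptions the t statistic is a little above 2 and the approximate two-sided p-value is close to the reported 0.03.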

Table 1.

Characteristics of CRC Patients

Variable                  No. (%)
Gender
  Male                    91 (58)
  Female                  66 (42)
Age, y
  < 50                    42 (26.8)
  ≥ 50                    115 (73.2)
BMI
  < 18.5                  26 (16.6)
  18.5 - 25               90 (57.3)
  25 - 30                 28 (17.8)
  > 30                    13 (8.3)
Tobacco smoking
  Yes                     29 (18.5)
  No                      128 (81.5)
Opium or drug user
  Yes                     19 (12.1)
  No                      138 (87.9)
Family history
  Yes                     38 (24.2)
  No                      119 (75.8)
Tumor location
  Colon                   79 (50.3)
  Rectum                  78 (49.7)
First treatment
  Surgery                 97 (61.8)
  Radiotherapy            60 (38.2)
Tumor grade
  WD                      93 (59.2)
  MD                      61 (38.9)
  PD                      3 (1.9)
Tumor stage
  I                       11 (7)
  II                      65 (41.4)
  III                     55 (35)
  IV                      26 (16.6)
Relapse
  Yes                     37 (23.6)
  No                      120 (76.4)

Abbreviations: WD, well-differentiated; MD, moderately differentiated; PD, poorly differentiated.

The mean ± SD of survival time was 6.5 ± 4.3 years. The three-, five-, and seven-year survival rates of patients were 0.67, 0.62, and 0.58, respectively. Furthermore, we calculated the 5-year survival rate for each stage, which was 0.87 for stage I, 0.75 for stage II, 0.59 for stage III, and 0.24 for stage IV. The lowest survival rate was observed in subjects with stage IV tumors, while patients who were diagnosed with CRC in stage I had the highest survival rate.

To fit the ANN model, we first randomly divided the data into training (70%) and testing (30%) subsets. Based on the log-rank test, there was no significant difference between the estimated survival curves of the training and testing data (P = 0.482). In total, 55 models were fitted (with decay values from 0.1 to 0.5 and 5 to 15 nodes in the hidden layer), and the best model, chosen based on the area under the ROC curve (AUC = 0.802), had 8 nodes in the hidden layer and a decay of 0.2.
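The count of 55 candidate models follows from crossing the decay values with the hidden-layer sizes. The sketch below makes this explicit. The decay step of 0.1 is an assumption (it is the step that yields 5 × 11 = 55 combinations), and `split_indices`, `pick_best`, and the seed are hypothetical names and values introduced for illustration.

```python
import itertools
import math
import random


def split_indices(n_subjects, train_fraction=0.7, seed=0):
    """Random 70/30 split of subject indices (seed is arbitrary)."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    cut = math.floor(train_fraction * n_subjects)
    return indices[:cut], indices[cut:]


decays = [0.1, 0.2, 0.3, 0.4, 0.5]   # assumed step of 0.1
hidden_sizes = range(5, 16)          # 5 to 15 hidden nodes
grid = list(itertools.product(decays, hidden_sizes))


def pick_best(candidates, auc_of):
    """Return the (decay, hidden_size) pair with the highest AUC.

    auc_of is any function mapping a candidate to its AUC; how the AUC
    is obtained (fitting and scoring each network) is left out here.
    """
    return max(candidates, key=auc_of)


train_idx, test_idx = split_indices(157)
```

Flooring 70% of 157 gives 109 training and 48 testing subjects, which is consistent with the 48 test-set subjects reported in Table 3.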

After confirming the proportional hazards assumption using the log-minus-log plot, we fitted the Cox regression model with the backward conditional method. Table 2 shows the results of both models in determining the importance of the independent variables. For this purpose, normalized importance (ANN) and P value (Cox regression) were used to order the variables.

Table 2.

Prognostic Factors of CRC Patients’ Survival in ANN and Cox Regression Models

ANN Model                                      Cox Regression
Ordered Variable       Normalized Importance   Ordered Variable       P Value
Tumor stage            0.187                   Tumor stage            0.001
First treatment        0.138                   Gender                 0.016
Family history         0.135                   Relapse                0.017
Opium or drug user     0.111                   Family history         0.066
Gender                 0.110                   Opium or drug user     0.128
Age                    0.057                   First treatment        0.445
Tobacco smoking        0.051                   Tobacco smoking        0.452
Tumor grade            0.042                   Tumor grade            0.623
Tumor location         0.021                   Tumor location         0.767

In the 3-layer ANN model, tumor stage, first treatment, family history, opium or drug use, and gender played major roles in survival prediction. In the Cox regression, tumor stage (P = 0.001), gender (P = 0.016), and relapse (P = 0.017) were statistically significant.

In the next step, we used the testing data to calculate the accuracy of prediction in both models. Table 3 shows the observed numbers of censored cases and deaths and the percentage of correct predictions in the ANN and Cox proportional hazards (CPH) models. The area under the ROC curve was 0.759 for the ANN and 0.544 for the Cox regression model. Accordingly, ANN was more powerful in recognizing true cases and was superior to Cox regression. According to the classification table, the accuracy of ANN and Cox regression was 70.8% and 50.0%, respectively; the higher value for ANN indicates more correct classifications.
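For readers who want to see how the two discrimination measures named in this paper are defined, the following sketch computes the AUC as a rank statistic and Harrell's concordance index for censored data. It uses only plain Python; the function names and toy values are hypothetical, and nothing here reproduces the study's data or the authors' software.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as one half. labels: 1 = death, 0 = censored.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def concordance_index(times, events, risk):
    """Harrell's C: share of usable pairs ordered correctly by predicted risk."""
    concordant = 0.0
    usable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            # A pair is usable when subject i died before subject j's follow-up ended.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable


# Hypothetical toy values for illustration only.
toy_auc = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
toy_c = concordance_index([1, 2, 3, 4], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1])
```

Both measures equal 0.5 for predictions that carry no information and 1.0 for perfect ranking, which is the scale on which the values 0.759 and 0.544 should be read.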

Table 3.

Accuracy of ANN and Cox Regression Models Based on Test Set in CRC Patients

Groups      Observed   True Prediction, No. (%)
                       ANN          Cox Regression
Death       19         9 (47.4)     9 (47.4)
Censored    29         25 (86.2)    15 (51.7)
Total       48         34 (70.8)    24 (50.0)
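The percentages in Table 3 can be recomputed from its counts. In the sketch below, treating death as the positive class (so that correct death predictions give sensitivity and correct censored predictions give specificity) is a labeling choice made here for illustration; the counts themselves are taken from Table 3.

```python
# Observed test-set counts and correct predictions, from Table 3.
observed = {"death": 19, "censored": 29}
correct = {
    "ANN": {"death": 9, "censored": 25},
    "Cox": {"death": 9, "censored": 15},
}


def summarize(hits):
    """Per-class and overall proportions of correct predictions."""
    total = sum(observed.values())
    return {
        "sensitivity": hits["death"] / observed["death"],
        "specificity": hits["censored"] / observed["censored"],
        "accuracy": (hits["death"] + hits["censored"]) / total,
    }


summary = {model: summarize(hits) for model, hits in correct.items()}
```

The two models identify the same number of deaths correctly (9 of 19), so the difference in overall accuracy comes entirely from the censored group, where the ANN classifies 25 of 29 correctly and Cox regression 15 of 29.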

5. Discussion

After cardiovascular disease and motor vehicle accidents, cancer is the third leading cause of death in Iran (26). CRC is one of the common gastrointestinal cancers and is associated with lifestyle and aging. Although the incidence of this disease is higher in Western countries, it is increasing in less developed countries as a result of lifestyle changes (4). Given the incidence rate of CRC in Iran in recent years and the need for more research in this field, this study was conducted to determine factors related to the survival of CRC patients using ANN and Cox regression models.

The results show that the mean ages of men and women were 60.1 and 55.2 years, respectively, and 26.8% of patients were under 50 years old. In Iran, almost 20% of CRC cases occur in people under 40 years old, while only 2% - 8% of patients are in this age group in developed countries (27). Lifestyle changes and the young population of Iran are reasons why patients are diagnosed at a younger age than in more developed countries.

The 5-year survival rate of CRC patients was estimated at 0.48. Rasouli et al. (28) reported that the 5-year survival rate was 0.33 in Kurdistan Province in Iran. In another study, by Marley et al. (17), the 5-year survival rate was over 0.60 in the USA. Prevention measures such as screening and diagnosis at initial stages are reasons for the high survival rate in Western countries.

The stage of the tumor describes the extent of cancer in the body and is one of the significant factors in deciding on the type of treatment (29). In this study, tumor stage was significant in the Cox regression and was also the most important variable for survival prediction in the ANN model. In a study conducted by Gohari et al. (23), the pathologic stage was significant in rectal cancer and was one of the important prognostic factors for patients’ survival in the ANN model.

Relapse has a particular effect on the survival of patients with colorectal cancer (30). O’Connell et al. (31) showed that subjects whose initial stage was II had longer survival times than those whose initial stage was III. In our study, relapse was significant in the Cox regression but was not among the most important variables for survival prediction in the ANN model. This variable was also significant in a study conducted in Mashhad using Cox regression (32).

Gender and opium or drug use were important in the ANN model; gender was also statistically significant in the Cox regression. Majek et al. (33) showed that women had a higher age-adjusted 5-year survival rate than men. Moreover, the effect of smoking and drug use on patients’ survival has been reported in other studies (23, 32, 34).

Family history was important only for survival prediction in the ANN model; this finding is in line with a study by Jasperson et al. (35).

In the next step, we compared the ability of each model to predict survival using the area under the ROC curve and accuracy. Both criteria were higher for the ANN, which reflects the power of this model in predicting true cases. Many studies have reported that ANN is superior to classical models.

Gohari et al. (23) reported that ANN was a better approach for prediction and for determining prognostic factors in colon and rectal cancers. Another study demonstrated that a neural network was more accurate than, and outperformed, logistic regression in colon cancer patients (36). Ahmed (37) showed that neural networks had better accuracy in classification and survival prediction of patients with colon cancer in comparison with other methods.

5.1. Conclusions

Although the Cox model estimates the association of variables in terms of hazard ratios (HR), the two models can be compared with regard to their predictive accuracy and to which variables they identify as important. This study supports the use of the ANN model over Cox regression for survival prediction in CRC patients. Relevant prognostic factors can be determined by ordering the variables by normalized importance. It would also be helpful to compare the results of the ANN model with other survival analysis methods. In conclusion, ANN was more efficient and accurate, so it is recommended for predicting survival and determining risk factors for survival in CRC patients.


  • 1.

    Siegel R, Naishadham D, Jemal A. Cancer statistics, 2013. CA Cancer J Clin. 2013;63(1):11-30. [PubMed ID: 23335087].

  • 2.

    Globocan WHO. Estimated cancer incidence, mortality and prevalence worldwide in 2012. Lyon: WHO; 2012. Available from:

  • 3.

    Arnold M, Sierra MS, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global patterns and trends in colorectal cancer incidence and mortality. Gut. 2017;66(4):683-91. [PubMed ID: 26818619].

  • 4.

    Bishehsari F, Mahdavinia M, Vacca M, Malekzadeh R, Mariani-Costantini R. Epidemiological transition of colorectal cancer in developing countries: environmental factors, molecular pathways, and opportunities for prevention. World J Gastroenterol. 2014;20(20):6055-72. [PubMed ID: 24876728]. [PubMed Central ID: PMC4033445].

  • 5.

    Binefa G, Rodriguez-Moranta F, Teule A, Medina-Hayas M. Colorectal cancer: from prevention to personalized medicine. World J Gastroenterol. 2014;20(22):6786-808. [PubMed ID: 24944469]. [PubMed Central ID: PMC4051918].

  • 6.

    Ferlay J, Steliarova-Foucher E, Lortet-Tieulent J, Rosso S, Coebergh JW, Comber H, et al. Cancer incidence and mortality patterns in Europe: estimates for 40 countries in 2012. Eur J Cancer. 2013;49(6):1374-403. [PubMed ID: 23485231].

  • 7.

    Karsa LV, Lignini TA, Patnick J, Lambert R, Sauvaget C. The dimensions of the CRC problem. Best Pract Res Clin Gastroenterol. 2010;24(4):381-96. [PubMed ID: 20833343].

  • 8.

    Mohammadianpanah M. Colorectal Cancer Incidence: Does Iran Follow the West? Ann Colorectal Res. 2015;3(1).

  • 9.

    Chan AT, Giovannucci EL. Primary prevention of colorectal cancer. Gastroenterology. 2010;138(6):2029-43. [PubMed ID: 20420944]. [PubMed Central ID: PMC2947820].

  • 10.

    De Rosa M, Pace U, Rega D, Costabile V, Duraturo F, Izzo P, et al. Genetics, diagnosis and management of colorectal cancer (Review). Oncol Rep. 2015;34(3):1087-96. [PubMed ID: 26151224]. [PubMed Central ID: PMC4530899].

  • 11.

    Giovannucci E, Rimm EB, Stampfer MJ, Colditz GA, Ascherio A, Kearney J, et al. A prospective study of cigarette smoking and risk of colorectal adenoma and colorectal cancer in U.S. men. J Natl Cancer Inst. 1994;86(3):183-91. [PubMed ID: 8283490].

  • 12.

    Newcomb PA, Storer BE, Marcus PM. Cigarette smoking in relation to risk of large bowel cancer in women. Cancer Res. 1995;55(21):4906-9. [PubMed ID: 7585528].

  • 13.

    Paskett ED, Reeves KW, Rohan TE, Allison MA, Williams CD, Messina CR, et al. Association between cigarette smoking and colorectal cancer in the Women's Health Initiative. J Natl Cancer Inst. 2007;99(22):1729-35. [PubMed ID: 18000222].

  • 14.

    Karahalios A, Simpson JA, Baglietto L, MacInnis RJ, Hodge AM, Giles GG, et al. Change in weight and waist circumference and risk of colorectal cancer: Results from the Melbourne Collaborative Cohort study. BMC Cancer. 2016;16:157. [PubMed ID: 26917541]. [PubMed Central ID: PMC4768408].

  • 15.

    Kim E, Coelho D, Blachier F. Review of the association between meat consumption and risk of colorectal cancer. Nutr Res. 2013;33(12):983-94. [PubMed ID: 24267037].

  • 16.

    Giovannucci E. Modifiable risk factors for colon cancer. Gastroenterol Clin North Am. 2002;31(4):925-43. [PubMed ID: 12489270].

  • 17.

    Marley AR, Nan H. Epidemiology of colorectal cancer. Int J Mol Epidemiol Genet. 2016;7(3):105-14. [PubMed ID: 27766137]. [PubMed Central ID: PMC5069274].

  • 18.

    Kekelidze M, D'Errico L, Pansini M, Tyndall A, Hohmann J. Colorectal cancer: current imaging methods and future perspectives for the diagnosis, staging and therapeutic response evaluation. World J Gastroenterol. 2013;19(46):8502-14. [PubMed ID: 24379567]. [PubMed Central ID: PMC3870495].

  • 19.

    Ciccolallo L, Capocaccia R, Coleman MP, Berrino F, Coebergh JW, Damhuis RA, et al. Survival differences between European and US patients with colorectal cancer: Role of stage at diagnosis and surgery. Gut. 2005;54(2):268-73. [PubMed ID: 15647193]. [PubMed Central ID: PMC1774819].

  • 20.

    Verdecchia A, Francisci S, Brenner H, Gatta G, Micheli A, Mangone L, et al. Recent cancer survival in Europe: a 2000-02 period analysis of EUROCARE-4 data. Lancet Oncol. 2007;8(9):784-96. [PubMed ID: 17714993].

  • 21.

    Wang TN, Cheng CH, Chiu HW. Predicting post-treatment survivability of patients with breast cancer using Artificial Neural Network methods. Annual International Conference of the IEEE Engineering in Medicine and Biology Society IEEE Engineering in Medicine and Biology Society Annual Conference. Osaka, Japan. 2013. p. 1290-3.

  • 22.

    Oermann EK, Kress MA, Collins BT, Collins SP, Morris D, Ahalt SC, et al. Predicting survival in patients with brain metastases treated with radiosurgery using artificial neural networks. Neurosurgery. 2013;72(6):944-51. discussion 952. [PubMed ID: 23467250].

  • 23.

    Gohari MR, Biglarian A, Bakhshi E, Pourhoseingholi MA. Use of an artificial neural network to determine prognostic factors in colorectal cancer patients. Asian Pac J Cancer Prev. 2011;12(6):1469-72. [PubMed ID: 22126483].

  • 24.

    Zhu L, Luo W, Su M, Wei H, Wei J, Zhang X, et al. Comparison between artificial neural network and Cox regression model in predicting the survival rate of gastric cancer patients. Biomed Rep. 2013;1(5):757-60. [PubMed ID: 24649024]. [PubMed Central ID: PMC3917700].

  • 25.

    Egner JR. AJCC Cancer Staging Manual. JAMA. 2010;304(15):1726.

  • 26.

    Saadat S, Yousefifard M, Asady H, Moghadas Jafari A, Fayaz M, Hosseini M. The most important causes of death in Iranian population; a Retrospective Cohort study. Emerg (Tehran). 2015;3(1):16-21. [PubMed ID: 26512364]. [PubMed Central ID: PMC4614603].

  • 27.

    Ansari R, Mahdavinia M, Sadjadi A, Nouraie M, Kamangar F, Bishehsari F, et al. Incidence and age distribution of colorectal cancer in Iran: Results of a population-based cancer registry. Cancer Lett. 2006;240(1):143-7. [PubMed ID: 16288832].

  • 28.

    Rasouli MA, Moradi G, Roshani D, Nikkhoo B, Ghaderi E, Ghaytasi B. Prognostic factors and survival of colorectal cancer in Kurdistan province, Iran: A population-based study (2009-2014). Medicine (Baltimore). 2017;96(6). e5941. [PubMed ID: 28178134]. [PubMed Central ID: PMC5312991].

  • 29.

    American Cancer Society. Colorectal cancer. 2018. Available from:

  • 30.

    Ryuk JP, Choi GS, Park JS, Kim HJ, Park SY, Yoon GS, et al. Predictive factors and the prognosis of recurrence of colorectal cancer within 2 years after curative resection. Ann Surg Treat Res. 2014;86(3):143-51. [PubMed ID: 24761423]. [PubMed Central ID: PMC3994626].

  • 31.

    O'Connell MJ, Campbell ME, Goldberg RM, Grothey A, Seitz JF, Benedetti JK, et al. Survival following recurrence in stage II and III colon cancer: Findings from the ACCENT data set. J Clin Oncol. 2008;26(14):2336-41. [PubMed ID: 18467725].

  • 32.

    Parsaee R, Fekri N, Shahid sales S, AfzalAghaee M, Shaarbaf Eidgahi E, Esmaeily H. Prognostic factors in the survival rate of colorectal cancer patients. J North Khorasan Univ Med Sci. 2015;7(1):45-53.

  • 33.

    Majek O, Gondos A, Jansen L, Emrich K, Holleczek B, Katalinic A, et al. Sex differences in colorectal cancer survival: Population-based analysis of 164,996 colorectal cancer patients in Germany. PLoS One. 2013;8(7). e68077. [PubMed ID: 23861851]. [PubMed Central ID: PMC3702575].

  • 34.

    Walter V, Jansen L, Hoffmeister M, Ulrich A, Chang-Claude J, Brenner H. Smoking and survival of colorectal cancer patients: Population-based study from Germany. Int J Cancer. 2015;137(6):1433-45. [PubMed ID: 25758762].

  • 35.

    Jasperson KW, Tuohy TM, Neklason DW, Burt RW. Hereditary and familial colon cancer. Gastroenterology. 2010;138(6):2044-58. [PubMed ID: 20420945]. [PubMed Central ID: PMC3057468].

  • 36.

    Snow PB, Kerr DJ, Brandt JM, Rodvold DM. Neural network and regression predictions of 5-year survival after colon carcinoma treatment. Cancer. 2001;91(8 Suppl):1673-8. [PubMed ID: 11309767].

  • 37.

    Ahmed FE. Artificial neural networks for diagnosis and survival prediction in colon cancer. Mol Cancer. 2005;4:29. [PubMed ID: 16083507]. [PubMed Central ID: PMC1208946].