This ecological study assessed the correlation between urbanization level and colorectal cancer incidence in Iran. Urbanization levels were determined from the provincial statistics reported by the Statistical Center of Iran in 2011, based on the 7th General Census of the country. Colorectal cancer incidence was obtained from the 2009 data of the National Cancer Registry System of the Center for Disease Control and Management at the Ministry of Health and Medical Education.
To rank the provinces by urbanization, the researchers selected variables based on their use in previous studies of urbanization levels (25, 28-32), their impact on urbanization, and their availability at the time of the study. The variables were compiled for each province and organized into 7 groups of indices comprising 33 variables in total.
The variables used to determine the urbanization levels were the demographic indices (population size, relative population density, average household size, urbanization factor, and annual population growth rate); the human resource index (economic participation rate, unemployment rate, and share of employment in the agricultural, industrial, and services sectors); the communication index (Internet penetration rate, penetration rate of mobile and landline phones, and percentage of villages with communications); the energy indices (electricity use per 1000 population, percentage of rural areas with electricity, gas consumption per 1000 population, percentage of villages and cities with gas, and water consumption per 1000 population); the health indices (general practitioners per 1000, nurses per 1000, specialists per 1000, and fixed hospital beds per 1000); the human development index (life expectancy at birth); the training index; the GDP per capita index; and the urban services and development indices (road density, railroad density, ratio of vehicles per population, ratio of vehicles with identification number per population, per capita green space, and average area of residential buildings).
The data on colorectal cancer cases recorded in the National Cancer Registry System were obtained from the Center for Disease Control and Management of the Ministry of Health and Medical Education (33). Cancers are coded according to the International Classification of Diseases for Oncology (ICD-O), second edition, in which codes C18-21 correspond to colorectal cancer. In this study, the age-standardized incidence rate (ASR) of colorectal cancer for men and women in all provinces in 2009 was used. To obtain the standardized incidence, the researchers first separated and classified the new cases by province and gender. After removing duplicates, they prepared the data for analysis and calculated the standardized incidence rate. The rates were directly standardized using the World Health Organization standard population.
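The direct standardization described above can be illustrated with a minimal sketch. It assumes age-specific case counts, age-specific population sizes, and standard-population weights are available for each province; the function name `direct_asr`, the three age bands, and all numbers below are hypothetical and are not taken from the registry or from the WHO standard population.

```python
def direct_asr(cases, population, std_weights, per=100_000):
    """Directly age-standardized rate.

    cases[i]       -- new cases in age band i (hypothetical input)
    population[i]  -- population at risk in age band i
    std_weights[i] -- size (or share) of the standard population in band i
    The result is the weighted mean of the age-specific rates,
    expressed per `per` population.
    """
    if not (len(cases) == len(population) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age band")
    weighted_sum = sum(
        w * d / n for d, n, w in zip(cases, population, std_weights)
    )
    return per * weighted_sum / sum(std_weights)


# Illustrative numbers only (three made-up age bands for one province).
cases = [2, 10, 30]
population = [50_000, 40_000, 10_000]
std_weights = [0.5, 0.3, 0.2]

asr = direct_asr(cases, population, std_weights)
crude = 100_000 * sum(cases) / sum(population)
```

In this made-up example the standardized rate differs from the crude rate because the standard population gives more weight to the oldest band than the province's own age structure does, which is the reason for standardizing before comparing provinces.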
The researchers used hierarchical clustering to rank the provinces based on these variables. In this method, the number of clusters is not known in advance, and the process is either agglomerative or divisive. Cluster analysis is a method for ranking regions, towns, and villages so that places on the same level are very similar to each other but differ significantly from places on other levels (34). In the agglomerative method, every observation is first placed in its own cluster; the clusters that are most similar (that is, least different) are then merged, and this step is repeated until all observations fall into a single cluster. Agglomerative hierarchical clustering can be performed with different algorithms, which differ in how they define the distance between two observations and in how they form clusters (35). The researchers used the squared Euclidean distance to measure the similarity or difference between provinces, and formed the clusters with Ward's minimum variance algorithm so that within-cluster variance was minimized. Provinces with the lowest within-cluster variance were placed in the lowest cluster, and provinces with greater within-cluster variance were placed in the highest cluster. One-way ANOVA was used to investigate the relationship between the colorectal cancer incidence rate and urbanization level, and, where the result was significant, the Tukey test was used to identify which levels differed. All statistical analyses were performed in SPSS version 23. P values below 0.05 were considered statistically significant.
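The clustering and comparison steps above can be sketched as follows. This is only an illustration in plain Python, not the SPSS procedure used in the study: it assumes each province is a numeric feature vector on comparable scales, merges clusters with Ward's criterion (the increase in within-cluster sum of squares, computed from the squared Euclidean distance between centroids), and then computes the one-way ANOVA F statistic for a hypothetical incidence value per province. The function names, the six two-dimensional points, and the incidence values are all made up; the p value and the Tukey test are omitted because they require the F and studentized range distributions.

```python
def sq_euclid(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    Returns a list of clusters, each a sorted list of point indices.
    """
    # Start with every observation in its own cluster: (members, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                # Increase in within-cluster sum of squares if i and j merge.
                cost = ni * nj / (ni + nj) * sq_euclid(ci, cj)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        clusters[i] = (mi + mj, centroid)
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(members) for members, _ in clusters]


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical feature vectors for six "provinces" and made-up incidence rates.
features = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (10.0, 0.0), (10.1, 0.3)]
incidence = [5.0, 7.0, 9.0, 11.0, 14.0, 16.0]

levels = ward_clusters(features, k=3)
groups = [[incidence[i] for i in members] for members in levels]
f_stat = one_way_anova_f(groups)
```

Stopping at a fixed `k` is a simplification chosen for the sketch; in a full hierarchical analysis the merges continue to a single cluster and the number of levels is chosen afterwards, for example from the dendrogram.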