1. Background
The main goals of statistical modeling are to determine relationships between variables, to identify influential variables, and to make predictions (1). According to Wyatt and Altman, a predictive model must be simple to calculate, have a transparent structure, and be tested in independent data sets with evidence of generality. Human health is the most frequently studied subject in medical and epidemiological research (2), and the classical statistical methods available for modeling and prediction are often not practical enough because of their restrictive assumptions (3). For example, logistic regression (LR) assumes independence of the errors and of the variables (4). When the data are complex, these assumptions may no longer hold, so methods that predict with the lowest error and the highest confidence without relying on such assumptions would be valuable. One such approach is the artificial neural network (ANN), which has been used increasingly in recent decades (5). These models are used in different fields of medicine, including cancer prediction (6) and prediction of death after gastric surgery (7). ANNs are adaptive models for data analysis inspired by the functioning of the human brain (8). They are particularly suited to non-linear problems, where they reconstruct the approximate rules that link a set of data describing the problem to a set of data providing the solution (9).
In recent years, there have been significant improvements in head trauma surgery as well as in hospital services, trauma care, and recovery. However, patients with mild traumatic brain injury often show a collection of physical, psychological, and neurological problems, and psychological symptoms can appear in the weeks after the onset of injury (10-12). These problems can, in turn, lead to chronic disability; however, they do not receive enough attention or appropriate treatment because of their biological, mental, and social complexity. Systematic investigation of traumatic brain injury is difficult because injury mechanisms can cause heterogeneous clinical manifestations, depending on the distribution of the brain injury (13). Thus, most studies have focused on identifying the nature and effect of the physical problems that follow mild traumatic brain injury. Research on predicting psychological symptoms following mild traumatic brain injury is still in its infancy and requires further development. Statistical models that can model these psychological symptoms are therefore of great importance.
2. Objectives
This study aims to model psychological symptoms after mild traumatic brain injury using logistic regression (LR) and artificial neural network (ANN) models and to compare the two.
3. Materials and Methods
3.1. Patients and Study Design
This prospective cohort study was conducted on 100 patients with mild traumatic brain injury (Glasgow Coma Scale [GCS] score of 13 to 15) who were hospitalized in the neurosurgery unit. Participants of both sexes, aged 15 to 65 years, were recruited by non-probability sampling after approval by the hospital ethics committee and with the informed consent of the patients. The exclusion criteria were as follows: patients with any evidence of spinal cord injury or with communication problems, patients who used anti-epileptic drugs and had a history of seizures and/or psychotic disorders, and patients with mental retardation. Patients who were unwilling to participate were also excluded.
The presence of psychological symptoms (yes, no) was the output variable. Fourteen factors were used as predictor variables to compare the performance of the two models: age, sex, marital status (married, unmarried), education level, financial status, smoking habits (smoker, nonsmoker), history of mental illness in the immediate family, and history of hospitalization in a neurosurgery unit, trauma, underlying disease, psychological drug use, anesthesia, alcohol use, and drug substance use.
In the first phase, hospitalized patients were examined by the neurosurgeon and, if they met the defined eligibility criteria, their demographic and clinical data were entered into the questionnaire. Six months later, they were asked to attend the counseling and neurological evaluation center to complete the brief symptom inventory (BSI) questionnaire. To ensure consistency in completing the tests, a psychology expert (MA) read the questions aloud one by one and recorded the patients’ oral answers in the corresponding section of the questionnaire.
The instrument used in this study was the brief symptom inventory (BSI), the short form of the SCL-90-R questionnaire. It includes 53 questions that evaluate psychological symptoms and can be used to distinguish healthy individuals from patients. Hessel et al. (14) introduced this questionnaire in 1973; it consists of the following 9 dimensions: somatization, obsessive-compulsive, interpersonal sensitivity, depression, anxiety, anger, phobic anxiety, paranoia, and psychoticism. After the questionnaires were completed, the score for each was calculated, and individuals with a global severity index (GSI) T score ≥ 63 were classified as having mental health problems. The measure has been shown to be reliable and valid in patients with mild traumatic brain injury. Internal consistency, determined by Cronbach’s alpha coefficient, was estimated to be 0.86 (15); in another study in Iran, reliability and internal consistency were 0.71 and 0.96, respectively (16).
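The two quantities in this paragraph, the GSI cut-off and Cronbach's alpha, can be illustrated with a minimal Python sketch. The function names and the toy item scores are hypothetical and are not taken from the study; the alpha formula is the standard one based on sample variances.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of per-respondent scores."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    sum_item_var = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def has_symptoms(gsi_t_score, cutoff=63):
    """Apply the GSI T-score cut-off stated in the text (T >= 63)."""
    return gsi_t_score >= cutoff


# Hypothetical toy data: 3 items answered by 4 respondents.
toy_items = [[0, 1, 2, 3], [0, 1, 2, 4], [1, 1, 3, 4]]
alpha = cronbach_alpha(toy_items)
```

Items that move together across respondents push alpha toward 1, which is why values such as 0.86 are read as good internal consistency.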
3.2. Statistical Analysis
The whole dataset was split into two parts. The ANN and logistic models were developed on the first part, the training set (n = 50), and validated on the second part, the test set (n = 50).
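A random half-split of this kind could be sketched as follows. This is only an illustration in Python (the study itself used MATLAB and Stata); the function name `split_half` and the seed values are assumptions.

```python
import random


def split_half(n_patients, n_train, seed):
    """Return (train_idx, test_idx): a random partition of range(n_patients)."""
    rng = random.Random(seed)
    indices = list(range(n_patients))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 100 patients, 50 for training and 50 for testing, as in the text.
train_idx, test_idx = split_half(100, 50, seed=0)

# Repeating the split with different seeds gives the 300 repetitions
# described in the modeling subsections (the seed values are arbitrary).
splits = [split_half(100, 50, seed=s) for s in range(300)]
```

Fixing the seed makes each split reproducible, so the same partition can be reused for both models.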
3.3. ANNs Modeling
The most popular ANN, a supervised multilayer perceptron comprising an input layer, a hidden layer, and an output layer, was fitted using MATLAB software. The number of nodes in the input and output layers is determined by the data structure, but finding the optimum number of hidden nodes is a crucial step in designing the network architecture. Increasing the number of hidden nodes enhances the learning capability of the network but can lead to overfitting of the training data: with too many hidden nodes, the ANN may learn the training examples too closely and fail to generalize to new cases. Guarding against this requires two distinct sets: a training set, used to learn the patterns present in the data, and a testing (or validation) set, used to evaluate overfitting. Therefore, the data were randomly divided into a training set (n = 50) and a testing set (n = 50), and this procedure was repeated 300 times. We trained different networks by changing the number of nodes in the hidden layer and compared their performance by the root mean square (RMS) error in the testing set. In total, we fitted 300 ANN models with 7 to 12 hidden nodes, and the network with the minimum RMS was selected. The networks were trained on the training set by the back-propagation algorithm, with a learning rate in the range 0.1 to 0.5 and a momentum value in the range 0.1 to 0.3.
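The selection procedure described above can be sketched in plain Python. This is a simplified illustration, not the MATLAB implementation used in the study: it assumes a single sigmoid output unit instead of two output neurons, online back-propagation with momentum, and synthetic data generated only for the example. The class `TinyMLP` and the helpers `rms_error` and `select_hidden_size` are hypothetical names.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One-hidden-layer perceptron with sigmoid units and a single output."""

    def __init__(self, n_in, n_hidden, rng):
        def init():
            return rng.uniform(-0.5, 0.5)
        # Each weight vector carries a trailing bias weight.
        self.w1 = [[init() for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [init() for _ in range(n_hidden + 1)]
        # Previous updates, kept for the momentum term.
        self.v1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.v2 = [0.0] * (n_hidden + 1)

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * a for w, a in zip(row, xb))) for row in self.w1]
        hb.append(1.0)
        y = sigmoid(sum(w * a for w, a in zip(self.w2, hb)))
        return xb, hb, y

    def train_step(self, x, target, lr, momentum):
        xb, hb, y = self.forward(x)
        # Gradient of the squared error with respect to the output pre-activation.
        delta_out = (y - target) * y * (1.0 - y)
        # Hidden deltas use the output weights before they are updated.
        delta_hid = [delta_out * self.w2[j] * hb[j] * (1.0 - hb[j])
                     for j in range(len(self.w1))]
        for j in range(len(self.w2)):
            self.v2[j] = momentum * self.v2[j] - lr * delta_out * hb[j]
            self.w2[j] += self.v2[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                self.v1[j][i] = momentum * self.v1[j][i] - lr * delta_hid[j] * xb[i]
                row[i] += self.v1[j][i]


def rms_error(model, X, y):
    """Root mean square error between predicted probabilities and 0/1 targets."""
    squared = [(model.forward(x)[2] - t) ** 2 for x, t in zip(X, y)]
    return math.sqrt(sum(squared) / len(squared))


def select_hidden_size(X_tr, y_tr, X_te, y_te, sizes,
                       epochs=30, lr=0.3, momentum=0.2, seed=0):
    """Train one network per hidden-layer size; return the size with minimum test RMS."""
    results = {}
    for n_hidden in sizes:
        model = TinyMLP(len(X_tr[0]), n_hidden, random.Random(seed))
        for _ in range(epochs):
            for x, t in zip(X_tr, y_tr):
                model.train_step(x, t, lr, momentum)
        results[n_hidden] = rms_error(model, X_te, y_te)
    return min(results, key=results.get), results


# Synthetic stand-in data (NOT the study data): 100 cases, 14 inputs.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(14)] for _ in range(100)]
y = [1.0 if x[0] + x[1] > 1.0 else 0.0 for x in X]
best_size, rms_by_size = select_hidden_size(X[:50], y[:50], X[50:], y[50:],
                                            sizes=range(7, 13))
```

The learning rate and momentum defaults are picked from inside the ranges quoted in the text; which hidden-layer size wins on the synthetic data carries no meaning for the study itself.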
3.4. LR Modeling
The data were split in the same way as for the ANN models, and 300 LR models were fitted. Stata software (version 10.0) was used to model the binary response variable. Then, in the testing set, the ROC curve and the Akaike information criterion (AIC) were calculated to choose the best LR model. P values of 0.05 or less were considered significant.
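For readers unfamiliar with the AIC, the following minimal Python sketch shows how it is computed from a binary log-likelihood and used to rank candidate models. The outcomes, fitted probabilities, and parameter counts are hypothetical; the study's own values were produced in Stata.

```python
import math


def bernoulli_log_likelihood(y_true, p_pred):
    """Log-likelihood of binary outcomes under predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(y_true, p_pred))


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical outcomes and fitted probabilities for two candidate models.
y = [1, 0, 1, 1, 0, 0]
p_small = [0.7, 0.4, 0.6, 0.7, 0.3, 0.4]  # e.g. intercept + 1 predictor
p_large = [0.8, 0.3, 0.6, 0.8, 0.3, 0.3]  # e.g. intercept + 3 predictors

aic_small = aic(bernoulli_log_likelihood(y, p_small), n_params=2)
aic_large = aic(bernoulli_log_likelihood(y, p_large), n_params=4)
preferred = "small" if aic_small < aic_large else "large"
```

A lower AIC is preferred: the larger model fits the toy data slightly better, but not by enough to offset the penalty for its extra parameters.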
4. Results
The ANN models were fitted using the training data set. Table 1 shows the architecture, RMS, area under the ROC curve, and percent of inaccurate predictions for six ANN models. The six architectures were (14/7/2), (14/8/2), (14/9/2), (14/10/2), (14/11/2), and (14/12/2): 14 input neurons for the 14 input variables, a hidden (middle) layer of 7 to 12 neurons, and 2 output neurons for the two responses of the output variable (having psychological symptoms: yes, no). The best model was selected on the basis of the minimum RMS, the maximum area under the ROC curve, and the minimum percent of inaccurate predictions. Among the six three-layer structures, the model with nine neurons in the hidden layer, a learning rate of 0.05, a momentum of 0.9, and the back-propagation learning algorithm was selected as the most suitable for prediction; its RMS was 0.1029 and its accuracy rate was 90.65% (Table 1).
Architecture (Input/Middle/Output) | RMS | Area Under ROC Curve | Percent of Inaccurate Prediction |
---|---|---|---|
(14/7/2) | 0.1151 | 0.802 | 11.16 |
(14/8/2) | 0.1072 | 0.831 | 10.10 |
(14/9/2)a | 0.1029 | 0.869 | 9.35 |
(14/10/2) | 0.1054 | 0.853 | 9.78 |
(14/11/2) | 0.1293 | 0.801 | 11.84 |
(14/12/2) | 0.1312 | 0.795 | 11.97 |
Table 1. Choosing the Best Neural Network Model for Observations
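The selection rule can be checked directly against the values in Table 1. The short Python sketch below copies the table rows and confirms that the three criteria point to the same architecture; the variable names are illustrative.

```python
# (hidden nodes, RMS, area under ROC curve, percent inaccurate), from Table 1.
table_1 = [
    (7, 0.1151, 0.802, 11.16),
    (8, 0.1072, 0.831, 10.10),
    (9, 0.1029, 0.869, 9.35),
    (10, 0.1054, 0.853, 9.78),
    (11, 0.1293, 0.801, 11.84),
    (12, 0.1312, 0.795, 11.97),
]

best_by_rms = min(table_1, key=lambda row: row[1])
best_by_auc = max(table_1, key=lambda row: row[2])
best_by_error = min(table_1, key=lambda row: row[3])

# All three criteria select the same architecture.
criteria_agree = best_by_rms == best_by_auc == best_by_error

# Accuracy is the complement of the percent of inaccurate predictions.
accuracy_percent = round(100 - best_by_rms[3], 2)
```

The complement of the 9.35% error for the nine-node network reproduces the 90.65% accuracy rate quoted in the text.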
Next, the LR model was fitted to the data and the appropriate model was selected. The AIC of the best model was 103.05. Stepwise LR was used to select the significant variables. Only education (OR = 4.28), financial status (OR = 2.49), and age (OR = 1.07) were significant and remained in the final model. In the final neural network model, however, history of trauma, drug use, and level of education were the most influential variables (top 5% by normalized effect percent).
Variables in the Logistic Regression Model, in Descending Order of Significance | Variables in the Neural Network Model, in Descending Order of Significance |
---|---|
Level of educationa | History of traumab |
Financial statusa | History of using substanceb |
Agea | Level of educationb |
History of being hospitalized in neurosurgery unit | History of mental illness in the immediate familyb |
Sex | History of using alcohol |
Job | History of using psychological drug |
Marital status | Marital status |
History of mental illness in the immediate family | Sex |
History of trauma | Age |
History of underlying disease | Job |
History of using psychological drug | History of anesthesia |
History of anesthesia | Financial status |
History of using alcohol | History of underlying disease |
History of using drug substance | History of being hospitalized in neurosurgery unit |
Table 2. Ranking of Selected Variables in the LR and ANN Models
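To help read the odds ratios reported for the final LR model, the following Python sketch converts them to coefficients and scales one of them. It assumes that age was entered in years with a linear effect on the log-odds, which the text does not state; the coding of education and financial status is likewise unknown, so only the arithmetic is illustrated.

```python
import math

# Odds ratios reported for the final stepwise LR model.
odds_ratios = {"education": 4.28, "financial_status": 2.49, "age": 1.07}

# Logistic regression coefficients are the natural logs of the odds ratios.
coefficients = {name: math.log(value) for name, value in odds_ratios.items()}


def odds_ratio_for_change(or_per_unit, units):
    """Odds ratio for a change of `units`, assuming a linear effect on the log-odds."""
    return or_per_unit ** units


# Assumption: age is in years, so this is the odds ratio for a 10-year difference.
or_age_10 = odds_ratio_for_change(odds_ratios["age"], 10)
```

Under these assumptions, an odds ratio of 1.07 per year compounds to roughly 1.97 over ten years, so a small per-unit odds ratio can still be clinically relevant.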
One criterion for comparing the models is the area under the ROC curve, for which a value of 0.5 corresponds to random classification and values closer to 1 indicate greater discriminative ability. According to Table 3, the area under the ROC curve in the test set was 0.695 for the logistic regression model and 0.869 for the neural network model.
Index | LR | ANN | P Value |
---|---|---|---|
Area under the ROC curve | 0.695 (0.571 - 0.820)a | 0.869 (0.785 - 0.952) | 0 |
Accuracy rate | 75.96 (75.23 - 76.35) | 90.65 (90.31 - 91.01) | 0 |
Table 3. Results of Comparing 300 Pairs of Logistic Regression (LR) and Artificial Neural Network (ANN) Models
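The area under the ROC curve equals the probability that a randomly chosen patient with symptoms receives a higher predicted score than a randomly chosen patient without symptoms. A minimal Python sketch of this pairwise (Mann-Whitney) formulation follows; the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for six test patients.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
example_auc = auc(labels, scores)
```

In the toy example, 8 of the 9 positive-negative pairs are ordered correctly, giving an area of 8/9; a model that assigns the same score to everyone would score 0.5.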
Figure 1 illustrates the ROC curves for both models. Another index of fit is the classification accuracy rate, the proportion of cases correctly classified in each group; values closer to 1 indicate a better ability of the model to classify cases. The proportions of correct predictions in Table 3 indicate that the neural network model classified patients better than LR: the accuracy rate was 90.65% for the final neural network model and 75.96% for the final LR model.
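The accuracy rate can be computed as in the following Python sketch. The 0.5 classification threshold and the example data are assumptions made for illustration; the text does not state which threshold was used.

```python
def accuracy_rate(y_true, y_pred):
    """Proportion of cases whose predicted class matches the observed class."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def classify(probabilities, threshold=0.5):
    """Turn predicted probabilities into 0/1 classes (the threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical test-set outcomes and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
acc = accuracy_rate(y_true, classify(y_prob))
```

Unlike the area under the ROC curve, the accuracy rate depends on the chosen threshold, so the two indices can rank models differently.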
5. Discussion
Previous studies on statistical modeling of post-traumatic psychological symptoms (15, 17-20) mostly aimed to examine the factors affecting psychological symptoms by using the LR model. In the present study, however, we compared the predictive accuracy of LR and neural network models for the first time. The results show that the ANN performed well compared with its conventional alternative.
ANN is not restricted to a predefined functional form and can capture complicated patterns more flexibly than LR (21, 22). ANN is also capable of finding patterns despite missing data.
ANNs are semi-parametric classifiers, which are more flexible than parametric models. An attractive characteristic of these models is their learning procedure, i.e. learning by example. When little is known about the actual relationship between variables, this property makes them more powerful in pattern recognition. Moreover, since ANN has no constraint on its functional form, it is more flexible and more stable than LR in simulating complicated patterns (23). Another favorable feature of ANNs is their capability to find patterns despite missing data; they are powerful networks that tolerate incomplete, noisy patterns.
There are some problems in the practical application of ANN. First, designing a network is not easy, and knowledge of the underlying theory is necessary. Second, no formal techniques are available for testing the relative relevance of the input variables or for variable selection in non-linear methods (24). Third, unlike the coefficients of conventional models, the calculated weights of the network may have no etiologic interpretation (25), and no explicit mathematical relationship between the target and input variables is obtained. Finally, training an ANN is computationally time consuming and requires sophisticated software.
Despite these problems, there are situations in which conventional models cannot be used and alternatives are needed. Conventional models rest on many assumptions that may not hold in some real applications (26). Violation of these assumptions may lead to errors in hypothesis testing and prediction. Since logistic models use a linear combination of variables, they are not suitable for modeling complex relationships.
In general, ANN may be the best choice when complex dependencies and interactions exist in the dataset. In contrast, when the most important goal of modeling is causal inference and we want to identify the effect of each variable on the response variable, LR is useful.
In conclusion, the neural network is an appropriate approach for predicting psychological symptoms in trauma patients. Its predictions can be used to classify trauma patients, which should be followed by identifying the healthcare resources required according to patients’ needs. In addition, given that almost all predictive models rely on linear and logistic methods, applying the nonlinear relationships discovered by neural networks can help in planning more efficient programs for screening individuals susceptible to psychological symptoms.