Diagnostic reliability and validity play an important role in the progress of science and in the practice of clinical psychology and psychiatry; without them, the identification of risk factors, the interpretation of psychopathology, and the evaluation of treatment efficacy would be erroneous (21, 22). Furthermore, cross-cultural and cross-national evaluations of a universal diagnostic system, such as the DSM, are milestones for further studies, particularly clinical trials, in clinical settings.
The current study reports the initial psychometric properties of the SCID-5-RV categorical diagnoses in the Iranian adult population. Overall, the SCID-5-RV categorical diagnoses demonstrated good psychometric properties (i.e., excellent internal consistency, test-retest reliability, and criterion validity, and acceptable sensitivity, specificity, and construct validity), which are comparable to those of the original SCID-5-RV categorical diagnoses (6, 22).
The reliability analyses showed satisfactory internal consistency, test-retest reliability, and agreement between two distinct examiners. It is worth noting that, in the current study, the intra-class kappa coefficient and test-retest reliability of the diagnostic categories were higher than those in the previous field trial on categorical disorders that used DSM-5 (22), as well as those reported for the DSM-IV in the Iranian population (23). Usually, the test-retest method, which was used in the current study, yields a lower kappa coefficient than the joint or inter-rater method (24). However, the results of the present study contrast with this expectation, since the obtained kappa coefficient was considerably higher than that of studies in which two examiners observed and rated the same interview independently. This may be related to the larger sample size and the higher homogeneity of the interviewers and clinicians who conducted the interviews (in terms of expertise, age, ethnicity, and other variables that may affect the intended diagnosis). One of the strengths of the current study was the participation of clinicians from across the country with various clinical disciplines, years of clinical practice, ethnicities/races, ages, sexes, and other characteristics, which enhanced the generalizability of the findings.
Additionally, major depressive disorder, acute stress disorder, and panic disorder had the lowest kappa coefficients. This is another important finding of the study, which may be related to the smaller sample sizes for these disorders and to their comorbidity with several more severe disorders, including other anxiety or mood disorders, which is common in these diagnostic categories (25-27).
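As an illustration of the agreement statistic discussed above, the following minimal sketch computes Cohen's kappa for two raters. It is not the analysis code of the present study: the function name `cohen_kappa` and the ratings are hypothetical, and the intra-class kappa reported here may have been computed with a different estimator.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same cases."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of cases.")
    n = len(ratings_a)

    # Observed agreement: proportion of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 1 = diagnosis present, 0 = diagnosis absent, for 10 interviewees.
rater_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
rater_2 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
kappa = cohen_kappa(rater_1, rater_2)
```

In this hypothetical example the raters agree on 8 of 10 cases, chance agreement is 0.5, and kappa is therefore approximately 0.6. A rarely assigned diagnosis shifts the marginal proportions and raises chance agreement, which is one reason small samples can depress kappa.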
In addition, the evaluation of the internal consistency of the diagnostic categories is an important advantage of the present study. It yielded an excellent alpha coefficient for all disorders, indicating a high correlation between the items of each diagnosis. The high reliability coefficients show that the items measure the target construct with high consistency. Such results are not unexpected for a diagnostic tool, because the items are designed for a specific purpose, and the higher the reliability value, the more reliable the measure (28). Furthermore, adequate internal consistency was observed even for some disorders whose criteria were changed in DSM-5, including alcohol use disorder, and even for several specifiers, such as MDD with melancholic features. These findings suggest that the changes from DSM-IV-TR to DSM-5 are acceptable, even in Iranian culture.
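The alpha coefficient referred to above can be illustrated with a minimal sketch of Cronbach's alpha. The function name `cronbach_alpha` and the item ratings are hypothetical and are not taken from the study data; the sketch assumes each criterion item is scored numerically for every participant.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item score lists of equal length."""
    k = len(items)
    if k < 2:
        raise ValueError("At least two items are required.")
    if len({len(scores) for scores in items}) != 1:
        raise ValueError("Every item needs a score for every participant.")

    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)

    # Sample variance of each participant's total score across items.
    totals = [sum(per_person) for per_person in zip(*items)]
    total_variance = variance(totals)

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings of three criterion items for five participants.
item_scores = [
    [2, 4, 4, 5, 5],
    [3, 4, 5, 5, 4],
    [2, 3, 4, 4, 5],
]
alpha = cronbach_alpha(item_scores)
```

For these hypothetical ratings alpha is approximately 0.90. The coefficient increases as the items covary more strongly relative to their individual variances, which is why items written to tap a single diagnostic construct tend to produce high values.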
To assess the validity of the Persian version of the SCID-5, we compared the results obtained from the instrument with clinical diagnoses made by consensus among psychiatrists. The results indicated adequate agreement, including acceptable kappa coefficients, high specificity, and fairly good sensitivity and likelihood ratios for nearly all diagnoses. These results are very similar to the findings of studies that used previous versions of the SCID in Iranian samples (23).
Sensitivity and specificity should never be interpreted in isolation when evaluating the clinical utility of a measure (29). Sensitivity and specificity relate directly to the identification of positive and negative cases, respectively. A test with 60% sensitivity correctly identifies 60% of the individuals who have the disorder (true positives) and misses the remaining 40% (false negatives). A test with 60% specificity correctly identifies 60% of the people who are truly not sick (true negatives) but misclassifies the remaining 40% as sick (false positives). The likelihood ratio compares the likelihood of a given test result in people who have the disorder with the likelihood of the same result in people who do not (30). The observed values of these indices showed that the SCID is sensitive for positive diagnoses and can correctly identify people without a disorder. The observed LR values showed that these diagnoses are highly accurate. In addition, there was evidence for the construct validity of the SCID-5-RV diagnostic categories: the clinical and non-clinical populations differed significantly in nearly all subscales of the BSI and MCMI-III, which confirms that the criteria of the SCID-5-RV disorders could accurately discriminate between clinical and non-clinical populations. These results suggest that the variance in the diagnostic criteria of the SCID-5-RV categorical diagnoses reflects the variance in the underlying construct (31).
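The sensitivity, specificity, and likelihood-ratio indices discussed above can be made concrete with a minimal sketch built on a 2x2 classification table. The function name `diagnostic_accuracy` and the counts are hypothetical; they mirror the 60% example in the text and are not results of the present study.

```python
def diagnostic_accuracy(tp, fn, tn, fp):
    """Sensitivity, specificity, and likelihood ratios from 2x2 table counts."""
    sensitivity = tp / (tp + fn)  # proportion of true cases the test detects
    specificity = tn / (tn + fp)  # proportion of non-cases the test clears

    # Positive LR: how much more likely a positive result is in cases than in non-cases.
    lr_positive = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    # Negative LR: how much more likely a negative result is in cases than in non-cases.
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else float("inf")

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts: 100 people with the disorder and 100 without it.
result = diagnostic_accuracy(tp=60, fn=40, tn=60, fp=40)
```

With 60% sensitivity and 60% specificity, the positive likelihood ratio is only about 1.5 and the negative likelihood ratio about 0.67, so a test result shifts the probability of disorder very little. This illustrates why the two indices should be read together rather than in isolation.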
The present study has several practical and clinical implications. First, it was conducted across a variety of clinical settings, including academic and private settings, and therefore captured a diverse clinical population. Second, the interviews were conducted by interviewers from various fields. In addition, the test-retest and multivariate designs are highly effective methods whose application increases the generalizability of the findings. More importantly, the exclusion criteria were minimal, so the clinical participants resembled patients seen in routine clinical settings. Finally, the SCID-5-RV categorical diagnoses are highly reliable and valid when applied to the Iranian population and have good cross-cultural psychometric properties. Thus, these categorical diagnoses can be used in different clinical settings by various interviewers.
Despite these strengths, the current study has some limitations. Most notably, the sample size was not adequate for every disorder, and some statistics could not be calculated for some disorders. Therefore, further studies should cross-validate the findings for a more diverse range of disorders with larger samples. Accordingly, specific pilot studies are recommended for each spectrum.
In the present study, the validity and reliability of the Persian translation of the SCID-5-RV were examined first, and the construct validity of the SCID-5-RV categorical diagnoses was assessed using MANOVA. BSI and MCMI-III subscales were then compared between clinical and non-clinical populations as an innovative and pioneering approach to assessing the construct validity of DSM-5, which can pave the way for further elaboration of categorical diagnoses. Overall, the SCID-5-RV yielded reliable and valid diagnoses for almost all diagnostic categories in clinical settings in Iran. These results contribute to the growing body of evidence that supports the reliability and validity of the SCID-5-RV diagnostic categories and the cross-cultural use of the instrument. The authors suggest that further studies test and cross-validate the diagnostic criteria. Consistent with the aim of the National Institute of Mental Health's Research Domain Criteria project to identify the symptomatic and biological dimensions of psychopathology, future studies are required to investigate the cross-cutting dimensional measures of DSM-5 and to build a foundation for incorporating dimensional diagnoses into categorical diagnoses, which would help to improve case formulation and treatment planning.