In our study, agreement between the faculty radiologists was fair-to-good for all criteria, whereas agreement between residents was poor-to-moderate. Therefore, investigative reporting of breast US by residents is inadvisable. At the academic medical centres under study, US examinations of the breast, as well as of the abdominal organs, are performed by radiology residents and experienced faculty members. The residents operate the ultrasound equipment themselves and then produce a preliminary interpretation of the images they have acquired. The final report is then confirmed by the attending faculty member. Variability in image assessment between radiology residents and skilled faculty members has been studied previously, and these reports (19, 26-28) showed that agreement between radiology residents and faculty members was greater than 90% in the assessment of head and pulmonary angiography computed tomography (CT). However, to date, the discrepancies between faculty radiologists and residents in the assessment of breast US images have not been fully established; this was the motivation for our research. We found that, overall, agreement between the two faculty radiologists was greater than agreement among the residents. Diagnostic performance was also significantly higher for the faculty radiologists, yet there was no significant difference between junior and senior residents.
Previous reports have demonstrated that certain breast US features are reliable for differentiating benign from malignant breast lesions (29-33). However, reader discrepancy in mass assessment by US is responsible for differences in lesion detection and for variation in lesion description and subsequent management. Several studies have reported inter-observer variability in the assessment of breast masses with the BI-RADS US lexicon, fourth edition (7, 11). The fifth edition of BI-RADS for US was published in 2013 (13). However, to the best of our knowledge, observer variability with the fifth edition of the BI-RADS lexicon for US has not been widely studied.
Our results using the BI-RADS fifth edition are similar to those of previous studies using the fourth edition. In the new BI-RADS lexicon for US, the boundary term was removed, yet in our data this seemed to have little effect on determination of the final category. Agreement between the faculty radiologists for the final category was fair-to-moderate in our study. This result was similar to that in studies by Elverici et al. (7), Lee et al. (10), and Berg et al. (8), which used the BI-RADS fourth edition. According to Elverici et al. (7), observer agreement between two experienced radiologists was good for orientation, moderate for shape, echo pattern and posterior feature, and fair for margin and final category. In our study, the data for the faculty members were similar to those of Elverici et al. (7), yet our results for residents were poor. Also, as in other studies (34), agreement in all three subgroups in the current study was lowest for margin and highest for orientation. We assume the reason is that the terms for characterizing the mass margin are numerous and overlap. Our study showed that disagreement among readers when labelling mass margin and echo pattern was greater with a heterogeneous background echo-texture than with a homogeneous background. This is probably because the observers were uncertain when trying to detect and classify abnormalities in heterogeneous tissue with posterior shadowing. This confusion is not trivial, because designating a margin as circumscribed can lead observers to decide that the lesion is benign and thus produce a false negative. In contrast, designating a margin as not circumscribed may contribute to a false positive and an unnecessary biopsy.
Interestingly, following the education session, agreement among the senior residents was one level higher for all criteria except orientation; however, there was no improvement in agreement for the six criteria among the junior residents. Agreement for the final category was one level higher among the senior residents after the education session, yet it was still only fair. These data imply that a single education session was not adequate to improve the agreement level and performance of the residents. Therefore, we suggest that attending radiologists review and confirm preliminary interpretations more carefully. Successful training of residents would appear to require clinical experience through the practice of breast US, continuous consensus reading, and steady feedback on the correlation between US findings and pathology results.
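To make the notion of an agreement "level" concrete, the following is a minimal sketch of how chance-corrected agreement between two readers can be computed and mapped to a qualitative label. It assumes Cohen's kappa for a single pair of readers and a commonly used set of cut-offs (poor, fair, moderate, good, very good); this discussion does not state which statistic or cut-offs were applied, so both are assumptions, and the function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers who rated the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Proportion of cases on which the two readers chose the same descriptor.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's marginal frequencies.
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_level(kappa):
    """Map kappa to a qualitative label (assumed cut-offs, not the study's)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical margin ratings for four masses by two readers.
reader_1 = ["circumscribed", "circumscribed", "not circumscribed", "not circumscribed"]
reader_2 = ["circumscribed", "not circumscribed", "not circumscribed", "not circumscribed"]

kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 2), agreement_level(kappa))  # 0.5 moderate
```

Under these assumed cut-offs, a rise of "one level" corresponds to kappa crossing one band boundary, for example from the poor band into the fair band. A subgroup with more than two readers would need a multi-rater statistic or an average over reader pairs, which this sketch does not cover.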
Our study had several limitations. First, the only benign masses included in this study were those that underwent biopsy. Thus, typical ‘probably benign’ or ‘benign’ lesions on US were not included; this exclusion may have led to lower specificity. In clinical practice, specificity may be higher. Second, readers did not operate the ultrasound equipment themselves and were instead provided with at least two static US images. We think it is very difficult to measure performance discrepancy. Thus, we divided the readers into three subgroups (faculty members, senior residents and junior residents) and analyzed inter-observer agreement and ROC curves for each subgroup. Finally, there was selection bias because only images chosen by an investigator were evaluated. Our results are also limited by the small sample size. Additional studies with a larger series of patients are required.
Our study showed that reader agreement for the sonographic BI-RADS lexicon was higher among faculty radiologists than among residents. In addition, diagnostic accuracy was significantly higher for the faculty members than for the senior and junior residents. Therefore, we recommend continued professional training of residents to improve agreement and performance in breast US.