1. Context
Breast cancer is the most prevalent cancer among females worldwide, and its incidence is increasing in developing countries (1, 2). The most common treatment for women with low-grade breast cancer is breast-conserving surgery (BCS), that is, lumpectomy or partial mastectomy, combined with radiation therapy (RT) (3). However, previous reports indicate that 20 to 25 percent of patients undergoing BCS need secondary surgery to achieve negative resection margins (4, 5).
Techniques such as clinical examination, ultrasonography, mammography, and fine-needle aspiration (FNA) biopsy sometimes fail to provide an accurate diagnosis in a patient with a breast tumor. In such cases, wide local excision with pathological examination is the final solution. The optimal width of surgical margins varies between countries and ranges from 2 mm to 10 mm (4, 6, 7).
To perform a successful BCS, the surgeon needs to identify the tissue distances and margins correctly. In this regard, one of the most effective approaches is the frozen section procedure, which guides the subsequent stages of surgery and spares patients re-operation (8-10). However, it is difficult to obtain a precise diagnosis from a frozen section where expert pathologists are scarce, especially in developing countries (11, 12).
According to reports, using the frozen section procedure during breast surgery reduces the error rate and the need for re-operation (13, 14). Nevertheless, to our knowledge, a large number of surgeons in hospitals in Iran have moved away from this approach, mainly because of the added cost and operating time.
2. Objectives
We aimed to perform a comprehensive systematic review and meta-analysis to provide reliable evidence on the diagnostic value of the frozen section procedure in breast-conserving surgery. We hope our findings will support the use of this technique and help surgeons make more accurate decisions.
3. Data Sources
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was followed for study design, search strategy, screening, and reporting. We performed a systematic search in the PubMed, Embase, Cochrane Library, and Web of Science databases up to May 2019. The search strategy included MeSH descriptors and free keywords as follows: (all available MeSH terms for “Breast-Conserving Surgery”) AND (“frozen section” OR “frozen sections”). Our search was limited to studies published in English but was not restricted by publication date. Only diagnostic studies on humans were included.
4. Study Selection Criteria
Two researchers (A.SH and K.H) selected the studies independently, and disagreements were resolved through discussion with a third party (R.AN). Studies that met the following criteria were included: (1) human diagnostic studies that used the frozen section in breast-conserving surgery; (2) studies that reported the sensitivity and specificity of the frozen section in BCS or contained data from which these parameters could be calculated; and (3) English-language studies. Excluded studies were: (1) conference abstracts, letters, comments, case reports, reviews, animal studies, cross-sectional studies, ecological studies, and in vitro studies; (2) duplicate publications; and (3) studies with insufficient data.
5. Data Extraction & Quality Assessment
Two investigators (A.SH and K.H) independently evaluated the quality of publications and extracted data from the included articles. The supervisor (Gh.G) resolved any disagreements regarding quality assessment. Data were extracted using a checklist containing the following items: author name, publication year, number of patients, mean age, true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) of the frozen section, clinicopathological features, and correlations. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool was used to assess the quality of the included papers.
6. Data Analysis
Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), negative likelihood ratio (NLR), and accuracy, each with its 95% confidence interval, were calculated in MedCalc. We used the STATA v.11 software for data analysis. The I-square (I2) statistic was used to assess heterogeneity. Due to the high heterogeneity, the random-effects model was used for the pooled estimation. Possible publication bias was assessed using Egger’s asymmetry test. P-values less than 0.05 were considered statistically significant.
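As a rough illustration of how the per-study parameters follow from the extracted counts, the sketch below computes the point estimates from TP, TN, FP, and FN using the standard definitions. It is not the MedCalc procedure used in this study and does not compute confidence intervals; the names `Counts` and `diagnostic_metrics` are hypothetical. The example counts are those listed for Noguchi et al. (B) in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """2x2 counts of the frozen section against the reference method."""
    tp: int
    tn: int
    fp: int
    fn: int


def diagnostic_metrics(c: Counts) -> dict:
    """Point estimates only (as proportions); assumes no zero denominators
    other than the FP = 0 case, where the PLR is unbounded."""
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        # PLR = sensitivity / (1 - specificity); infinite when FP = 0
        "plr": sens / (1 - spec) if spec < 1 else float("inf"),
        # NLR = (1 - sensitivity) / specificity
        "nlr": (1 - sens) / spec,
        "accuracy": (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn),
    }


# Counts listed for Noguchi et al. (B) in Table 1: TN 64, TP 23, FN 1, FP 12
m = diagnostic_metrics(Counts(tp=23, tn=64, fp=12, fn=1))
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts the specificity is 64/76, or about 84.21%, and the accuracy is 87/100.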
7. Results
7.1. Study Selection Process
Our primary search yielded 844 studies. After removing duplicates, we screened the remaining 482 papers by title and abstract. Finally, of the 181 papers assessed in full text, 35 articles were included in the meta-analysis. The PRISMA flow diagram for selecting eligible studies is presented in Figure 1.
7.2. Study Characteristics
The selected studies included a total of 10,100 patients with breast cancer aged 25 - 93 years. Twenty studies used paraffin block, 12 studies used permanent sections, and one study used histopathological examination as the reference method; in 2 studies, the reference method was unclear. Characteristics of the studies included in the meta-analysis are displayed in Table 1.
Author | Year | Country | No. Patients | Reference Method | TN | TP | FN | FP |
---|---|---|---|---|---|---|---|---|
Kaufman et al. (15) | 1986 | Israel | 242 | PB | 166 | 71 | 5 | 0 |
Cox et al. (16) | 1991 | USA | 114 | PS | - | - | - | - |
Sauter et al. (17) | 1994 | USA | 107 | PS | 292 | 52 | 6 | 9 |
Bianchi et al. (18) | 1995 | Italy | 672 | PB | 356 | 267 | 24 | 3 |
Noguchi et al. (A) (19) | 1995 | Japan | 87 | PB | 63 | 20 | 4 | 10 |
Noguchi et al. (B) (20) | 1995 | Japan | 95 | PS | 64 | 23 | 1 | 12 |
Ikeda et al. (21) | 1997 | Japan | 56 | PS | 34 | 17 | 1 | 4 |
Sultana and Kayani (12) | 2005 | Pakistan | 319 | PB | 29 | 287 | 1 | 2 |
Cendan et al. (22) | 2005 | USA | 97 | PB | - | - | - | - |
Olson et al. (23) | 2007 | USA | 292 | PB | 1228 | 57 | 21 | 5 |
Cabioglu et al. (24) | 2007 | USA | 264 | PS | - | - | - | - |
Weber et al. (25) | 2008 | Switzerland | 78 | PB | - | - | - | - |
Rusby et al. (26) | 2008 | UK | 115 | PB | 495 | 39 | 8 | 15 |
Bellolio et al. (27) | 2009 | - | 337 | - | - | - | - | - |
Fukamachi et al. (28) | 2010 | Japan | 122 | PS | - | - | - | - |
Jensen et al. (29) | 2010 | USA | 416 | PB | 287 | 79 | 50 | 0 |
Jaka et al. (30) | 2011 | India | 114 | Histopathology | 36 | 75 | 3 | 0 |
Caruso et al. (31) | 2011 | Italy | 52 | PB | 44 | 5 | 1 | 3 |
Barakat et al. (32) | 2012 | Jordan | 440 | PB | 285 | 135 | 26 | 0 |
Sabel et al. (33) | 2012 | USA | 139 | PS | 121 | 26 | 2 | 0 |
Arlicot et al. (34) | 2013 | France | 672 | PB | - | - | - | - |
Francissen et al. (35) | 2013 | Netherlands | 628 | PB | 447 | 101 | 78 | 2 |
Poling et al. (36) | 2014 | USA | 1940 | PB | - | - | - | - |
Banuelos-Andrio et al. (37) | 2014 | Spain | 370 | PB | 326 | 32 | 16 | 0 |
Kikuyama et al. (38) | 2014 | Japan | 220 | PB | - | - | - | - |
Jorns et al. (39) | 2014 | USA | 46 | PS | 28 | 12 | 6 | 0 |
Duarte et al. (40) | 2015 | Brazil | 68 | PB | - | - | - | - |
Abuoglu et al. (41) | 2016 | Turkey | 100 | PB | 61 | 19 | 4 | 1 |
Ahmed and Ahmad (10) | 2016 | Pakistan | 76 | PB | 15 | 59 | 2 | 0 |
Kim et al. (42) | 2016 | South Korea | 25 | PS | 23 | 3 | 2 | 1 |
Jorns and Kidwell (A) (43) | 2016 | USA | 134 | PS | 42 | 8 | 2 | 0 |
Jorns and Kidwell (B) (43) | 2016 | USA | 116 | PS | 64 | 10 | 5 | 0 |
Du et al. (44) | 2017 | China | 976 | - | - | - | - | - |
Ko et al. (45) | 2017 | South Korea | 509 | PS | 338 | 120 | 24 | 1 |
Mahadevappa et al. (46) | 2017 | India | 62 | PB | 28 | 33 | 0 | 1 |
Table 1. Characteristics of Studies Entered into the Meta-analysis (PB, paraffin block; PS, permanent section)
7.3. Quality Assessment & Publication Bias
According to the QUADAS-2 quality assessment, all 35 papers met the eligibility score and were included in the meta-analysis. Egger’s test showed significant publication bias for sensitivity (P < 0.001), specificity (P < 0.001), PPV (P < 0.001), NPV (P = 0.010), PLR (P = 0.023), NLR (P = 0.005), and accuracy (P = 0.001).
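For readers unfamiliar with Egger’s asymmetry test, the following sketch shows the regression it is based on: each study’s standardized effect (effect divided by its standard error) is regressed on its precision (the inverse of the standard error), and an intercept far from zero suggests funnel-plot asymmetry. This is a minimal illustration, not the STATA routine used here; the function name and all input values are hypothetical, and the p-value (from a t distribution with n - 2 degrees of freedom) is omitted to keep the example dependency-free.

```python
import math


def egger_intercept(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns the intercept and its standard error.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are needed")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical data: smaller studies (larger SE) report larger effects
effects = [0.80, 0.85, 0.90, 0.95]
std_errors = [0.05, 0.10, 0.20, 0.40]
intercept, se_intercept = egger_intercept(effects, std_errors)
print(f"Egger intercept = {intercept:.3f} (SE {se_intercept:.3f})")
```

In this made-up example the intercept is positive because the less precise studies report larger effects; when every study reports the same effect, the intercept is zero.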
7.4. Main Outcomes
7.4.1. Sensitivity
The meta-analysis showed a high sensitivity for the frozen section in BCS (Sensitivity: 83.47, 95%CI 79.61 - 87.32). Significant heterogeneity was observed (I2 = 95.1%, P < 0.001, Table 2).
Diagnostic Parameter | Number of Studies | I Square | P-Value | Effect Size (95% Confidence Interval) |
---|---|---|---|---|
Sensitivity | 35 | 95.1 | 0.00 | 83.47 (79.61 – 87.32) |
Specificity | 35 | 62.8 | 0.00 | 99.29 (98.89 – 99.68) |
Positive predictive value | 22 | 88.4 | 0.00 | 93.26 (91.25 – 95.27) |
Negative predictive value | 33 | 95.1 | 0.00 | 92.17 (90.22 – 94.11) |
Positive likelihood ratio | 22 | 38.5 | 0.03 | 7.99 (6.01 – 9.96) |
Negative likelihood ratio | 33 | 95.0 | 0.00 | 0.18 (0.14 – 0.23) |
Accuracy | 35 | 90.2 | 0.00 | 93.77 (92.45 – 95.10) |
Table 2. Pooled Estimates of Diagnostic Parameters
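The pooled estimates and I2 values in Table 2 come from a random-effects model. The sketch below shows one common way such a pooling can be done, assuming the DerSimonian-Laird estimator of between-study variance; the source does not state which estimator STATA was configured to use, so this is an assumption. The function name and the input values are hypothetical and are not the study data.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled estimate, 95% CI tuple, I2 in percent, tau squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]           # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau squared to each within-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2, tau2


# Hypothetical sensitivities (percent) and their variances from three studies
pooled, ci, i2, tau2 = random_effects_pool([90.0, 80.0, 70.0], [4.0, 4.0, 4.0])
print(f"pooled = {pooled:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), I2 = {i2:.1f}%")
```

When I2 is high, tau squared is large relative to the within-study variances, so the random-effects weights become nearly equal and the confidence interval widens compared with a fixed-effect analysis.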
7.4.2. Specificity
The meta-analysis revealed a very high specificity for the frozen section in BCS (Specificity: 99.29, 95%CI 98.89 - 99.68). The heterogeneity was substantial (I2 = 62.8%, P < 0.001, Table 2).
7.4.3. PPV & NPV
Regarding diagnostic test performance, the meta-analysis indicated a PPV of 93.26 (95%CI 91.25 - 95.27) and an NPV of 92.17 (95%CI 90.22 - 94.11) for the frozen section in BCS. The heterogeneity was significant for both PPV and NPV (I2 = 88.4%, P < 0.001, and I2 = 95.1%, P < 0.001, respectively). The PPV was 100 in 13 studies and the NPV was 100 in 2 studies; these studies did not enter the respective meta-analyses because their CIs could not be calculated, so the pooled PPV and NPV obtained here are underestimated (Table 2).
7.4.4. PLR & NLR
The meta-analysis showed a PLR of 7.99 (95%CI 6.01 - 9.96) and an NLR of 0.18 (95%CI 0.14 - 0.23) for the frozen section. Low heterogeneity was observed for PLR and high heterogeneity for NLR (I2 = 38.5%, P = 0.03, and I2 = 95.0%, P < 0.001, respectively) (Table 2).
7.4.5. Accuracy
We found an accuracy of 93.77 (95%CI 92.45 - 95.10) for this procedure. Significant heterogeneity was observed (I2 = 90.2%, P < 0.001) (Table 2).
8. Discussion
In recent years, BCS has been recognized as the standard surgical procedure in patients with early-stage breast cancer. One of the complications of BCS is the risk of local recurrence, one of the leading causes of which is microscopic involvement of the lumpectomy margins (47). According to different studies, the probability of re-operation due to microscopic involvement of margins varies from 24% to 40% (23, 28, 45). The probability of residual tumor in the re-excised specimen varies between 32% and 65% (38). Several parameters affect the probability of residual tumor at the margins, such as ductal carcinoma in situ (DCIS), especially an extensive intraductal component, patient age, type of tumor pathology (e.g., invasive lobular carcinoma), pathologic tumor size (e.g., pT3), breast density, and lymphovascular invasion (48-51). Younger patients are more likely to have marginal involvement. Patients with invasive lobular carcinoma are at increased risk for marginal involvement and recurrence because this type of breast cancer is usually multifocal and multicentric (52-54).
Pre-operative imaging examinations to check tumor size, including mammography, ultrasound, magnetic resonance imaging, and computed tomography, are not performed frequently. Due to the limitations of pre-operative imaging and low-quality sonography, it may be difficult to estimate the tumor’s extent. All of these factors increase the probability of marginal involvement and the need for re-operation. Therefore, intraoperative examination of the margins is needed to reduce both (55).
Since there are several techniques for examining the surgical margins in breast cancer treatment, it is crucial to choose the method that offers the most diagnostic value in the shortest time while remaining cost-effective. At present, methods such as gross examination, imprint cytology, frozen section analysis, near-infrared fluorescence, micro-computed tomography, margin probe diffraction system, high-frequency ultrasound, and cavity shave margin are used. Each has its strengths and weaknesses, and so far, no specific method has been accepted as an international gold standard (56, 57).
Given the controversies over the diagnostic value of frozen sections across studies, we carried out a meta-analysis to combine the available data on the subject and to calculate the test’s accuracy. The frozen section method’s sensitivity varied from 43.58% (35) to 100% (16, 46) among the reviewed studies. Its pooled sensitivity for detecting tumor-involved margins during breast cancer surgery was estimated to be 83.47% in our meta-analysis. Also, while specificity ranged from 84.21% to 100% in different studies, pooling the data resulted in a very high specificity (99.29%) for the frozen section procedure. Our findings regarding the frozen section’s sensitivity and specificity in BCS were in agreement with the systematic reviews of Esbona et al. (14) and St John et al. (58). Also, many cohort studies and national databases have examined the diagnostic value of this method (59-61).
In addition, regarding the proportion of actual patients among cases that tested positive (62), the meta-analysis indicated a PPV of 93.26 (95%CI 91.25 - 95.27), which, like the sensitivity, is very high and suggests acceptable performance of the frozen section method compared to the reference method. Conversely, the NPV was also high (NPV = 92.17; 95%CI 90.22 - 94.11), demonstrating the frozen section’s satisfactory performance in identifying truly healthy individuals among cases that tested negative in comparison to the reference method.
Overall, regarding the method’s capacity to classify true positive and true negative cases among all cases (63, 64), findings showed an accuracy of 93.77 (95%CI 92.45 - 95.10) for the frozen section method, which is substantially high and suggests acceptable performance in this regard.
The likelihood ratio is the probability of a given test result in cases with the condition divided by the probability of the same result in cases without the condition (65). A PLR above 10 and an NLR below 0.1 are reported to offer strong evidence for diagnosis (66). Herein, the meta-analysis showed a PLR of 7.99 (95%CI 6.01 - 9.96) and an NLR of 0.18 (95%CI 0.14 - 0.23) for the frozen section. These values fall short of those thresholds but still indicate a meaningful relationship with the presence and absence of the condition, respectively.
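To make the likelihood ratios more concrete, the sketch below converts a pre-test probability into a post-test probability using the standard odds form of Bayes’ rule (post-test odds = pre-test odds × likelihood ratio). The pooled PLR and NLR are taken from Table 2; the 25% pre-test probability of an involved margin is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PLR = 7.99       # pooled positive likelihood ratio (Table 2)
NLR = 0.18       # pooled negative likelihood ratio (Table 2)
PRE_TEST = 0.25  # hypothetical pre-test probability of an involved margin

after_positive = post_test_probability(PRE_TEST, PLR)
after_negative = post_test_probability(PRE_TEST, NLR)
print(f"after a positive frozen section: {after_positive:.3f}")
print(f"after a negative frozen section: {after_negative:.3f}")
```

Under this assumed pre-test probability, a positive result raises the probability of an involved margin to roughly 73%, and a negative result lowers it to roughly 6%.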
Nevertheless, this method is not widely used in the U.S. for several reasons, including false-positive frozen section results in cases such as ductal hyperplasia (e.g., mistaken for DCIS) and in lesions such as microglandular adenosis, sclerosing adenosis, radial scar, intracystic papilloma, and fat necrosis. Frozen section analysis has also shown a potential for overestimation that leads to unnecessary resection and even mastectomy. There is also the possibility of false-negative results and underestimation in lesions such as tubular carcinoma, invasive lobular carcinoma, DCIS, and lesions with morphological changes after chemotherapy (9, 67, 68). Therefore, in addition to the necessary frozen section analysis techniques, a skilled and experienced pathologist is essential.
The next problem is the prolongation of surgery time, which in most studies ranged between 20 and 50 minutes (69-72). According to a reported meta-analysis, the cost-effectiveness of this method depends on the rate of positive margins and the need for re-operation without this method. The results showed that it was cost-effective when positive margins were more than 25%, and the probability of re-operation was less than 15% (39, 73). A study examining frozen section analysis reported that this method is cost-effective, with a cost saving of $400 to $600 per patient with breast cancer (33).
Regarding other methods, imprint cytology is one of the approaches for the rapid intraoperative evaluation of benign and malignant tissues; it is used to evaluate tissue margins in procedures such as sentinel lymph node surgery (74), breast mass surgery, and parathyroid surgery (75). Shortcomings of this method include the inability to analyze deep infiltrations (76) and to distinguish progressed tumors from dense stromal fibrosis (77). Also, based on available reports, the sensitivity and specificity of intraoperative ultrasound are 59% and 81%, respectively (58), which makes it considerably weaker at detecting involved tissue margins than the frozen section and cytology.
Radiographic evaluation, in turn, did not significantly improve the re-operation rate (78). However, it might help decide whether the margins of calcified lesions are adequate (79). A meta-analysis comparing radiographic and pathological methods of evaluating tissue margins showed that the radiographic method was less efficient than the pathological methods (80). Another method of examining tissue margins during surgery is radiofrequency spectroscopy, which has also been approved for use in the United States. A randomized trial indicated that this method significantly reduced the re-operation rate compared to the control group (81). In general, based on the available evidence, margin probe tools are more effective in detecting positive margins (81, 82).
Finally, since different diagnostic values are reported for different approaches and there is no international gold standard yet, the most appropriate method for evaluating surgical margins should be chosen carefully, considering factors such as cost, availability of experts to perform the process, and the time required to obtain test results. However, given the high sensitivity of the frozen section method for evaluating lumpectomy margins in breast cancer, it is a good choice for low-income countries because of its cost efficiency (73).
8.1. Conclusions and Limitations
Due to the inclusion criteria, only English-language studies that assessed the diagnostic value of the intraoperative frozen section for evaluating lumpectomy margins in breast cancer surgery were included. We recommend that future reviews use more inclusive criteria to cover original studies in other languages.
Our systematic review and meta-analysis showed that intraoperative frozen section analysis has high sensitivity and specificity for evaluating lumpectomy margins in patients with early-stage breast cancer and can significantly reduce the need for re-operation. Avoiding re-operation also spares the patient its costs and reduces the patient’s anxiety. Based on this study, patients who have a lower risk of positive lumpectomy margins benefit less from this method, so they may be excluded. However, in patients who are more likely to have a positive margin based on pre-operative examinations, such as those with young age, dense breasts, DCIS, invasive lobular carcinoma, microcalcification, or lymphovascular invasion, use of this margin assessment method can significantly reduce re-operation and subsequently reduce the risk of recurrence.