1. Background
While the scientific literature is rife with reports of high burnout amongst health care professionals and in particular GPs (1, 2), there is surprisingly little consensus on the conceptual definition of ‘burnout’, the exact components of it and hence its measurement. However, we know that burnout is a multifaceted phenomenon marked by a gradual depletion of a person’s physical, emotional and cognitive resources in response to his or her work.
While there is considerable debate over the dimensions of burnout, most researchers agree that emotional exhaustion appears to be at the heart of burnout (3-5) or that it represents the only component of burnout (6, 7) and that it is related to a host of occupational and personal outcome measures such as intention to quit, medical errors, mental quality of life, and self-rated health (2, 8-11). Indeed, the ICD-10 defines burnout as a “state of vital exhaustion” (12).
The Maslach Burnout Inventory (MBI) (13), consisting of 3 subscales measuring emotional exhaustion (EE), depersonalisation (D) and personal accomplishment (PA), has traditionally been the most commonly used tool for the measurement of burnout, with its widespread use being self-perpetuated by researchers’ need to compare their findings with those of others. However, the usefulness of the MBI as a screening tool is limited by its length and by the less than intuitive underpinnings of its development. A critique by Kristensen et al. (2005) (7) highlights the problem of having the singular construct ‘burnout’ measured by three distinct scales whose scores cannot be combined to yield a single burnout score. Additionally, this multi-dimensional measurement of burnout raises the important issue of interpretation; e.g., is a person scoring low on emotional exhaustion and low on depersonalisation actually burnt out, or is high emotional exhaustion a prerequisite for a “burnout diagnosis”? Such inherent problems similarly exist with West et al.’s (10, 14) proposed use of 2 key items, one from the EE subscale (“How often do you feel burned out from your work?”) and one from the D subscale (“How often do you feel you’ve become more callous towards people since you took this job?”), as a brief measure of burnout.
It appears from the literature that using a person’s own definition of burnout may be a valid way of assessing burnout (15, 16). Pick and Leiter (17) also found in their qualitative study that nurses’ self-definition of burnout was strongly related to emotional exhaustion but not to depersonalisation or personal accomplishment, suggesting that these may not be salient aspects of the lived experience of burnout. Indeed, some researchers (e.g. 7) have argued that depersonalisation may represent a coping strategy applied in situations of burnout, and personal accomplishment a consequence of burnout, rather than parts of the construct itself. This idea is consistent with Leiter and Maslach’s (18) own model as well as that proposed by Lee and Ashforth (19), in which EE is conceptualised as the first burnout dimension to develop, with D and PA developing as a direct result of EE. Further testing of these models carried out by Taris et al. (20) confirmed that EE triggers D, and that D in consequence affects PA.
Previous validation research has assessed single-item burnout measures amongst physicians on a nine-point rating scale. The study found that physicians assess their global burnout in terms of emotional exhaustion (21). Previous validations of a single-item global measure of burnout initially developed for the physician work life study (22) have been carried out against the MBI-EE only and have yielded promising results (15, 16, 23), but the measure has not previously been assessed in terms of its association with relevant outcome measures. This measure asked respondents to rate their current level of burnout by endorsing one of five statements describing the gradual development of (self-defined) burnout. However, the restricted number of response categories may limit its usefulness in situations where detection of minor changes in burnout, or of changes over shorter time frames, is required. Therefore, allowing respondents to define and label their experience of burnout on an 11-point verbal numeric rating scale may prove to be a quick and sensitive measure with applicability in human resource or screening contexts, as well as research.
A single zero-to-ten global burnout item is brief and easily administered, which may help to increase survey completion rates. By not applying an artificial cut-off point, it allows an assessment (and timely intervention) of the gradual development of burnout.
2. Objectives
This paper reports on the validation of such an item against a comprehensive emotional exhaustion measure (MBI-EE), its sensitivity, specificity and positive and negative predictive value, as well as association with a range of outcome measures (early retirement intentions, psychological distress, and self-rated general health).
3. Patients and Methods
3.1. Subjects and Recruitment
Participants in this study were rural GPs who were members of the Northern Rivers General Practice Network (NRGPN) and practicing in the Northern Rivers region of NSW, Australia. Potential participants received a study package from NRGPN containing a covering letter, a participant information sheet, and the anonymous survey. All 165 eligible participants received two reminders, 2 and 4 weeks after the initial invitation. Data collection took place between October 2011 and February 2012. The study protocol conforms to the ethical guidelines of the 1975 Declaration of Helsinki as reflected in a priori approval by The University of Sydney Human Research Ethics Committee (approval number: 14112).
3.2. Instruments
The survey included the following measures:
3.2.1. Demographic and Work Factors
Age, gender, retirement intentions (planned age of retirement from direct patient care in general practice).
3.2.2. Professional Burnout
Burnout was assessed by two different measures:
3.2.2.1. Maslach Burnout Inventory (Human Services Survey) (13) 9-Item Emotional Exhaustion subscale
The MBI is a well-validated scale with sound psychometric properties and is considered by many to be a benchmark measure of burnout, with the EE subscale often used as a stand-alone measure. Each item is rated on a seven-point Likert scale. MBI-EE sub-scale burnout score categories are: low (< 18), medium (19 - 26) and high (≥ 27) as recommended by the scale developers.
3.2.2.2. The Single Item Burnout measure (SIB)
SIB was developed by the authors for this study, asking respondents to rate their current level of burnout on a scale from zero to ten (“not at all burnt out” to “extremely burnt out”).
3.2.2.3. Psychological Distress
The six-item Kessler Psychological Distress Scale (K-6) (24) was included as a brief measure of non-specific psychological distress.
3.2.2.4. General Health
The global health question from the SF-36 was included to measure self-rated general health (“In general, would you say your health is;” rated on a five-point Likert scale ranging from “poor” to “excellent”). This item has consistently been found to possess strong psychometric properties compared to validated multi-item measures (25) and to be a good predictor of mortality and health care utilization (26, 27).
3.3. Statistical Analyses
Analyses were conducted using SAS 9.3. Mean scores and prevalence of burnout on the MBI-EE and the SIB were calculated. An ANOVA analysis compared SIB scores to MBI-EE sub-scale score categories of low (< 18), medium (19 - 26) and high (≥ 27) with the data displaying normally distributed residuals and homogeneity of variance. A Pearson correlation coefficient was also calculated as a measure of association.
A Bland-Altman analysis was used to assess the level of agreement between the two methods to compare the shorter single item technique to the established emotional exhaustion subscale of the MBI. A range of agreement was defined as mean bias ± 2 SD. A scatterplot of the average of the two scales against the difference between the two scales for each participant was used. The EE scale was transformed to be the same as the SIB scale (0 to 10).
The plot was used as a visual check that the magnitudes of the differences were constant throughout the range of measurement, with the expectation that approximately 5% of the points would lie outside the limit lines if the differences were normally distributed (28).
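The Bland-Altman steps described above can be sketched as follows. This is a minimal illustration rather than the SAS code used in the study. It assumes that the MBI-EE sum score ranges from 0 to 54 (9 items scored 0 to 6), that the rescaling to 0 to 10 is a simple linear one, and that differences are taken as rescaled EE minus SIB; none of these details are stated in the text. The function names and the example scores are hypothetical.

```python
from statistics import mean, stdev

# Assumption: 9 items, each scored 0-6, so the EE sum score runs from 0 to 54.
EE_MAX = 54


def rescale_ee(ee_score, ee_max=EE_MAX):
    """Linearly map an MBI-EE sum score onto the 0-10 range of the SIB."""
    return 10 * ee_score / ee_max


def bland_altman(sib_scores, ee_scores):
    """Return mean bias, SD of differences and the mean bias +/- 2 SD limits.

    Differences are computed as rescaled EE minus SIB (an assumed direction).
    The per-participant averages and differences are the x and y values
    of the Bland-Altman scatterplot.
    """
    ee_scaled = [rescale_ee(e) for e in ee_scores]
    diffs = [e - s for s, e in zip(sib_scores, ee_scaled)]
    averages = [(s + e) / 2 for s, e in zip(sib_scores, ee_scaled)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "sd": sd,
        "lower": bias - 2 * sd,
        "upper": bias + 2 * sd,
        "averages": averages,
        "differences": diffs,
    }


# Hypothetical scores for five respondents (not study data).
example = bland_altman([0, 2, 3, 5, 8], [3, 12, 20, 30, 45])
print(f"bias={example['bias']:.2f}, "
      f"limits=({example['lower']:.2f}, {example['upper']:.2f})")
```

Plotting `averages` against `differences`, with horizontal lines at `lower`, `bias` and `upper`, gives the visual check described in the text.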
Raw SIB scores were examined for their association with a number of outcome measures. Early retirement was treated as a nominal variable and defined as a planned retirement age before 65 years. Psychological distress (K-6) was categorized as low, moderate, high or very high by applying cut-off values recommended by the scale developers (24), while self-rated general health was similarly treated as an ordinal variable, retaining its five response categories.
Sensitivity, specificity, positive and negative predictive value, and prevalence and bias adjusted weighted Kappa coefficients were calculated using the MBI-EE as the standard.
4. Results
A total of 92 GPs completed the survey, representing a response rate of 56%, with all completing both the index and reference test. This constitutes a high response rate in this population given the relatively sensitive nature of the survey and the well-known difficulty in recruiting time-poor GPs (29); recent studies of Australian GPs have reported response rates of between 12% and 59% (30-32). The mean age was 51.3 years (SD = 10.7 years; 95% CI: 48.75 to 53.84). Sixty percent (95% CI: 50 to 70) were male, which is slightly lower than in the general rural GP population, which is 71% male (33). The level of burnout of the GPs in this study as measured on the MBI-EE (Mean = 18.9, SD = 13.5; 95% CI: 15.67 to 22.11) was slightly lower than published norms (Mean = 22.19, SD = 9.53; 95% CI: 21.55 to 22.83) from 1104 physicians and nurses in the USA (13). A quarter of our sample (26%) was identified as having high levels of burnout.
The mean score on the SIB was 3.1 (SD = 2.5). An ANOVA showed that mean SIB scores increased with increasing level of burnout as per the MBI-EE burnout categories: mean SIB scores (SD) in the low, medium and high burnout categories on the MBI-EE were 1.6 (1.7), 3.5 (1.7), and 6.0 (2.0) respectively (P < 0.0001). The Pearson correlation coefficient was r = 0.8 (P < 0.0001).
The Bland-Altman analysis indicated that the 95% limits of agreement between the two methods ranged from -2.78 to 3.73. The mean bias was 0.48 (SD = 1.62), and this mean difference differed significantly from zero (P = 0.0069). A visual check demonstrated that the magnitudes of the differences were reasonably constant throughout the range of measurement. The differences were approximately normally distributed, and as expected about 5% of the points lay outside the limit lines.
Construct validity was demonstrated by examining the SIB for its association with a number of salient outcome measures (Table 1) and showed high positive associations with early retirement intentions and psychological distress, and a high negative association with self-rated general health.
Outcome | No. | SIB Mean | SIB SD | Test Statistic | P value |
---|---|---|---|---|---|
Early retirement intentions | | | | t (90) = 2.68 | 0.0089 |
Yes | 52 | 2.5 | 2.4 | | |
No | 40 | 3.9 | 2.5 | | |
Psychological distress | | | | F (3, 88) = 16.23 | < 0.0001 |
Low | 64 | 2.2 | 2.0 | | |
Medium | 20 | 4.5 | 2.5 | | |
High | 6 | 6.5 | 1.2 | | |
Very high | 2 | 8.0 | 0.0 | | |
General health | | | | F (4, 85) = 8.83 | < 0.0001 |
Poor | 3 | 6.3 | 2.9 | | |
Fair | 17 | 4.5 | 2.3 | | |
Good | 24 | 4.0 | 2.9 | | |
Very good | 26 | 1.9 | 1.8 | | |
Excellent | 20 | 1.6 | 1.3 | | |

Table 1. Associations of the Single Item Burnout Measure (SIB) with Retirement Intentions, Psychological Distress and General Health Outcomes
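As a rough check on how a test statistic in Table 1 relates to the group summaries, the sketch below recomputes the early retirement comparison from the rounded means, SDs and group sizes. It assumes a pooled-variance (Student) two-sample t-test, which is inferred from the reported 90 degrees of freedom (52 + 40 - 2) and not stated in the text. The function name is hypothetical.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Rounded values from the early retirement rows of Table 1 ("No" vs "Yes").
t_stat, df = pooled_t(40, 3.9, 2.5, 52, 2.5, 2.4)
print(f"t({df}) = {t_stat:.2f}")
```

Because the inputs are rounded to one decimal place, the result is expected to be close to, but not identical with, the reported t (90) = 2.68.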
Characteristics of the cut-off values on the SIB, applied to the 24 GPs who displayed high burnout on the MBI-EE (score ≥ 27), are reported in Table 2.
SIB cut-off score | MBI-EE (high) burnout prevalence, % | 95% CI | SIB burnout prevalence, % | 95% CI | Observed agreement, % | Prevalence and bias adjusted kappa (K) | 95% CI | Sensitivity, % | 95% CI | Specificity, % | 95% CI | Positive predictive value, % | Negative predictive value, % |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MBI-EE ≥ 27 (reference) | 26 | (17 to 35) | | | | | | | | | | | |
≥ 0 | NA | NA | 100 | (100 to 100) | 26 | -0.48 | (-0.68 to -0.27) | 100 | (100 to 100) | NA | NA | NA | NA |
≥ 1 | NA | NA | 85 | (78 to 92) | 41 | -0.17 | (-0.38 to -0.03) | 100 | (100 to 100) | 21 | (11 to 30) | 31 | 100 |
≥ 2 | NA | NA | 64 | (54 to 74) | 62 | 0.24 | (0.03 to 0.44) | 100 | (100 to 100) | 49 | (37 to 60) | 41 | 100 |
≥ 3 | NA | NA | 48 | (38 to 58) | 78 | 0.56 | (0.36 to 0.80) | 100 | (100 to 100) | 71 | (60 to 81) | 55 | 100 |
≥ 4 | NA | NA | 38 | (28 to 48) | 79 | 0.59 | (0.38 to 0.79) | 83 | (68 to 98) | 78 | (68 to 88) | 57 | 93 |
≥ 5 | NA | NA | 30 | (11 to 27) | 85 | 0.70 | (0.49 to 0.90) | 79 | (63 to 95) | 87 | (79 to 95) | 68 | 92 |
≥ 6 | NA | NA | 19 | (7 to 12) | 82 | 0.63 | (0.43 to 0.83) | 50 | (30 to 70) | 93 | (86 to 99) | 71 | 84 |
≥ 7 | NA | NA | 12 | (5 to 19) | 84 | 0.67 | (0.47 to 0.88) | 42 | (21 to 61) | 99 | (96 to 100) | 91 | 83 |
≥ 8 | NA | NA | 9 | (3 to 15) | 83 | 0.65 | (0.45 to 0.86) | 33 | (15 to 52) | 100 | (100 to 100) | 100 | 81 |
≥ 9 | NA | NA | 2 | (0 to 5) | 92 | 0.52 | (0.31 to 0.73) | 2 | (0 to 5) | 100 | (100 to 100) | 100 | 76 |
≥ 10 | NA | NA | 0 | NA | 74 | 0.47 | (0.27 to 0.68) | NA | NA | 100 | (100 to 100) | NA | 74 |

Table 2. Prevalence of Burnout According to the Single Item Burnout Measure (SIB) and MBI-EE, Prevalence and Bias Adjusted Weighted Kappa Coefficient, Sensitivity, Specificity, and Positive and Negative Predictive Value of the SIB Compared to Those With a High Burnout Score on the MBI-EE (≥ 27) Amongst 92 GPs
The proportion of observed agreement between high MBI-EE and the various SIB cut-off scores was highest between scores of 3 and 9, ranging from 78% to 92%. Kappa showed good agreement at a cut-off score of 5.
Sensitivity declined and specificity increased with increasing SIB cut-off scores. Similarly, positive predictive values increased with higher SIB scores, whereas negative predictive values decreased.
Generally, the trade-off between sensitivity and specificity was most favourable at a cut-off score of 5 or more, yielding a sensitivity of 79%, a specificity of 87%, a positive predictive value of 68% and a negative predictive value of 92%. That is, when using a SIB cut-off score of 5 or more, 79% of GPs with high burnout according to the MBI-EE were correctly identified as burnt out, and 87% of GPs without high burnout on the MBI-EE were correctly identified as not burnt out.
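To make these quantities concrete, the sketch below derives them from a 2 x 2 classification table. The cell counts are not reported in the paper; they are back-calculated here from the published percentages for the cut-off of 5 or more (24 GPs with high MBI-EE scores out of 92) and should be read as an illustrative assumption. For a binary classification, the prevalence and bias adjusted kappa reduces to twice the observed agreement minus one. The function name is hypothetical.

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Screening accuracy measures from the four cells of a 2 x 2 table.

    tp: high on MBI-EE and at/above the SIB cut-off
    fn: high on MBI-EE but below the SIB cut-off
    fp: not high on MBI-EE but at/above the SIB cut-off
    tn: not high on MBI-EE and below the SIB cut-off
    """
    n = tp + fn + fp + tn
    agreement = (tp + tn) / n
    return {
        "sib_prevalence": (tp + fp) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "agreement": agreement,
        # Prevalence and bias adjusted kappa for two categories.
        "pabak": 2 * agreement - 1,
    }


# Assumed counts, reconstructed from the reported percentages (not source data).
cutoff_5 = diagnostic_summary(tp=19, fn=5, fp=9, tn=59)
for name, value in cutoff_5.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts, the rounded results match the values reported for the cut-off of 5 or more in Table 2.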
5. Discussion
Given the high prevalence of burnout in health care providers and its significant associations with important variables relating to mental and physical health and intention to leave, it is imperative that a brief, sensitive screening measure is available which allows for early identification of burnout. The results of the current study indicate that the SIB has significant potential to fill this gap due to its brevity, ease of administration and sound psychometric properties. Firstly, scores on the SIB were found to be highly associated with several personal outcome variables previously documented to be related to burnout (self-rated general health, psychological distress, and early retirement intentions), thus lending evidence to its construct validity. Upon examination of its performance against the reference standard (Maslach Burnout Inventory Emotional Exhaustion subscale) (13), concurrent validity of the SIB was evidenced by findings of a high positive association between SIB and MBI-EE scores. This association was higher than those reported by Hansen and Girgis (15) and Rohland et al. (16), and almost identical to that found by Dolan et al. (23) in their validation study amongst a large sample of primary care staff. The results from the Bland-Altman analysis similarly indicated that only a minimal bias of nearly half a point exists between the SIB and the EE, with EE scores being higher. A difference of this magnitude has minimal clinical implications in a screening context and suggests that the SIB and the EE provide reasonably similar measures across the scale. The sensitivity and specificity analyses confirmed the SIB’s ability to correctly identify highly burnt out participants with a high degree of accuracy, when using the full MBI-EE subscale as the standard.
Some potential limitations to the current study should be noted. Firstly, a multi-faceted assessment of burnout incorporating objective measures of burnout (e.g. third-party assessment, absenteeism) in addition to the MBI-EE would be a stronger standard against which to test the utility of the SIB. Secondly, no inferences about the test-retest reliability of the SIB can be made on the basis of the current study, as it involved assessment at only one time point. Thirdly, the small sample size needs to be acknowledged.
However, the promising results from this study lend support to the utility of the Single Item Burnout Measure, with potential applicability in both human resource (including organizational scans) and research contexts, and call for further testing of this tool in a larger sample across other health care settings.