This study provides a new tool to assess addiction severity and treatment response in opioid-dependent patients on MMT. In a sample of 117 MMT patients, the GREAT showed strong test-retest reliability (0.95) and modest internal consistency within subdomains (substance use: 0.33, health risk: 0.55, health: 0.54, and personal/social functioning: 0.32). The tool also performed well in the criterion validation, where GREAT substance use domain scores modestly predicted continued opioid abuse among MMT patients.
The G-Study is the central component of the tool’s reliability analysis. It allowed us to determine whether we can draw accurate inferences about MMT patients’ treatment response across test administrations, subscales, and items from their individual scores on the GREAT. The reliability coefficients from our G-Study suggest that we can generalize from an individual’s scores across test administrations with a high degree of certainty: test-retest reliability of the GREAT was excellent (consistently 0.94 or higher). The G-Study also showed modest internal consistency within the separate subdomains (Table 2), with G-coefficients ranging from 0.32 to 0.55. The Personal and Social Functioning subscale had the poorest internal consistency (G-coefficient: 0.32). This subscale may have suffered from pooling very different questions together: it included items assessing criminal activity, employment, conflict in relationships, and family history of addiction or mental illness. While these items are important to understanding personal/social functioning, participants’ answers indicate that this collection of items may be measuring attributes other than personal and social functioning.
The results of the G-Study suggest the existence of subdomains. In a sort of “quasi-confirmatory factor analysis,” we showed that the overall GREAT has poor internal consistency across subdomains (G-coefficient: 0.20), meaning we are limited in our ability to generalize from one subscale to another. This finding supports our selection of the health, personal/social functioning, substance use, and health-risk behavior domains, because it suggests that the domains measure distinct attributes. If internal consistency across subscales were high (for example, a G-coefficient of 0.85), we would be able to generalize well across subdomains; in other words, the domains would not be measuring very different aspects of addiction severity, and there would be no need for multiple domains.
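To make the interpretation of these coefficients concrete, the following is a minimal Python sketch of how a relative G-coefficient can be obtained for the simplest design, persons crossed with test administrations. It is not the analysis code used in this study, which also involved subscale and item facets. The function names and the example scores are hypothetical, and variance components are estimated from the expected mean squares of a two-way ANOVA without replication.

```python
from statistics import mean


def variance_components(scores):
    """Estimate variance components for a crossed persons x occasions design.

    `scores[p][o]` is the score of person p at administration o.
    Assumes one observation per cell and no missing cells.
    """
    n_p, n_o = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    person_means = [mean(row) for row in scores]
    occasion_means = [mean(row[o] for row in scores) for o in range(n_o)]

    # Sums of squares for persons, occasions, and the residual
    # (person-by-occasion interaction confounded with error).
    ss_person = n_o * sum((m - grand) ** 2 for m in person_means)
    ss_occasion = n_p * sum((m - grand) ** 2 for m in occasion_means)
    ss_residual = sum(
        (scores[p][o] - person_means[p] - occasion_means[o] + grand) ** 2
        for p in range(n_p)
        for o in range(n_o)
    )

    ms_person = ss_person / (n_p - 1)
    ms_occasion = ss_occasion / (n_o - 1)
    ms_residual = ss_residual / ((n_p - 1) * (n_o - 1))

    # Negative estimates are truncated at zero, a common convention.
    return {
        "person": max((ms_person - ms_residual) / n_o, 0.0),
        "occasion": max((ms_occasion - ms_residual) / n_p, 0.0),
        "residual": ms_residual,
    }


def relative_g_coefficient(components, n_occasions=1):
    """Relative G-coefficient when averaging over `n_occasions` administrations."""
    error = components["residual"] / n_occasions
    total = components["person"] + error
    return components["person"] / total if total > 0 else float("nan")


# Hypothetical scores for four people tested twice (not study data).
example_scores = [[12, 13], [25, 24], [18, 18], [30, 29]]
example_g = relative_g_coefficient(variance_components(example_scores))
```

A coefficient near 1 means that most of the observed variance is attributable to stable differences between people, which is the sense in which the test-retest results above are read as excellent. A coefficient near 0.20 means that scores on one condition of the facet say little about scores on another.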
The predictive criterion validation suggests that participants’ scores on the GREAT substance use domain are highly correlated with their three-month history of opioid and poly-substance use. This supports the reliability of self-reported drug use among patients treated with methadone. It also suggests that the GREAT substance use domain could serve as a proxy measure for substance abuse in time-sensitive situations or when urine toxicology reports are expensive or inaccessible at the point of care. We chose not to evaluate the global GREAT score against urine toxicology screening because the additional domains (health-risk behavior, health, and personal/social functioning) were not intended to be predictive of opioid abuse. We maintain that the purpose of this tool is to provide a complete picture of how participants are functioning across different life spheres by including domains that measure high-risk behavior (e.g., sharing drug consumption paraphernalia) as well as a patient’s social functioning.
In comparison with the original MAP, the GREAT has a shorter completion time and higher test-retest reliability. Overall, the MAP was found to have a high response rate, with only 23 item non-responses across 16 participants and an average completion time of 11.7 minutes (SD = 3.8) (12). Three days after initial testing, the MAP was readministered to patients; test-retest reliability across all substances was high (0.88 for clients reporting use), although the ICC varied among different substance user groups (12). MAP test-retest reliability across interviewer groups was also high (0.84 on average) (12). The internal reliability of the anxiety and depression scales was good (alpha = 0.88 and 0.86, respectively), while the internal reliability of the health scale was satisfactory (alpha = 0.79) (12). The depression and health scales were not retained in the modified GREAT, which prevents us from commenting on these sections. Consistent with the primary field investigation undertaken by the original authors to assess the reliability and validity of the MAP, other studies confirm that the internal and test-retest reliabilities of the MAP are satisfactory and that the instrument is adequate for health service evaluation and other appropriate research purposes (29). In contrast with the MAP, the GREAT reviews patients’ responses over a 3-month time frame, allowing a broader scope to capture treatment response. Because of the chronic remitting-relapsing nature of opioid dependence, it was necessary to evaluate patients’ behavior over a time frame longer than 30 days. In comparison with the fragmented scoring structure of the MAP, the GREAT also provides a unified global score, an important feature that will improve its utility as a measurement scale for research purposes.
This study is limited by the small sample of participants available for the G-Study: only 21 participants underwent multiple test administrations to confirm the reliability properties of the tool. A larger sample would have improved our confidence in the estimates generated from the G-Study.
To be included in the regression analysis, participants required complete data for all variables selected for the regression model. Participants missing data for even a single variable were dropped from the regression analysis. While 117 patients were recruited for inclusion in the study, we were only able to perform analyses on 107 of these patients because 10 patients lacked data on variables selected for inclusion in the model. Systematic bias posed by missing data is a potential limitation of the majority of studies. Within this study, less than 10% of data were missing, which is within the commonly accepted threshold below which studies are unlikely to be significantly confounded (30).
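The complete-case rule described above can be illustrated with a short Python sketch. The variable names and records below are hypothetical placeholders rather than the variables in our model; only the counts (117 recruited, 10 with missing data) come from this study. Ten of 117 participants is about 8.5%.

```python
def complete_cases(records, variables):
    """Keep only the records with a non-missing value for every model variable."""
    return [
        record
        for record in records
        if all(record.get(name) is not None for name in variables)
    ]


def fraction_dropped(records, variables):
    """Share of participants excluded by the complete-case rule."""
    kept = complete_cases(records, variables)
    return (len(records) - len(kept)) / len(records)


# Hypothetical variable names, used only for illustration.
model_variables = ["substance_use_score", "age", "methadone_dose"]

# Build 117 placeholder records, 10 of which lack one model variable.
participants = [
    {"substance_use_score": 5, "age": 40, "methadone_dose": 80}
    for _ in range(107)
] + [
    {"substance_use_score": 5, "age": None, "methadone_dose": 80}
    for _ in range(10)
]

analysed = complete_cases(participants, model_variables)
dropped_share = fraction_dropped(participants, model_variables)
```

Under this rule a participant is excluded even when only one of the selected variables is missing, which is why adding variables to a model can shrink the analysed sample.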
In addition, the sample used to create this tool may be an important factor limiting the results of this study. We generated a new tool using GENOA participants’ responses to the MAP, and as such the generalizability of this new tool depends on that of the GENOA sample. For instance, the use of amphetamines was not reported among the GENOA sample. Since the preference for and impact of amphetamines is known to be greater in certain areas of the U.S. (31), the GREAT may not perform well in capturing concurrent substance use among such populations. Nevertheless, the tool can be modified to include additional illicit substances not currently included (e.g., amphetamines, LSD).
5.1. Conclusions
Assessment of addiction severity and MMT response is a pertinent topic for clinicians and researchers. A modified tool that assesses a patient’s progression through methadone treatment in conjunction with their addiction severity can serve to identify patients at high risk of relapse and to look beyond the results of a urine test by including the relevant physical and psychosocial domains affecting patients with opioid use disorders. A future direction of this study is to develop a risk score from the GREAT that quantifies the risk of relapse. The GREAT will serve as a useful adjunct to regular opioid testing, allowing physicians to comprehensively assess the functioning of patients across different domains.