Journal of Consulting and Clinical Psychology
Manuscript version of
Funded by:
• National Institutes of Health
© 2021, American Psychological Association. This manuscript is not the copy of record and may not exactly
replicate the final, authoritative version of the article. Please do not copy or cite without authors’ permission.
The final version of record is available via its DOI: https://dx.doi.org/10.1037/ccp0000552
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
CONTINGENCY MANAGEMENT META-ANALYSIS 1
Abstract
Objective: Contingency management (CM) is often criticized for limited long-term impact. This
meta-analysis focused on objective indices of drug use (i.e., urine toxicology) to examine the
that reported outcomes up to one year after the incentive delivery had ended. Using random
effects models, odds ratios (OR) were calculated for the likelihood of abstinence. Meta-
The overall likelihood of abstinence at the long-term follow-up among participants who received
therapies or protocol-based specific therapies) was OR = 1.22, 95% CI [1.01, 1.44], with low to
moderate heterogeneity (I² = 36.68). Among 18 moderators, longer length of active treatment
benefit in reducing objective indices of drug use, above and beyond other active, evidence-based
treatment. These data suggest that policymakers and insurers should support and cover costs for
CM, which has hundreds of studies demonstrating its short-term efficacy, and now additional
treatment using objective indices of drug use. Contingency management was found to be more
efficacious than either standard care or other evidence based approaches up to one year following
Many patients with substance use disorders successfully reduce or cease drug use while
involved in treatment programs (Davis et al., 2016; Dutra et al., 2008; Irvin et al., 1999; Lee et
al., 2015; Lundahl et al., 2010; Magill et al., 2019; Magill & Ray, 2009; Roozen et al., 2004;
Sayegh et al., 2017). However, relapse rates are high (McLellan et al., 2000), and substance use
disorders are widely conceptualized as chronic relapsing conditions (Arria & McLellan, 2012;
McLellan et al., 2000). As such, treatment effects often diminish following the conclusion of
most active treatments (Benishek et al., 2014; Magill et al., 2009), and efforts to identify
treatments with lasting impact are essential for improving overall substance use outcomes.
change (e.g., submission of drug negative urine samples). CM offers immediate positive
consequences for choosing not to use substances that compete with the positive aspects of drug
use, providing a bridge to the more substantial, but often very delayed benefits of recovery (i.e.,
employment, improved relationships; Petry, 2012). CM has the largest effect size of any
psychosocial treatment for reducing drug use during treatment (g = 0.54; Dutra et al., 2008), yet
remains one of the least likely evidence-based substance use disorder treatments to be offered in
clinical settings (Benishek, Kirby, Dugosh, & Padovano, 2010; Herbeck, Hser, & Teruya, 2008;
Despite strong support for the efficacy of CM during the active treatment phase, many
professionals in the substance use treatment field question the durability of CM’s effect once
reinforcers are discontinued (Petry et al., 2017; Rash et al., 2012, 2014). Some meta-analyses
support this concern, finding that effect sizes decrease when reinforcers are discontinued
(Benishek et al., 2014, k = 19, 42% overlap [i.e., 42% of the studies in this meta-analysis are also
included in the current review]; Prendergast et al., 2006, k = 47, 4% overlap with current review;
Sayegh et al., 2017, k = 35 CM studies, 31% overlap with current review). However, a recent
systematic review reported nearly one-third of CM studies (across all drug and reinforcer types)
evidenced significant reductions in drug use after cessation of reinforcers (Davis et al., 2016, k =
including CM, may be heavily impacted by variation in outcome measures. No single metric for
substance use treatment outcomes exists (Carroll et al., 2014; Donovan et al., 2012; Korte et al.,
2011; Tiffany et al., 2012). Objective indices (i.e., biological measures based on urine toxicology
screens) remove the risk of bias inherent in self-report (Del Boca & Noll, 2000; Hjorthøj et al.,
2012; Magura & Kang, 1996; Schuler et al., 2009). However, most reviews of psychosocial
treatments for substance use disorders focus on self-reported drug use (Dutra et al., 2008;
Prendergast et al., 2002) or include objective biologically verified outcomes mixed in with
studies that only provide self-report (e.g., self-reported frequency of use) and/or randomly
biologically verified self-report outcomes (e.g., participants are told they could be drug tested to
Past meta-analytic reviews have provided strong evidence of CM’s efficacy during
treatment and immediately post-treatment (Ainscough et al., 2017, k = 22, 9% overlap with
current review; Dutra et al., 2008, k = 14 CM studies, 21% overlap with current review; Griffith
et al., 2000, k = 30, 3% overlap with current review). Some meta-analyses have reported on
longer term outcomes (Benishek et al., 2014; Prendergast et al., 2006; Sayegh et al., 2017), but
these past reviews were limited because they either did not report biological outcomes
(Prendergast et al., 2006) or they combined biologically verified outcomes with self-report
and/or randomly biologically verified self-report outcomes (Benishek et al., 2014; Sayegh et al.,
2017). Further, many reviews of CM focus on its efficacy when applied to only a single drug of
abuse (e.g., cocaine, Farronato et al., 2013, k = 8, 38% overlap with current review; nicotine,
Notley et al., 2019, k = 33, 0% overlap with current review) or do not comprehensively consider
samples recruited from outpatient settings as well as medication assisted treatment clinics
(Benishek et al., 2014). Additionally, no reviews that included long-term outcomes examined
how critical CM parameters found to influence efficacy during the active treatment phase (i.e.,
frequency, immediacy, escalation, fading, magnitude of reinforcement, and form of CM; e.g.,
prize-based CM, Benishek et al., 2014 or voucher-based CM, Higgins et al., 2019) moderated
The current systematic review and meta-analysis evaluated the efficacy of CM up to one
year after the reinforcer delivery ended using objectively verified substance use treatment
outcomes (i.e., urine drug screens). Only studies assessing outcomes following removal of the
incentives were included in the present analyses. The overall aim of the study was to determine
the relative efficacy of CM after reinforcers were discontinued compared to the long-term impact
of other psychosocial treatment approaches in reducing substance use. The secondary aim was to
understand critical moderators of the long-term efficacy of CM. This aim allows for an
integrated evaluation of several variables only examined in isolation or collapsed, and possibly
illicit drug of abuse). This study seeks to directly address prior criticism of CM’s lack of durable
efficacy and includes comprehensive evaluation of factors that may enhance or detract from that
efficacy.
Method
Studies were included if they: 1) evaluated the efficacy of CM for treating illicit
stimulant, opioid, or polydrug use; 2) randomly assigned participants to two or more conditions;
3) enrolled at least 25 participants per condition (as recommended by Chambless & Hollon,
1998); 4) had at least one long-term follow-up (i.e., outcomes measured after the protocol
treatment had ended and the initial response to treatment had been determined); 5) presented
substance use outcomes based on urine toxicology; and 6) were published in English.
Studies were excluded if: 1) 25% or more of enrolled participants were under 18 years of
age; or 2) they presented secondary data from another trial included in the meta-analysis. When
two studies presented data from the same trial, we chose the study with the largest sample size.
Studies in which participants were recruited from a setting that provided medication assisted
treatment (e.g., methadone, buprenorphine) were included if they evaluated CM’s efficacy for
treating illicit substance use in that sample (e.g., CM to increase abstinence from stimulants in
persons also receiving methadone for an opioid use disorder) but were excluded if the CM
Search Strategy
Studies published in any year through July 2020 were reviewed based on PRISMA
guidelines (Moher et al., 2009). Figure 1 provides a diagram of the implementation of these
guidelines. Searches were performed in PubMed and PsycInfo using the following combination
"relapse prevention" OR "twelve step" OR "12 step" OR "twelve step facilitation" OR "12 step
the search terms (k = 17); 2) CM review articles identified in the Cochrane Database of
Systematic Reviews (k = 5), and 3) reference lists of studies that met our inclusion/exclusion
Screening Abstracts
Titles and abstracts captured by the search strategy were screened in a multistep process.
One author imported all identified records into reference management software (Mendeley
Desktop, 2017). Confirmed duplicates were deleted, and the remaining records were screened
individually. Clearly non-relevant records (e.g., relapse prevention for a physical health
condition, smoking studies) were removed, as were records for which abstracts indicated the
paper would not meet inclusion criteria (e.g., small sample, non-random assignment). Articles
for which the abstract appeared to meet inclusion, or for which eligibility could not be
determined, were moved to full-text review. One author independently conducted full-text
review to determine inclusion of studies. Two additional authors completed a second full-text
review on a randomly selected sample of one-third of the 439 excluded articles as well as 100%
of studies classified by the first reviewer as meeting inclusion but not exclusion criteria.
Inter-rater agreement overall for inclusion and exclusion of articles was 97%, κ = 0.88. Discrepancies were
resolved through consensus, and when needed, discussion with a third reviewer.
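To illustrate how a percent agreement and a chance-corrected agreement coefficient of the kind reported above can be computed, the following is a minimal sketch. It assumes Cohen's kappa for two raters making include/exclude decisions; the source does not state which kappa variant was used, and the decisions shown are hypothetical, not data from this review.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement: product of the two raters' marginal proportions per label.
    p_chance = sum((counts_a[label] / n) * (counts_b[label] / n) for label in labels)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical full-text decisions for 10 articles (illustration only).
first_reviewer = ["exclude"] * 7 + ["include"] * 3
second_reviewer = ["exclude"] * 6 + ["include"] * 4

agreement = percent_agreement(first_reviewer, second_reviewer)
kappa = cohens_kappa(first_reviewer, second_reviewer)
```

Because most screened articles are excluded, raw agreement can be high by chance alone, which is why a chance-corrected coefficient is usually reported alongside it.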
Data Extraction
The primary outcome extracted from each study was urine toxicology results at the
longest available follow-up, which were either the percentage of negative urine toxicology
results (i.e., the number of negative urine samples provided as a ratio to the total number of
urine samples provided over the duration of the follow-up period with samples collected at
several time points) or the point prevalence rates of abstinence (i.e., whether a urine sample was
negative for a target drug at the single time point of the follow-up assessment). If both outcomes
were available, percentage of negative urine toxicology was prioritized for extraction, because
this outcome is more predictive of long-term outcomes (Preston et al., 1998; Stitzer et al., 2009).
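The two outcome definitions above differ mainly in what is counted. A minimal sketch follows, assuming each urine result is coded as True (negative), False (positive), or None (missing); the function name, the coding scheme, and all values are hypothetical. The optional `missing_as_positive` flag implements the conservative convention of keeping missing samples in the denominator and scoring them as positive.

```python
def proportion_negative(results, missing_as_positive=False):
    """Proportion of urine results that are drug negative.

    Each entry is True (negative), False (positive), or None (missing).
    With missing_as_positive=True, missing results stay in the denominator
    and count as positive; otherwise they are dropped.
    """
    if missing_as_positive:
        scored = [r is True for r in results]
    else:
        scored = [r for r in results if r is not None]
    if not scored:
        raise ValueError("no scorable urine results")
    return sum(scored) / len(scored)


# Percentage of negative samples: samples scheduled at several time points
# across the follow-up period (hypothetical values).
scheduled_samples = [True, True, False, None, True, None]
pct_negative = 100 * proportion_negative(scheduled_samples)
pct_negative_conservative = 100 * proportion_negative(
    scheduled_samples, missing_as_positive=True
)

# Point prevalence: one sample per participant at a single follow-up
# assessment (hypothetical values for five participants).
follow_up_results = [True, False, None, True, False]
point_prevalence = proportion_negative(follow_up_results, missing_as_positive=True)
```

In this illustration the same data yield 75% negative samples when missing results are dropped and 50% when they are scored as positive, which shows how strongly the handling of missing samples can shift the outcome.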
If a study had multiple follow-up periods, results from the latest follow-up were reported. Results
from intent-to-treat samples were extracted whenever they were available. If the study did not
use intent-to-treat analyses, coders adjusted urine toxicology results with missing data assumed
positive (for k = 6 studies, 7 samples). If data were missing such that calculations using intent-to-
treat analyses were not possible, authors were contacted directly with a 100% response rate (k =
2; Chudzynski et al., 2015; Petry et al., 2015). Intent-to-treat calculations and author
correspondence allowed for inclusion of all studies identified by our search in the calculations of
and CM treatment variables. The study variables were publication year, comparison condition
therapy; see next paragraph for further information about comparison condition categories),
targeted drug (stimulants only, opioids only, or polysubstance use), whether study recruitment
was conducted in a medication assisted treatment clinic (yes/no), outcome type (percentage of
negative urine samples or point prevalence of abstinence), and when long-term drug use
outcomes were evaluated since discontinuation of CM, measured in weeks. Participant variables
were demographic characteristics such as mean age, the percentage identifying as female, and
the percentage identifying as White. The CM treatment variables were: 1) escalation, i.e.,
participants could earn rewards of escalating value for consecutive negative urine toxicology
samples (yes/no), 2) fading, i.e., a design feature where reinforcers are reduced or become more
variable over time (faded versus not faded), 3) frequency (the number of times reinforcement
was earned per week), 4) immediacy (immediate reinforcement delivery versus delayed), 5)
maximum reinforcer magnitude available in average number of dollars available per participant,
6) CM delivery method (prize versus voucher), and 7) the duration of the CM protocol measured
in weeks.
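One way to keep such coded variables consistent across coders is a typed record per treatment contrast. The sketch below is an assumption about how a coding form could be represented, not the form used in this review; the class name, field names, category labels, and example values are all hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Optional

TARGETED_DRUGS = {"stimulants", "opioids", "polysubstance"}
OUTCOME_TYPES = {"percent_negative", "point_prevalence"}
DELIVERY_METHODS = {"prize", "voucher"}


@dataclass(frozen=True)
class CodedContrast:
    """One CM-versus-comparison contrast as coded by reviewers (hypothetical schema)."""

    study_label: str
    publication_year: int
    targeted_drug: str                  # one of TARGETED_DRUGS
    recruited_from_mat_clinic: bool     # medication assisted treatment clinic (yes/no)
    outcome_type: str                   # one of OUTCOME_TYPES
    weeks_since_cm_ended: float         # timing of the long-term follow-up
    delivery_method: str                # one of DELIVERY_METHODS
    cm_duration_weeks: float
    # Moderators that a study may not report are optional.
    mean_age: Optional[float] = None
    percent_female: Optional[float] = None
    percent_white: Optional[float] = None
    escalation: Optional[bool] = None
    fading: Optional[bool] = None
    reinforcements_per_week: Optional[float] = None
    immediate_reinforcement: Optional[bool] = None
    max_magnitude_dollars: Optional[float] = None

    def __post_init__(self):
        # Reject category labels outside the coding scheme.
        if self.targeted_drug not in TARGETED_DRUGS:
            raise ValueError(f"unknown targeted drug: {self.targeted_drug}")
        if self.outcome_type not in OUTCOME_TYPES:
            raise ValueError(f"unknown outcome type: {self.outcome_type}")
        if self.delivery_method not in DELIVERY_METHODS:
            raise ValueError(f"unknown delivery method: {self.delivery_method}")


# Hypothetical record (illustration only, not a study from Table 1).
example = CodedContrast(
    study_label="Example Trial A",
    publication_year=2010,
    targeted_drug="stimulants",
    recruited_from_mat_clinic=True,
    outcome_type="point_prevalence",
    weeks_since_cm_ended=24,
    delivery_method="prize",
    cm_duration_weeks=12,
    escalation=True,
)
record = asdict(example)
```

Validating category labels at construction time means that disagreements between coders surface as explicit errors rather than as silently inconsistent moderator groups.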
“treatment as usual” or “standard care” where participants had assistance from providers related
to substance use treatment, mental health care, and other psychosocial needs but it was on an
included structured programs of care with more frequent contacts (e.g., intensive outpatient
therapies when the studies employed a specific recognized and/or manualized efficacious
The Cochrane Risk of Bias tool was used to assess for possible bias in the included
studies (Higgins & Green, 2011). The following four criteria were assessed: random assignment,
allocation concealment, masking of outcome assessors, and incomplete outcome data. Selective
outcome reporting was not rated because most psychosocial treatment trials are still not
prospectively registered (Bradley et al., 2017). Each risk of bias criterion was designated as high,
low, or unclear risk of bias. We computed overall quality for each study; studies with three or
more indicators of low risk of bias were designated high quality, and studies with two or fewer
For each study included in the final review, at least two authors extracted data using a
standardized form. A third author independently extracted data from a randomly selected third of
studies to ensure 3-way reliability. Inter-rater agreement for data extraction was 96%, and
differences were resolved through consensus or review by a third coder when necessary.
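The overall quality designation described in the risk of bias assessment above reduces to a simple counting rule over the four rated criteria. A minimal sketch follows; the criterion keys and rating labels are hypothetical, and the "low quality" label for studies below the threshold is an assumption inferred from how quality is reported in the Results.

```python
CRITERIA = (
    "random_assignment",
    "allocation_concealment",
    "assessor_masking",
    "incomplete_outcome_data",
)
VALID_RATINGS = {"low", "high", "unclear"}


def study_quality(ratings):
    """Classify a study from its four risk of bias ratings.

    `ratings` maps each criterion in CRITERIA to "low", "high", or "unclear"
    risk of bias. Three or more low-risk ratings yield "high quality".
    """
    for criterion in CRITERIA:
        if ratings.get(criterion) not in VALID_RATINGS:
            raise ValueError(f"missing or invalid rating for {criterion}")
    low_risk_count = sum(ratings[criterion] == "low" for criterion in CRITERIA)
    return "high quality" if low_risk_count >= 3 else "low quality"


# Hypothetical ratings for one study (illustration only).
example_ratings = {
    "random_assignment": "low",
    "allocation_concealment": "unclear",
    "assessor_masking": "low",
    "incomplete_outcome_data": "low",
}
example_quality = study_quality(example_ratings)
```

Note that an "unclear" rating counts against a study in the same way as a "high" rating under this rule.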
Analytic Plan
Descriptive data are provided for each study individually, with each study grouped by
whether or not participants were receiving medication assisted treatment (Table 1).
ORs, which represented the likelihood of a participant who received CM achieving abstinence
over a participant who received a comparison treatment, were calculated for each treatment
comparison group across the 23 studies with respect to urine toxicology outcomes. If studies
contained multiple CM groups compared to one control comparison, then the OR effect sizes
were combined and averaged into one OR effect size. Effect sizes were combined by aggregating
binary data of positive and negative urine toxicology outcomes from the treatment groups and
comparing the aggregated data to the positive and negative urine toxicology outcomes of the
comparison condition (Borenstein et al., 2009). A one-study removed analysis was conducted to
Calculations used natural log transformations of the OR with inverse variance weighting
to account for differences in sample sizes across studies. Results were then converted back to
ORs via inverse natural log transformations (Bland & Altman, 2000; Lipsey & Wilson, 2001).
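The effect size computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and all cell counts are hypothetical, the variance uses the standard large-sample formula for a log OR, and no continuity correction is applied for zero cells.

```python
import math


def log_odds_ratio(cm_neg, cm_pos, comp_neg, comp_pos):
    """Return (log OR, variance) for abstinence in CM versus comparison.

    Arguments are counts of negative and positive urine outcomes per group.
    """
    cells = (cm_neg, cm_pos, comp_neg, comp_pos)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive (no continuity correction)")
    log_or = math.log((cm_neg * comp_pos) / (cm_pos * comp_neg))
    variance = sum(1.0 / c for c in cells)  # inverse of this is the study weight
    return log_or, variance


def pool_cm_arms(arms):
    """Sum (negative, positive) counts across CM arms sharing one comparison group."""
    return sum(neg for neg, _ in arms), sum(pos for _, pos in arms)


def odds_ratio_with_ci(log_or, variance, z=1.96):
    """Back-transform a log OR and its 95% CI limits to the OR scale."""
    se = math.sqrt(variance)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical trial with two CM arms and one comparison arm.
cm_neg, cm_pos = pool_cm_arms([(20, 30), (25, 25)])
log_or, variance = log_odds_ratio(cm_neg, cm_pos, comp_neg=15, comp_pos=35)
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(log_or, variance)
```

Working on the log scale makes the sampling distribution approximately symmetric, so the confidence limits are computed there and only then converted back to ORs.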
Studies were expected to be at least moderately heterogeneous given the differing lengths of
follow-up periods and different target drugs, with variability not solely due to sampling error. As
such, overall ORs were calculated using random effects models (Neyeloff et al., 2012).
Cochran’s Q statistic estimated heterogeneity in effect sizes, and the inconsistency index (I²) was
reported as an estimate of variance due to heterogeneity (Higgins et al., 2003; Neyeloff et al.,
2012).
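A random effects pooled OR with Cochran's Q and I² can be sketched as below. The source does not name the between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption, as are the function names and the example inputs. A one-study-removed helper is included because it reuses the same pooling step.

```python
import math


def random_effects(log_ors, variances, z=1.96):
    """Pool log ORs with inverse variance weights and a DerSimonian-Laird tau^2."""
    k = len(log_ors)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = k - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }


def leave_one_out(log_ors, variances):
    """Pooled OR after removing each study in turn (inputs must be lists)."""
    return [
        random_effects(log_ors[:i] + log_ors[i + 1:], variances[:i] + variances[i + 1:])["or"]
        for i in range(len(log_ors))
    ]


# Hypothetical log ORs and variances for four studies (illustration only).
example_log_ors = [0.10, 0.45, -0.05, 0.30]
example_variances = [0.04, 0.09, 0.06, 0.05]
summary = random_effects(example_log_ors, example_variances)
one_removed = leave_one_out(example_log_ors, example_variances)
```

A narrow range of pooled ORs across the one-study-removed results suggests that no single study drives the overall estimate.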
Multiple tests were conducted to test for possible publication bias, including the
examination of a funnel plot, the Egger regression test for asymmetry (Egger et al., 1997), and
publication bias. The Egger test uses linear regression to assess the relation between error and
study effect sizes. A fail-safe N determines the number of non-significant studies not identified
1979).
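Two of the publication bias checks named above lend themselves to a short sketch. The version below assumes the classic Egger formulation (an unweighted regression of each standardized effect on its precision, with the intercept indexing asymmetry) and Rosenthal's fail-safe N with a one-tailed criterion; the function names and inputs are hypothetical, and the exact variants used in this review are not stated in the text.

```python
import math
from statistics import NormalDist


def egger_intercept(effects, standard_errors):
    """Regress effect/SE on 1/SE by ordinary least squares.

    Returns (intercept, standard error of the intercept). An intercept far
    from zero, judged against a t distribution with k - 2 degrees of freedom,
    suggests funnel plot asymmetry.
    """
    k = len(effects)
    if k < 3 or k != len(standard_errors):
        raise ValueError("need at least three studies with matching standard errors")
    precision = [1.0 / se for se in standard_errors]
    standardized = [e / se for e, se in zip(effects, standard_errors)]
    mean_x = sum(precision) / k
    mean_y = sum(standardized) / k
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must differ across studies")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, standardized))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (y - intercept - slope * x) ** 2 for x, y in zip(precision, standardized)
    )
    se_intercept = math.sqrt(residual_ss / (k - 2) * (1.0 / k + mean_x ** 2 / sxx))
    return intercept, se_intercept


def rosenthal_fail_safe_n(z_scores, alpha=0.05):
    """Number of null studies needed to make the combined result nonsignificant."""
    k = len(z_scores)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    return max(0.0, sum(z_scores) ** 2 / z_crit ** 2 - k)


# Hypothetical log ORs and standard errors: smaller studies show larger effects.
example_effects = [0.1, 0.3, 0.5, 0.9]
example_ses = [0.1, 0.2, 0.25, 0.5]
intercept, se_intercept = egger_intercept(example_effects, example_ses)
fail_safe = rosenthal_fail_safe_n([e / se for e, se in zip(example_effects, example_ses)])
```

In the hypothetical data, effects grow as precision falls, so the intercept is positive, which is the pattern the Egger test is designed to flag.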
follow up. Meta-regressions were conducted to test for potential moderators with long-term
outcomes where the log OR effect sizes were regressed onto continuous variables detailed in the
data extraction section. Subgroup analyses tested for differences in long-term outcomes based on
the categorical variables detailed in the data extraction section. Subgroup analyses were
conducted using a mixed effects model, and significance was tested with the fixed effects model.
Variables were only tested as moderators if there were at least 10 studies reporting sufficient
statistical information (Higgins & Green, 2011). One-study removed analyses were conducted to
gauge the impact of each study moderator on the findings from meta-regressions and subgroup
Results
The search yielded 5,510 records, which led to the identification of 23 independent
studies (see Figure 1 for the complete details of the search). Table 1 displays the 23 studies with
24 CM treatment-to-comparison treatment contrasts at long-term follow-up along with the design
Across the 23 studies, a total of 3,320 participants were allocated to study conditions. All
studies were conducted in the United States of America. Publication dates ranged from 1997 to
2015. Sample sizes across the 23 studies ranged from 52 to 388 (M = 138.3, SD = 74.2, median =
118.0). The mean age of participants was 39.1 (SD = 5.8, median = 37.6). Approximately 42.1%
(SD = 15.7, median = 44.1) of the sample identified as female, and 45.2% (SD = 20.5, median =
46.4) identified as White. Of the 23 studies, 38% included a sample of participants recruited
from a medication assisted treatment clinic (100% of which were methadone clinics, but
substances targeted by CM in these studies varied; see Table 2 for more details).
the voucher method. Almost all treatments utilized escalating reinforcers (91%) that were
delivered immediately (84%) upon submission of a negative urine screen. Few treatments (25%)
utilized a reinforcement schedule that faded over time. The frequency of reinforcement that
could be earned per week ranged from 1 to 7. The average maximum magnitude of
reinforcement available per participant per treatment episode was $914.46 (median = $466.0).
Table 2 displays the primary drug that participants used, the type of comparison group, how the
urinalysis outcome was reported (point prevalence vs percent negative), when the long-term
follow-up occurred relative to when CM was discontinued, and the risk of bias assessment for
each study criterion. Most studies examined the effect of CM on stimulant use only (67%),
followed by polysubstance use (29%) and opioid use only (4%). Slightly more than half the
studies (54%) utilized nonspecific therapy comparison groups, 33% used community-based
long-term follow-up, about 58% of studies examined the point prevalence of abstinence and 42%
the percentage of negative urine samples submitted over the follow-up period. These outcomes
were assessed between 6 and 52 weeks (median = 24.0) since CM was discontinued.
Results of the Cochrane Risk of Bias tool indicated 54% of the studies were designated
high quality and about 46% were low quality. Approximately two-thirds (67%) of studies
provided an adequately detailed description of the method used to generate a random sequence
(e.g., used a computer program or a random table of numbers) and about 33% did not (e.g.,
simply stated participants were “randomized” with no further description of procedures). One
study (4%) described an adequate method to conceal the allocation of participants to conditions
from study investigators, while the remaining studies described no method to conceal allocation.
All (100%) studies were determined to have adequate masking of assessors, as they used the
objective assessment of urine toxicology to assess abstinence from drug use. Approximately 62%
of studies reported appropriate procedures to analyze outcome data (e.g., no missing outcome
data) and about 38% did not (e.g., completer analysis; imputed outcomes).
Long-Term Outcomes
Figure 2 displays a forest plot of the overall OR effect size of each study at long-term
follow-up and the weighted average OR effect size. At long-term follow-up, the weighted
average OR effect size was 1.22, 95% CI [1.03, 1.44], p = .02 (k = 23), with low to moderate
heterogeneity (I² = 36.68). The one-study-removed analysis resulted in ORs ranging from 1.18 to
1.26, indicating that the results were not highly influenced by any single study. This weighted
effect size indicates that participants who received CM were 1.22 times more likely to be
of the incentive delivery than participants who received a nonspecific therapy, a nonspecific
comprehensive therapy, or a specific therapy comparison condition. The funnel plot of the OR
effect sizes was symmetrical, and the Egger’s regression test did not indicate publication bias (p
> .05). The fail-safe N indicated that 37 unpublished studies with nonsignificant results would be
the possible moderating effect of several participant demographics, publication year, reinforcer
magnitude, the number of weeks elapsed since the discontinuation of CM, and the duration of the
CM protocol (Table 3). There were significant moderating effects of publication year and
treatment duration on CM outcomes up to one year after the discontinuation of reinforcers. The
meta-regressions indicated a one-year increase in publication year was associated with a 0.04
decrease in the log OR (p = .04; k = 23). For treatment duration, a one week increase in the
duration of CM was associated with a 0.03 increase in the log OR (p = .04, k = 23). The log OR
effect size also increased as reinforcer magnitude increased (p = .05; k = 23). However, for
reinforcer magnitude, a one study removed analysis indicated the relation appeared mostly
driven by one study (i.e., Silverman et al., 2004) and became nonsignificant when that study was
removed from the meta-regression (p = .59; k = 22). Subgroup analyses tested the moderating
effects of several categorical variables (Table 4). No other moderators were significant, including
the CM delivery method (i.e., prize versus vouchers) and CM parameters (i.e., escalating, fading,
and immediacy) (all p’s > .05); however, this might be due to homogeneity in the types of CM
Discussion
substance use treatments. This meta-analysis included 23 randomized trials of CM that had large
(>25/condition) samples of adult participants and reported urine toxicology results at long-term
follow-ups. This study focused specifically on CM, a psychosocial treatment model with strong
efficacy for reducing substance use during the active treatment period (Dutra et al., 2008) that
has faced skepticism about its durability (Petry et al., 2017). Despite this criticism, the overall
OR for CM at long-term follow up was significant, and participants who received CM evidenced
a 22% greater likelihood of abstinence at a median of 24 weeks after reinforcement ended than
participants receiving comparison treatments. These results provide support for lasting benefits of
CM after reinforcers have been discontinued using objective indices of drug use outcomes.
treatment, type of comparison condition, drug(s) used, outcome type, and study quality did not
significantly moderate CM efficacy at long-term follow-up. Study publication year was found to
relate to a significant decrease in the efficacy of CM, with newer studies showing smaller effect
sizes. Additionally, longer treatment duration was associated with better long-term outcomes.
Longer treatment duration may allow for greater opportunity to establish durations of continuous
abstinence, a metric which has been consistently associated with better long-term outcomes
(Preston et al., 1998; Stitzer et al., 2009). Type of reinforcement (chance to win prizes versus
vouchers for each negative drug screen) did not impact long-term outcomes, providing further
support for the efficacy of both CM approaches (Petry et al., 2005; Petry et al., 2015).
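Because the meta-regression slopes reported in the Results are on the log OR scale, their practical size is easier to judge after exponentiation. The short sketch below uses the two reported slopes (0.03 per week of CM and -0.04 per publication year); the 12-week and 10-year spans are hypothetical illustrations, and the calculation assumes the linear relation holds across the span and ignores rounding in the reported coefficients.

```python
import math


def or_multiplier(slope_per_unit, units):
    """Multiplicative change in the OR implied by a log OR slope over `units`."""
    return math.exp(slope_per_unit * units)


duration_slope = 0.03   # change in log OR per additional week of CM
year_slope = -0.04      # change in log OR per later publication year

# Hypothetical spans chosen for illustration.
twelve_more_weeks = or_multiplier(duration_slope, 12)   # exp(0.36)
ten_years_later = or_multiplier(year_slope, 10)         # exp(-0.40)
```

Under these assumptions, a protocol 12 weeks longer corresponds to an OR about 1.43 times as large, and a study published 10 years later corresponds to an OR about 0.67 times as large.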
Clinical Applications
CM’s costs are one of the foremost barriers to CM adoption (e.g., Benishek et al., 2010;
Kirby et al., 2006; Rash et al., 2012; Rash et al., 2014), and costs are often directly related to the
duration of CM treatment (i.e., longer CM protocols increase costs). Clinics will struggle to
implement CM with fidelity without external support to fund CM, which is especially important
in light of findings that CM is no more effective than standard care when reinforcement
magnitude drops below certain levels (Petry et al., 2004). To be effective, CM’s “dose” must be
in the effective range, and a significant proportion of addiction treatment providers in community
settings report maximum available reinforcement per patient well below effective “doses” (Rash
et al., 2013, 2020). The Veterans Affairs’ national implementation of CM in its IOP programs
fidelity, including to parameters such as magnitude and duration (DePhilippis et al., 2018; Petry
et al., 2014; Rash & DePhilippis, 2019). Outside the Veterans Affairs, new pathways to fund
CM by payers (i.e., reimbursement) have yet to be made available despite CM’s clear evidence
of efficacy relative to other commonly used psychosocial treatments (e.g., CBT, MI, relapse
prevention; Dutra et al., 2008; Petry et al., 2017). These funding issues represent critical barriers
to the accessibility of CM that not only affect patients, but also limit potential societal benefits of
successful substance use disorder treatment in the form of improved employment and
productivity indices, reduced criminal activity, reduced risk behavior and spread of disease, and
improved family functioning (Petry et al., 2017). Despite these barriers, CM has some distinct
advantages beyond its clinical superiority. It can be integrated with a wide variety of platform
therapies; it works with most client populations; and it can be readily adapted to clinic and client
needs. Further, both clinical and non-clinical staff can be trained to deliver CM, which may open
additional options for accessing treatment in non-traditional settings (e.g., housing programs,
Long-Term Benefits
many of the trials contained design features that could have reduced the likelihood of uncovering
therapy), and 33% of “standard care” conditions were intensive outpatient treatment. Thus, many
participants, not only those in active CM, were engaged in robust, high intensity treatment and
likely received many of the indirect and often unmeasurable benefits of psychotherapy treatment
(Wampold, 2015). Effect sizes are larger when “passive” or no or minimal comparison
(Prendergast et al., 2002), yet this meta-analysis found benefits of CM even with rigorous
interventions as comparators. Further, effect sizes of CM were not significantly different for
participants who were or were not on medication assisted treatment. This finding suggests that
CM can be equally effective for those recruited from a medication assisted treatment clinic, a
result that may become even more salient as public health efforts seek to increase capacity for
Despite the heterogeneity in effect sizes at long-term follow up, few clinical moderators
were significant. However, prior research indicates that several CM parameters are significantly
associated with enhanced efficacy of CM, including immediacy of reinforcement (Griffith et al.,
2000; Lussier et al., 2006), frequency of reinforcement (Griffith et al., 2000), and escalation of
reinforcement magnitude (Roll et al., 1996). We suspect that some of these null effects are driven
by the high quality designs used in the majority of included studies, resulting in little
Limitations
However, we set rigorous inclusion and exclusion criteria, only included randomized trials
reporting objective indices of drug use, and conducted risk of bias assessment to increase
confidence in the results. No evidence of publication bias was found across multiple metrics (i.e.,
examination of funnel plots, the Egger’s regression test, and the calculation of the Fail-safe N).
Additionally, some statistical tests of moderators may have resulted in nonsignificant results due
to lack of power. For example, several CM parameters (e.g., immediate and escalating
reinforcers; Lussier et al., 2006; Roll et al., 1996) known to enhance abstinence were not
significantly associated with long-term outcomes in the present meta-analysis. This may be due
to few included studies utilizing CM protocols with delayed and non-escalating reinforcers.
Collectively, these measures constitute an improvement on past studies, especially those utilizing
Focusing on long-term outcomes led to the removal of studies without objective indices
of drug use at follow-ups. To create variables of meaningful moderators with groupings large
enough to analyze, some of the nuances of specific study variables may have been lost and may
contribute to unmeasured heterogeneity in the analyses. For example, polysubstance was used as
an overall label for studies assessing abstinence from more than one drug, but the number and
types of drugs covered by this classification ranged widely by study. We also could not examine
the influence of other key variables, such as comorbid mental health diagnoses, because few
studies reported sufficient details. Finally, the median length of follow-up was only about 6
months after treatment ended and, given the chronic nature of substance use disorders, it will be
Urine toxicology provides an objective index of substance use, but it is not without
limitations. First, most urine toxicology tests only capture drug use in the few days preceding the
sample collection. Though CM protocols are designed with this limitation in mind, follow-ups
are often limited to a snapshot of abstinence. Second, sensitivity and specificity of toxicology
tests vary by drug of abuse (Peace et al., 2000). The studies primarily assessed abstinence from
stimulants or multiple substances concurrently. Persons with different drug use disorders may
respond differentially to CM and other treatments. For example, those with multiple drug use
disorders have more difficulties in achieving abstinence from all substances, resulting in lower
overall effect sizes of treatments for polysubstance users (Dutra et al., 2008).
Although no method for measuring drug use outcomes is standard across studies (Carroll
et al., 2014; Donovan et al., 2012; Tiffany et al., 2012), biological indices such as toxicology
testing should be prioritized. In assessing outcomes for chronic medical conditions such as
diabetes and heart disease, for example, most if not all studies include measurement of A1c
levels and blood pressure; reliance on self-reports when objective indices are available is
considered unacceptable in clinical trials targeting other medical conditions. The relatively small
number of trials that included toxicology testing in this review underscores that evaluation of
substance use disorders lags behind other chronic conditions. The lack of standardization of
outcome measures is not limited to the evaluation of CM and also may impact findings in other
reviews evaluating long-term effects of cognitive behavioral, relapse prevention, and other
addiction treatments (Burke et al., 2003; Magill & Ray, 2009; Ray et al., 2020; Sayegh et al.,
2017). To improve quality of research, and ultimately treatment, researchers and clinicians
should prioritize objective indices of drug use in evaluating both short- and long-term efficacy of
treatment approaches.
Conclusion
Past meta-analytic reviews of CM established its efficacy for improving substance use
outcomes during treatment and immediately post-treatment (Ainscough et al., 2017; Dutra et al.,
2008; Griffith et al., 2000). In addition, several meta-analyses focused on the long-term impact
of CM (Benishek et al., 2014; Prendergast et al., 2006; Sayegh et al., 2017), but were limited
because they did not report objective biological outcomes (Prendergast et al., 2006), or they
merged biologically verified outcomes with self-report and/or randomly biologically verified
self-report outcomes (Benishek et al., 2014; Sayegh et al., 2017). Results of the current
meta-analysis provide new information to the field. Specifically, focusing on urine toxicology results,
this meta-analysis found a significant long-term effect for CM, directly addressing the common
concern that the effects of CM disappear once reinforcers are no longer provided. Of note, other
evidence-based psychosocial treatments for substance use disorders (e.g., cognitive behavioral
therapy; Magill et al., 2019) do not face this same level of criticism, yet none have demonstrated
significant long-term effects when only objective indicators of substance use are examined
(Burke et al., 2003; Magill & Ray, 2009; Magill et al., 2019). Further, the effect of CM was robust
across a wide range of demographic and clinical moderators. Overall, CM increased odds of
abstinence across multiple investigative teams, participant demographics, and drugs of abuse.
Benefits of CM were present across rigorously designed trials, including those with comparison
groups using established, active treatment elements (Magill et al., 2019; Magill & Ray, 2009).
Patients with substance use disorders deserve access to treatments with the greatest evidence of
efficacy, and private and public insurers and society should support such treatments (Petry et al.,
2017; Rash et al., 2017; Roll et al., 2009). These results provide novel evidence that CM has
long-term efficacy in reducing drug use. However, no insurer or public payer, other than the
Veterans Administration (DePhilippis et al., 2018), presently covers costs of CM. It is time that
other healthcare systems and policymakers support this efficacious psychosocial intervention.
References
Ainscough, T. S., McNeill, A., Strang, J., Calder, R., & Brose, L. S. (2017). Contingency
management interventions for non-prescribed drug use during treatment for opiate
addiction: a systematic review and meta-analysis. Drug and Alcohol Dependence, 178,
318-339.
*Alessi, S. M., Hanson, T., Wieners, M., & Petry, N. M. (2007). Low-cost contingency
Arria, A. M., & McLellan, A. T. (2012). Evolution of concept, but not action, in addiction
Benishek, L. A., Dugosh, K. L., Kirby, K. C., Matejkowski, J., Clements, N. T., Seymour, B. L.,
& Festinger, D. S. (2014). Prize-based contingency management for the treatment of
Benishek, L. A., Kirby, K. C., Dugosh, K. L., & Padovano, A. (2010). Beliefs about the
Bland, J. M., & Altman, D. G. (2000). The odds ratio. BMJ: British Medical Journal, 320(7247),
1468.
Borenstein, M., Hedges, L., Higgins, J., & Rothstein, H. (2013). Comprehensive Meta-Analysis
Borenstein, M., Hedges, L. V., Higgins, J., & Rothstein, H. R. (2009). Chapter 25: Multiple
comparisons within a study (pp. 239 – 242). John Wiley & Sons, Ltd.
Bradley, H. A., Rucklidge, J. J., & Mulder, R. T. (2017). A systematic review of trial registration
*Brooner, R. K., Kidorf, M. S., King, V. L., Stoller, K. B., Neufeld, K. J., & Kolodner, K.
(2007). Comparing adaptive stepped care and monetary-based voucher interventions for
Burke, B. L., Arkowitz, H., & Menchola, M. (2003). The efficacy of motivational interviewing: a
Carroll, K. M., Kiluk, B. D., Nich, C., DeVito, E. E., Decker, S., LaPaglia, D., Duffey, D.,
Chambless, D. L., & Hollon, S. D. (1998). Defining empirically supported therapies. Journal of
*Chudzynski, J., Roll, J. M., McPherson, S., Cameron, J. M., & Howell, D. N. (2015).
Davis, D. R., Kurti, A. N., Skelly, J. M., Redner, R., White, T. J., & Higgins, S. T. (2016). A
Del Boca, F. K., & Noll, J. A. (2000). Truth or consequences: the validity of self-report data in
DePhilippis, D., Petry, N. M., Bonn-Miller, M. O., Rosenbach, S. B., & McKay, J. R. (2018).
Veterans Affairs: Attendance at CM sessions and substance use outcomes. Drug and
Donovan, D. M., Bigelow, G. E., Brigham, G. S., Carroll, K. M., Cohen, A. J., Gardin, J. G.,
Hamilton, J. A., Huestis, M. A., Lindblad, R., & Marlatt, G. A. (2012). Primary outcome
and measurement of drug use end-points in clinical trials. Addiction, 107(4), 694-708.
Dutra, L., Stathopoulou, G., Basden, S. L., Leyro, T. M., Powers, M. B., & Otto, M. W. (2008).
Egger, M., Davey Smith, G., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected
*Epstein, D. H., Hawkins, W. E., Covi, L., Umbricht, A., & Preston, K. L. (2003). Cognitive-
behavioral therapy plus contingency management for cocaine use: findings during
treatment and across 12-month follow-up. Psychology of Addictive Behaviors, 17(1), 73.
Griffith, J. D., Rowan-Szal, G. A., Roark, R. R., & Simpson, D. D. (2000). Contingency
*Hagedorn, H. J., Noorbaloochi, S., Simon, A. B., Bangerter, A., Stitzer, M. L., Stetler, C. B., &
Herbeck, D. M., Hser, Y. I., & Teruya, C. (2008). Empirically supported substance abuse
Higgins, J. P., & Green, S. (Eds.). (2011). Cochrane Handbook for Systematic Reviews of
Higgins, S. T., Kurti, A. N., & Davis, D. R. (2019). Voucher-based contingency management is
42(3), 501-524.
Higgins, J. P., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency
Hjorthøj, C. R., Hjorthøj, A. R., & Nordentoft, M. (2012). Validity of timeline follow-back for
self-reported use of cannabis and other illicit substances—systematic review and meta-
*Iguchi, M. Y., Belding, M. A., Morral, A. R., Lamb, R. J., & Husband, S. D. (1997).
alternative for reducing drug use. Journal of Consulting and Clinical Psychology, 65(3),
421.
Irvin, J. E., Bowers, C. A., Dunn, M. E., & Wang, M. C. (1999). Efficacy of relapse prevention:
Jones, C. M., Campopiano, M., Baldwin, G., & McCance-Katz, E. (2015). National and state
treatment need and capacity for opioid agonist medication-assisted treatment. American
*Jones, H. E., Wong, C. J., Tuten, M., & Stitzer, M. L. (2005). Reinforcement-based therapy: 12-
month evaluation of an outpatient drug-free treatment for heroin abusers. Drug and
Kirby, K. C., Benishek, L. A., Dugosh, K. L., & Kerwin, M. E. (2006). Substance abuse treatment
Korte, J. E., Magruder, K. M., Chiuzan, C. C., Logan, S. L., Killeen, T., Bandyopadhyay, D., &
Lee, E. B., An, W., Levin, M. E., & Twohig, M. P. (2015). An initial meta-analysis of
Acceptance and Commitment Therapy for treating substance use disorders. Drug and
Lipsey, M. W., & Wilson, D. B. (2001). Practical Meta-analysis. SAGE publications, Inc.
Lundahl, B. W., Kunz, C., Brownell, C., Tollefson, D., & Burke, B. L. (2010). A meta-analysis
Lussier, J. P., Heil, S. H., Mongeon, J. A., Badger, G. J., & Higgins, S. T. (2006). A meta‐
Magill, M., & Ray, L. A. (2009). Cognitive-behavioral treatment with adult alcohol and illicit
Magill, M., Ray, L., Kiluk, B., Hoadley, A., Bernstein, M., Tonigan, J. S., & Carroll, K. (2019).
Magura, S., & Kang, S. Y. (1996). Validity of self-reported drug use in high risk populations: a
*McDonell, M. G., Srebnik, D., Angelo, F., McPherson, S., Lowe, J. M., Sugar, A., Short, R. A.,
management for stimulant use in community mental health patients with serious mental
McGovern, M. P., Fox, T. S., Xie, H., & Drake, R. E. (2004). A survey of clinical practices and
McLellan, A. T., Lewis, D. C., O'Brien, C. P., & Kleber, H. D. (2000). Drug dependence, a
Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting items for
Neyeloff, J. L., Fuchs, S. C., & Moreira, L. B. (2012). Meta-analyses and Forest plots using a
Notley, C., Gentry, S., Livingstone-Banks, J., Bauld, L., Perera, R., & Hartmann-Boyce, J.
(2019). Incentives for smoking cessation. Cochrane Database of Systematic Reviews, (7).
Peace, M. R., Tarnai, L. D., & Poklis, A. (2000). Performance evaluation of four on-site drug-
*Peirce, J. M., Petry, N. M., Stitzer, M. L., Blaine, J., Kellogg, S., Satterfield, F., Schwartz, M.,
Krasnansky, J., Pencer, E., Silva-Vazquez, L. & Kirby, K. C. (2006). Effects of lower-
*Petry, N. M., Alessi, S. M., Barry, D., & Carroll, K. M. (2015). Standard magnitude prize
*Petry, N. M., Alessi, S. M., Carroll, K. M., Hanson, T., MacKinnon, S., Rounsaville, B., &
*Petry, N. M., Alessi, S. M., & Ledgerwood, D. M. (2012b). A randomized trial of contingency
Petry, N. M., Alessi, S. M., Olmstead, T. A., Rash, C. J., & Zajac, K. (2017). Contingency
management treatment for substance use disorders: How far has it come, and where does
*Petry, N. M., Alessi, S. M., Marx, J., Austin, M., & Tardif, M. (2005). Vouchers versus prizes:
*Petry, N. M., Barry, D., Alessi, S. M., Rounsaville, B. J., & Carroll, K. M. (2012a). A
Petry, N. M., DePhilippis, D., Rash, C. J., Drapkin, M., & McKay, J. R. (2014). Nationwide
Petry, N. M., Tedford, J., Austin, M., Nich, C., Carroll, K. M., & Rounsaville, B. J. (2004). Prize
reinforcement contingency management for treating cocaine users: How low can we go,
*Petry, N. M., Weinstock, J., & Alessi, S. M. (2011). A randomized trial of contingency
*Petry, N. M., Weinstock, J., Alessi, S. M., Lewis, M. W., & Dieckhaus, K. (2010). Group-based
randomized trial of contingencies for health and abstinence in HIV patients. Journal of
Prendergast, M., Podus, D., Finney, J., Greenwell, L., & Roll, J. (2006). Contingency
Preston, K. L., Silverman, K., Higgins, S. T., Brooner, R. K., Montoya, I., Schuster, C. R., &
*Preston, K. L., Umbricht, A., & Epstein, D. H. (2002). Abstinence reinforcement maintenance
contingency and one-year follow-up. Drug and Alcohol Dependence, 67(2), 125-137.
Rash, C. J., Alessi, S. M., & Zajac, K. (2020). Examining implementation of contingency
Rash, C. J., DePhilippis, D., McKay, J. R., Drapkin, M., & Petry, N. M. (2013). Training
Rash, C. J., Petry, N. M., Kirby, K. C., Martino, S., Roll, J., & Stitzer, M. L. (2012). Identifying
Rash, C. J., Stitzer, M., & Weinstock, J. (2017). Contingency management: New directions and
*Rawson, R. A., Huber, A., McCann, M., Shoptaw, S., Farabee, D., Reiber, C., & Ling, W.
*Rawson, R. A., McCann, M. J., Flammino, F., Shoptaw, S., Miotto, K., Reiber, C., & Ling, W.
Ray, L. A., Meredith, L. R., Kiluk, B. D., Walthers, J., Carroll, K. M., & Magill, M. (2020).
Combined Pharmacotherapy and Cognitive Behavioral Therapy for Adults With Alcohol
Roll, J. M., Higgins, S. T., & Badger, G. J. (1996). An experimental comparison of three
Roll, J. M., Madden, G. J., Rawson, R., & Petry, N. M. (2009). Facilitating the adoption of
contingency management for the treatment of substance use disorders. Behavior Analysis
*Roll, J. M., Petry, N. M., Stitzer, M. L., Brecht, M. L., Peirce, J. M., McCann, M. J., Blaine, J.,
MacDonald, M., DiMaria, J., Lucero, L., & Kellogg, S. (2006). Contingency
management for the treatment of methamphetamine use disorders. The American Journal
Roozen, H. G., Boulogne, J. J., van Tulder, M. W., van den Brink, W., De Jong, C. A., &
reinforcement approach in alcohol, cocaine and opioid addiction. Drug and Alcohol
Rosenthal, R. (1979). The ‘file drawer problem’ and tolerance for null results. Psychological
Sayegh, C. S., Huey Jr, S. J., Zara, E. J., & Jhaveri, K. (2017). Follow-up treatment effects of
Schuler, M. S., Lechner, W. V., Carter, R. E., & Malcolm, R. (2009). Temporal and gender
trends in concordance of urine drug screens and self-reported use in cocaine treatment
*Shoptaw, S., Reback, C. J., Peck, J. A., Yang, X., Rotheram-Fuller, E., Larkins, S., ... & Hucks-
HIV-related sexual risk behaviors among urban gay and bisexual men. Drug and Alcohol
*Silverman, K., Robles, E., Mudric, T., Bigelow, G. E., & Stitzer, M. L. (2004). A randomized
who inject drugs. Journal of Consulting and Clinical Psychology, 72(5), 839.
Stitzer, M. L., Peirce, J., Petry, N. M., Kirby, K., Roll, J., Krasnansky, J., Cohen, A., Blaine, J.,
Vandrey, R., Kolodner, K., & Li, R. (2009). Abstinence-based incentives in methadone
maintenance: Interaction with intake stimulant test results. Experimental and Clinical
Tiffany, S. T., Friedman, L., Greenfield, S. F., Hasin, D. S., & Jackson, R. (2012). Beyond drug
Wampold, B. E. (2015). How important are the common factors in psychotherapy? An update.
Willenbring, M. L., Kivlahan, D., Kenny, M., Grillo, M., Hagedorn, H., & Postier, A. (2004).
Table 1.
Description of Studies and Study Conditions Included in the Meta-Analysis on Contingency Management
Study            Total N  Treatment conditions (n)  Comparison condition (n)  CM duration  Prize or voucher  Escalation with reset  Fading  Frequency (a)  Immediate reinforcers  Reinforcer magnitude (b)
Participants recruited from a setting other than a medication assisted treatment clinic
Participants recruited from a medication assisted treatment clinic (all were methadone maintenance clinics)
Brooner, 2007    118      CM (59)                   SC (59)                   24 wks       V                 Y                      N       1              Y                      $3201
Epstein, 2003    96       CM (47)                   SC + NC vouchers (49)     12 wks       V                 Y                      N       3              Y                      $1155
Iguchi, 1997     62       CM (27)                   SC (35)                   12 wks       V                 N                      N       3              N                      $180
Peirce, 2006     388      CM (190)                  SC (198)                  12 wks       P                 Y                      N       2              Y                      $400
Petry, 2012b     130      CM (71)                   SC (59)                   12 wks       P                 Y                      N       3              Y                      $381
Petry, 2015      240      CM (58)                   SC (57)                   12 wks       P                 Y                      N       3              Y                      $300
                          CM (63)                                                          V                 Y                      N                      Y                      $900
                          CM (62)                                                          P                 Y                      N                      Y                      $900
Preston, 2002    110      CM take home doses (55)   NC take home doses (55)   12 wks       V                 N                      N       3              N                      $360
Rawson, 2002     54       CM (27)                   SC (27)                   16 wks       V                 Y                      N       3              Y                      $1278
Silverman, 2004  52       CM take home doses (26)   SC (26)                   52 wks       V                 Y                      N       3              Y                      $5800
Notes. (a) Frequency refers to the maximum number of times that reinforcement could be earned per week; (b) Reinforcement magnitude
refers to the maximum monetary value that could be earned per participant during treatment; CBT = cognitive behavioral therapy; CM
= contingency management; IOP = intensive outpatient treatment; NC = noncontingent incentives; nr = not reported; SC = standard
care or treatment as usual; TSF = twelve-step facilitation.
Table 2.
Design Features of Studies Included in the Meta-Analysis and Assessment of their Study Quality
Study   Drug   Comparison Condition   Type   Long-Term Outcome (weeks since CM discontinuation)   Random Sequence Generation   Allocation Concealment   Blinding of Outcome Assessors   Complete Data   Overall Study Quality
Participants recruited from a setting other than a medication assisted treatment clinic
Table 3
Results from Meta-Regressions of Several Possible Moderators with Contingency Management
Moderator (k)                                 Point estimate  95% CI           Z-value   p-value
Participant age (24)
  Slope                                       0.00            -0.03, 0.03      0.06      0.95
  Intercept                                   0.16            -1.02, 1.34      0.27      0.78
Participant gender: Percentage female (24)
  Slope                                       0.01            -0.00, 0.02      1.23      0.22
  Intercept                                   -0.14           -0.70, 0.42      -0.49     0.63
Participant race: Percentage White (21)
  Slope                                       -0.01           -0.02, 0.00      -1.11     0.26
  Intercept                                   0.48            -0.07, 1.03      1.70      0.09
Publication year (24)
  Slope                                       -0.04           -0.07, -0.00     -2.03     0.04
  Intercept                                   73.52           2.88, 144.17     2.04      0.04
Reinforcer frequency (24)
  Slope                                       -0.02           -0.19, 0.14      -0.29     0.77
  Intercept                                   0.27            -0.20, 0.74      1.11      0.27
Reinforcer magnitude (24)
  Slope                                       0.00            0.00, 0.00       1.98      0.05
  Intercept                                   0.06            -0.19, 0.30      0.44      0.66
Time of follow up in weeks (24)
  Slope                                       -0.01           -0.02, 0.01      -1.22     0.22
  Intercept                                   0.42            0.02, 0.82       2.08      0.04
Treatment duration (24)
  Slope                                       0.03            0.00, 0.06       2.05      0.04
  Intercept                                   -0.25           -0.68, 0.18      -1.14     0.25
Note. Point estimates reflect the amount of increase or decrease in the log odds ratio effect sizes.
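As a worked illustration of the note above, the following minimal sketch converts a meta-regression intercept and slope, which are on the log odds ratio scale, into a predicted odds ratio. It uses the rounded treatment-duration coefficients from Table 3 and assumes the moderator is measured in weeks, as CM duration is in Table 1. The function name and the 12- and 24-week inputs are illustrative choices, not part of the original analysis, and the rounded coefficients yield only approximate values.

```python
import math


def predicted_odds_ratio(intercept: float, slope: float, moderator_value: float) -> float:
    """Return exp(intercept + slope * moderator_value).

    The intercept and slope are on the log odds ratio scale, so the
    linear predictor is exponentiated to get back to an odds ratio.
    """
    log_odds_ratio = intercept + slope * moderator_value
    return math.exp(log_odds_ratio)


# Rounded treatment-duration coefficients as reported in Table 3.
INTERCEPT = -0.25
SLOPE = 0.03  # assumed to be the change in log odds ratio per week of treatment

# Illustrative durations only (assumption): 12 and 24 weeks of active treatment.
for weeks in (12, 24):
    estimate = predicted_odds_ratio(INTERCEPT, SLOPE, weeks)
    print(f"{weeks} weeks: predicted OR = {estimate:.2f}")
```

Because the slope is positive, the predicted odds ratio rises with treatment duration, which is the direction of the moderator effect reported in the table.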
Table 4
Results from Subgroup Analyses of Several Possible Moderating Variables with Contingency
Figure 1.
Flowchart of records identified and reviewed
[Flowchart graphic not reproduced. Surviving box labels: Records screened; Records excluded. Surviving counts: k = 3331, k = 4319, k = 988.]
Statistics for each study
Study name              Subgroup within study   Odds ratio  Lower limit  Upper limit  Z-Value   p-Value
Alessi 2007             CM                      2.350       1.028        5.371        2.026     0.043
Brooner 2007            CM                      1.780       0.841        3.766        1.508     0.132
Chudzynski 2015         Combined                0.912       0.675        1.230        -0.605    0.545
Epstein 2003            CM                      2.270       0.941        5.476        1.825     0.068
Hagedorn 2013           CM                      1.020       0.480        2.169        0.051     0.959
Iguchi 1997             CM                      1.140       0.332        3.919        0.208     0.835
Jones 2005              CM living expenses      1.110       0.520        2.370        0.270     0.787
McDonell 2013           CM                      1.570       0.858        2.873        1.463     0.143
Peirce 2006             CM                      1.150       0.739        1.789        0.620     0.535
Petry 2005              Combined                1.633       0.893        2.985        1.592     0.111
Petry 2006              CM                      0.470       0.159        1.390        -1.364    0.172
Petry 2010              CM                      1.080       0.590        1.978        0.249     0.803
Petry 2011              CM                      1.120       0.662        1.895        0.422     0.673
Petry 2012a             Combined                1.099       0.555        2.176        0.271     0.786
Petry 2012a 2nd sample  CM                      0.820       0.484        1.390        -0.737    0.461
Petry 2012b             CM                      0.750       0.374        1.505        -0.810    0.418
Petry 2015              Combined                0.894       0.588        1.358        -0.527    0.598
Preston 2002            CM take home doses      1.340       0.632        2.840        0.764     0.445
Rawson 2002             CM                      4.420       1.402        13.932       2.537     0.011
Rawson 2006             CM                      0.860       0.360        2.052        -0.340    0.734
Roll 2006               CM                      1.520       0.888        2.603        1.526     0.127
Roll 2013               Combined                1.712       1.108        2.644        2.424     0.015
Shoptaw 2005            CM                      0.600       0.208        1.727        -0.947    0.344
Silverman 2004          CM take home doses      6.420       1.722        23.931       2.770     0.006
Overall                                         1.219       1.032        1.441        2.334     0.020
[Forest plot graphic (odds ratio and 95% CI) not reproduced; odds ratio axis ticks: 0.1, 0.2, 0.5, 1, 2, 5, 10]
Figure 2.
Forest plot of effect sizes for contingency management treatment versus comparison conditions at long-term follow up
Notes. Values are truncated after the third decimal point. The last row in the forest plot represents the overall odds ratio effect size of
all studies. “Combined” indicates a study with multiple CM conditions that were summarized into one odds ratio effect size. CM =
contingency management
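The statistics in the forest plot can be related to one another with a short calculation. The sketch below recovers each study's log odds ratio and standard error from its reported 95% confidence limits, recomputes the Z-value, and then pools the studies with a random-effects model. The DerSimonian-Laird estimator is an assumption made for illustration (the manuscript states only that random effects models were used), and all function and variable names are hypothetical. Because the inputs are the rounded values from the forest plot, the results can be compared against the reported rows but are only approximations of them.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

# (study, odds ratio, lower limit, upper limit), copied from the forest plot rows.
ROWS = [
    ("Alessi 2007", 2.350, 1.028, 5.371),
    ("Brooner 2007", 1.780, 0.841, 3.766),
    ("Chudzynski 2015", 0.912, 0.675, 1.230),
    ("Epstein 2003", 2.270, 0.941, 5.476),
    ("Hagedorn 2013", 1.020, 0.480, 2.169),
    ("Iguchi 1997", 1.140, 0.332, 3.919),
    ("Jones 2005", 1.110, 0.520, 2.370),
    ("McDonell 2013", 1.570, 0.858, 2.873),
    ("Peirce 2006", 1.150, 0.739, 1.789),
    ("Petry 2005", 1.633, 0.893, 2.985),
    ("Petry 2006", 0.470, 0.159, 1.390),
    ("Petry 2010", 1.080, 0.590, 1.978),
    ("Petry 2011", 1.120, 0.662, 1.895),
    ("Petry 2012a", 1.099, 0.555, 2.176),
    ("Petry 2012a 2nd sample", 0.820, 0.484, 1.390),
    ("Petry 2012b", 0.750, 0.374, 1.505),
    ("Petry 2015", 0.894, 0.588, 1.358),
    ("Preston 2002", 1.340, 0.632, 2.840),
    ("Rawson 2002", 4.420, 1.402, 13.932),
    ("Rawson 2006", 0.860, 0.360, 2.052),
    ("Roll 2006", 1.520, 0.888, 2.603),
    ("Roll 2013", 1.712, 1.108, 2.644),
    ("Shoptaw 2005", 0.600, 0.208, 1.727),
    ("Silverman 2004", 6.420, 1.722, 23.931),
]


def log_or_and_se(odds_ratio, lower, upper):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(odds_ratio), se


def z_value(odds_ratio, lower, upper):
    """Z statistic for the null hypothesis that the odds ratio equals 1."""
    log_or, se = log_or_and_se(odds_ratio, lower, upper)
    return log_or / se


def pool_random_effects(rows):
    """DerSimonian-Laird pooling (assumed estimator).

    Returns (pooled OR, lower limit, upper limit, I-squared in percent).
    """
    stats = [log_or_and_se(o, lo, hi) for _, o, lo, hi in rows]
    y = [s[0] for s in stats]            # log odds ratios
    v = [s[1] ** 2 for s in stats]       # within-study variances
    w = [1 / vi for vi in v]             # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)        # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return (math.exp(pooled), math.exp(pooled - Z_95 * se),
            math.exp(pooled + Z_95 * se), i2)


pooled_or, pooled_lo, pooled_hi, i_squared = pool_random_effects(ROWS)
print(f"Alessi 2007 Z = {z_value(2.350, 1.028, 5.371):.3f}")
print(f"Pooled OR = {pooled_or:.3f} [{pooled_lo:.3f}, {pooled_hi:.3f}], I2 = {i_squared:.1f}%")
```

Working on the log scale is what makes the confidence limits symmetric around the estimate, which is why the standard error can be read off the width of the interval.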