The Efficacy of Rational Emotive Therapy: A Quantitative Review of the Outcome Research
Larry C. Lyons and Paul J. Woods
Hollins College, Virginia
From: Lyons, Larry C., & Woods, Paul J. (1991). The efficacy of rational emotive therapy: A quantitative review of the outcome research. Clinical Psychology Review, 11, 357-369.
We thank Wendy A. Morris for her assistance in preparing this manuscript. Without her inestimable help, this project would not have come to fruition.
Larry C. Lyons, MA, was a Research Associate at the Hollins Communications Research Institute when this article was published in Clinical Psychology Review. He is currently a consultant with a private firm in Manassas, Virginia.
Paul J. Woods, Ph.D., is Professor of Psychology (Emeritus) in the Department of Psychology at Hollins College, and Director of the Institute for Rational Therapy and Behavioral Medicine in Roanoke, Virginia.
Address requests for reprints of this article, the coding manual, or the list of studies used in this quantitative review to the second author at:
Paul J. Woods, Ph.D., Institute for Rational Therapy and Behavioral Medicine, 5541 Florist Road, Roanoke, Virginia.
Abstract
The results from a meta-analysis of 70 Rational Emotive Therapy (RET) outcome studies were reported. A total of 236 comparisons of RET to baseline, control groups, Cognitive Behavior Modification, Behavior Therapy, or other psychotherapies were examined.
The results indicated that subjects receiving RET demonstrated significant improvement over baseline measures and control groups. No significant differences in effect size were found between studies using psychotherapy clients and those using students as subjects. Effect size was significantly related to therapist experience and to the duration of therapy. Comparisons rated high in internal validity had significantly higher effect sizes than medium-validity studies. Outcome measures rated low in reactivity had significantly higher effect sizes than more reactive measures.
Contrary to other reviews using the narrative review method, RET was found to be an effective form of therapy. However, this conclusion was tempered by methodological flaws in the studies reviewed, such as a lack of follow-up data and of information regarding attrition rates.
Rational Emotive Therapy (RET) has become one of the most accepted forms of Cognitive Behavior Modification since its initial development in the late 1950s and early 1960s (Ellis, 1957, 1962; Gregg, 1973). Narrative reviews have either supported RET's efficacy or criticized its usefulness. Using a quantitative review method (meta-analysis), this study examines the efficacy of RET and addresses many of the criticisms advanced by previous reviews of RET.
Ledwidge (1978) reviewed 13 outcome studies comparing Cognitive Behavior Modification (CBM) to Behavior Therapy. Of the 13 studies, six were based on either RET or Systematic Rational Restructuring. He argued that although these studies suggested no differences between behavior therapy and Cognitive Behavior Therapy (and, by extension, RET), this apparent equivalence held only because none of the studies used a clinical sample. On the basis of these findings, Ledwidge concluded that behavior therapy was the superior form of treatment.
DiGiuseppe and Miller (1977) reviewed 22 studies which examined the effectiveness of RET or a closely related therapy. These studies were categorized in two sets: comparative studies, which compared RET to some other psychotherapy; and non-comparative research, which compared RET to baseline scores or against some form of control. DiGiuseppe and Miller concluded that many of the studies had a variety of methodological problems typical of most psychotherapy research. They found RET to be considerably more effective than no-treatment controls and baseline scores.
Zettle and Hayes (1980) reviewed 35 studies which investigated the theoretical assumptions, individual treatment components, and overall effectiveness of RET. They claimed that all available outcome research consisted of either unsystematic case studies or analogue research. Less than half (16) of the 35 studies were outcome experiments. Based on these 16 studies, the authors concluded that the clinical efficacy of RET had not been demonstrated. Prochaska (1984) surveyed eight studies which examined the effectiveness of RET. From this limited sample, he concluded that these studies demonstrated only equivocal results and that the most positive application of RET was for reducing common anxiety. Based on their review of 47 RET outcome studies, McGovern and Silverman (1984) concluded that these studies supported the efficacy of RET.
Reviewers of the RET outcome literature have raised many criticisms of RET outcome research. One such criticism involves the issue of "elegant" vs. "inelegant" RET (Ellis, 1980). Ellis describes inelegant RET as other forms of cognitive behavior modification, while "elegant" RET refers to the specific approach advocated by Ellis and others. Similarly, Wessler (1983) claimed that researchers frequently misunderstand the therapeutic approaches of RET, or else mislabel or misrepresent the procedure until it is no longer RET. Thus, the question to be investigated is: does the degree of similarity of the treatment to RET influence effect size?
Several reviewers have criticized the methodological quality of RET outcome studies (e.g., DiGiuseppe and Miller, 1977; Ledwidge, 1978; Prochaska, 1984; Zettle and Hayes, 1980). Their concerns include the low percentage of male subjects; subject solicitation; the use of students vs. psychotherapy clients; the number of subjects per comparison; subject and therapist assignment to treatment or control groups; therapist training; treatment duration; individual vs. group therapy; and the reactivity and type of outcome measure.
Another area of investigation is the issue of using students versus clinical subjects. Ledwidge (1978) and Zettle and Hayes (1980) note in their reviews that most RET studies used student volunteers as subjects. They question the external validity of these studies, since the problems assessed may not be applicable to the typical client. While RET may indeed prove effective with students suffering from test anxiety, it is possible that RET may be relatively ineffective when dealing with more serious clinical problems, such as depression or agoraphobia. In other words, are there differences in effect sizes between studies which use students and those which use psychotherapy clients?
Reviewers disagree on the therapeutic effectiveness of RET. This point constitutes the core of the present review: what is the therapeutic efficacy of RET? This question is difficult to answer in the context of an individual study or a narrative review, because reviewers can reach very different conclusions given the same evidence. The quantitative review format directly addresses the question of the therapeutic efficacy of RET without reviewer bias.
Method
Selection of Studies
The Psychological Abstracts and Dissertation Abstracts International from 1972 to 1988 were searched for relevant studies. In addition, the reference lists of the obtained studies and of the previously mentioned reviews were scanned for additional material. A list of the studies used in this meta-analysis can be seen in the appendix following this article.
Each study had to meet the following inclusion criteria:
(1) At least one treatment group received Rational Emotive Therapy, or a treatment which used elements of RET.
(2) The study compared RET to a baseline measure, a control group, or other type of therapy.
(3) The study used a quantitative statistic which could be converted to an effect size estimate.
(4) The study gave the number of subjects in each treatment or control group.
Seventy studies met these criteria, yielding 236 comparisons between RET and a baseline measure, a control group, or other form of therapy.
Studies were rejected because of uninterpretable statistics, or because insufficient information regarding treatments or experimental procedures was presented. Single-subject designs and case studies were also excluded.
Effect Size Estimation
Each comparison of RET to the baseline assessment, control, or treatment groups was expressed in terms of the Standardized Difference between Mean Scores (d; Cohen, 1977). Some researchers (e.g., Glass, 1976, 1977; Smith & Glass, 1977; Smith et al., 1980) advocate using the comparison group standard deviation as the denominator for d. However, there are two reasons against using the comparison group standard deviation: first, the within-subjects standard deviation has approximately half the sampling error of the comparison group's; second, the within-subjects standard deviation generally provides a more accurate estimate of the population's (Hunter et al., 1982). After d is calculated for each study, the effect sizes are averaged.
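As a minimal sketch of the basic computation (with hypothetical numbers, not data from any study in the sample), d can be calculated from group means and a pooled standard deviation:

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized difference between mean scores (d), using a pooled
    standard deviation as the denominator rather than the
    comparison-group SD alone."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

# Hypothetical example: a treatment group scoring 10 points above control
d = cohens_d(25.0, 15.0, 9.5, 10.5, 20, 20)
print(round(d, 3))
```

The pooled denominator weights each group's variance by its degrees of freedom, which is what gives it the smaller sampling error noted above.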
Where possible, effect sizes were obtained by directly calculating d from the means and standard deviations reported in the individual study. Otherwise d was calculated from t, F, r, or a probability value using procedures taken from Cohen (1977) or Hunter et al. (1982). If the comparison was taken from a two-way ANOVA, the statistic was first converted to the eta statistic using an algorithm taken from Hasse (1983). This correlation coefficient was then converted to d using procedures outlined by Cohen (1977) and Hunter et al. (1982).
When the exact probability level was given, this value was converted into a z score and then to a point-biserial correlation. A d was then estimated using the previously mentioned procedures outlined by Cohen (1977) and Hunter et al. (1982). Unfortunately, in some studies an exact statistic was not available, and an approximation procedure was used to estimate d. Where nonparametric statistics or multiple comparison procedures (e.g., Duncan's Multiple Range test or the Newman-Keuls test) were used, the associated probability value (.05, .01, or .001) was converted first to a z score and then to a d statistic using the previously discussed procedures.
If the study only stated that a difference was significant, the associated probability value was set to .05, and the appropriate d statistic was calculated using the above conversion algorithms. Similarly, where "no significant difference" was reported, the effect size was assumed to be 0.
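The conversion chain described above can be sketched as follows. This is an illustration, not the authors' actual code; the z-to-r step uses the common approximation r = z/√N, and the r-to-d formula assumes equal group sizes:

```python
import math
from statistics import NormalDist

def d_from_t(t, n1, n2):
    # two-group t statistic -> standardized mean difference d
    return t * math.sqrt(1 / n1 + 1 / n2)

def d_from_r(r):
    # (point-biserial) correlation -> d, assuming equal group sizes
    return 2 * r / math.sqrt(1 - r ** 2)

def d_from_p(p, n1, n2):
    # reported probability value -> z score -> r -> d
    z = NormalDist().inv_cdf(1 - p)   # one-tailed z for the reported p
    r = z / math.sqrt(n1 + n2)        # approximation: r = z / sqrt(N)
    return d_from_r(r)

# A bare "significant difference" with no exact statistic is coded as p = .05
print(round(d_from_p(0.05, 20, 20), 3))
```

Because the p-based route discards information about how far past the threshold the statistic fell, d's derived this way tend to be smaller than those computed from the test statistic itself, consistent with the comparison reported below.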
In comparisons of d's derived from test statistics (e.g., t, F, or means and standard deviations) with those derived from probability values taken from the same statistics, the effect size estimates from the test statistics were less conservative than those derived from p.
The majority of the studies included in this meta-analysis used more than one outcome measure. Other meta-analyses (e.g., Smith et al., 1980) included all nonredundant measures in the analysis. The main problem with this approach is that the analysis is then based on dependent effect sizes, biasing the results by giving more weight to studies with multiple effect sizes. To avoid this bias, the present study averaged the effect sizes within each comparison of RET to produce a single statistic, preserving the independence of each comparison.
Coding Procedures
After converting the study statistics to effect sizes, study characteristics were examined and coded using a 28-variable coding scheme. These coding variables included the year of the comparison (either publication or acceptance date) and the level of therapist training. A complete list of the articles and dissertations used in the present meta-analysis is available from the author for a $5 fee to cover printing, postage, and handling.
Client or subject problem diagnosis was also coded, based on a coding procedure originally presented in Smith et al. (1980).
- Neurotic: subjects had the following problems: personal growth, achievement problems, social anxiety, excessive anger, assertion, depression, students with behavior problems, speech anxiety, potential school and college dropouts, etc.
- Phobic: subjects were diagnosed as suffering from some form of phobia, including simple and complex phobias, agoraphobia, etc.
- Normal: participants had no immediately discernible problem
- Emotional/Somatic: subjects with asthma; sexual dysfunction; insomnia; obesity; migraine headaches; chronic heart disease; and individuals on home dialysis
- Unknown/Unclassified: participants who could not be classified into any of the previous categories; it also included those problems where numbers did not merit a separate classification group.
To determine the effectiveness of RET against other therapies, comparison groups were coded according to the therapies they received:
- Baseline comparison: the pretreatment measure taken at the start of therapy.
- No Treatment Control (NTC): comparison against a group not receiving therapeutic intervention.
- Waiting List Control (WLC): groups whose participants did not receive therapy but who were given the expectation they were on a waiting list for treatment.
- Attention Control/ Placebo (ACP): groups where relaxation or a placebo treatment was given in order to control for non-specific treatment effects.
- Cognitive Behavior Modification (CBM): therapies using the techniques and theories of behavior and personality change based on Bandura (1977), Meichenbaum (1977), Mahoney (1974) and others.
- Behavior Therapy: therapies using principles and procedures based on Learning Theory, including treatments based on Systematic Desensitization, exposure techniques, behavior modification, and other conditioning based procedures.
- Other/Unclassified: other therapies including psychodynamic, Gestalt, humanistic, Adlerian, Reality therapy, vocational/personal development counseling, and undifferentiated counseling. There were very few studies using these therapies, making it impossible to discern any reliable difference between groups.
To determine whether the effectiveness of RET was a function of the degree of similarity to strict Rational Emotive Therapy, two separate coding schemes were used. First, RET studies were classified into comparisons using strict RET methods; Systematic Rational Restructuring or another similar therapy; or CBM treatment procedures which relied on many RET techniques. Second, a rating scheme was derived to assess the degree of similarity of the treatment group's therapy to RET. This rating scheme was a six-point Likert scale from 0 (no elements of RET) to 5 (all elements of RET). The studies were rated on various salient features of RET, such as the identification, disputation, and modification/replacement of irrational beliefs; homework assignments; and collaborative empiricism between therapist and client. Both the treatment and comparison groups were coded in this manner.
Subject and therapist assignment to treatment and comparison groups were also coded. For the subjects, the categories included
- Random Assignment, where the participants were assigned randomly to either the treatment or comparison groups
- Matching, where a subject was matched with another from the opposite group
- Non-Random Assignment, where ex post facto matching, covariance adjustments, equating on pretest scores, or order of participation in the study was used
- Unknown, including those studies which did not mention the method of subject assignment to groups.
Therapist assignment used a similar coding procedure with an additional category of comparisons using a Single Therapist for each treatment condition.
Subject recruitment was also coded, using procedures adapted from Smith et al (1980). Studies were classified according to the following criteria:
- The participants recognized the existence of a problem and sought help
- The subjects responded to an advertisement
- Subjects were directly solicited by the therapist, typically by offering treatment to psychology students with extreme scores on a criterion measure
- Individuals were referred for treatment by a third party
- Participants were committed for therapy, with no choice, as in court ordered treatments.
A rating scheme was developed to assess the effect of variations in internal validity on effect size.
- High validity: random assignment for subjects and therapists, a low estimated attrition rate (≤15% dropout rate), and outcome measures deemed low in reactivity
- Medium internal validity: random assignment or matching for participants and therapists, but high estimated attrition rates (>15%) or no mention of the dropout rate, and outcome measures rated medium or low in reactivity
- Low internal validity: non-random assignment (other than matching), a high estimated dropout rate or no mention of the attrition rate, and medium or highly reactive outcome measures.
To assess outcome measure characteristics of the sample, two coding schemes were used according to the type of test used for assessment. Coded outcome measures included:
- Fear and anxiety measures, like Behavioral Approach Tests, and anxiety questionnaires
- Standardized tests and measures in common use, like the Irrational Beliefs Test (Jones, 1968) and the Beck Depression Inventory (Beck, 1978)
- Physiological measures, like galvanic skin response, heart rate, and EEG;
- Unclassified: measures which could not be assigned to any of the previous categories, or where there were not enough comparisons to merit a separate classification.
Test reactivity refers to the degree of similarity between the treatment and the test measures. The reactivity of the outcome measures was assessed using a rating scheme adapted from Smith et al (1980).
- Highly reactive measures revealed or had a direct and obvious relationship with the treatment. This category also included nonblind symptom ratings by the therapist; behavioral approach tests assessed by the experimenter; and, in the case of RET-oriented treatments, irrational beliefs tests
- Medium reactive measures were defined as standard tests and measures with a minimal connection to the therapy. Examples included the MMPI, the Beck Depression Inventory, and the State-Trait Anxiety Inventory (Spielberger, Gorsuch, & Lushene, 1970) for therapies not explicitly treating depression or anxiety
- Low reactive measures included those tests and measures not easily influenced by the parties involved. Examples include galvanic skin response and other physiological measures; grade point average; blind ratings and decisions; and blind discharge from hospital.
Results
Table 1 presents several demographic characteristics of the studies examined in the present quantitative review. Year of publication ranged from 1970 to 1988, with a median of 1978.5. Publication year was not related to effect size. The median training level for the therapists was 5.000 (PhD candidate or psychiatric resident). Therapist training was significantly related to effect size. The number of subjects per comparison was also significantly related to effect size: smaller numbers of participants per comparison group were associated with larger effect sizes. The mean age of the subjects and the percentage of male subjects were not related to effect size.
Table 1. Demographic Characteristics of the Sample

Variable | M | SD | Range | r_d |
---|---|---|---|---|
Publication Year | 1978.975 | 3.345 | 1970 - 1988 | .002 |
Number of Therapists | 2.261 | 1.763 | 1 - 8 | .031 |
Subjects | 26.657 | 18.841 | 5 - 115 | -.224** |
Therapist Training¹ | 5.22 | 0.619 | 4 - 6 | .313*** |
Age | 25.014 | 11.280 | 9 - 70 | .059 |
% Male Subjects | 42.385 | 3.345 | 0 - 100 | .082 |

¹Therapist Training Ratings: 4 = Master of Arts degree; 5 = PhD Candidate or Psychiatry Resident; 6 = PhD therapist or Psychiatrist with at least 1 year of experience beyond the granting of the degree. **p < .01, ***p < .001
No significant differences were found in terms of effect sizes between studies using psychotherapy clients or students as participants (t(234) = 0.710, ns), as shown in Table 2.
Table 2. Type of Subjects in RET Outcome Studies

Subject Category | Nes¹ | Percentage of Sample | d² | SD |
---|---|---|---|---|
Student | 142 | 60.2 | .914a | .914 |
Psychotherapy Client | 94 | 39.8 | 1.002a | .959 |

¹Number of Effect Sizes. ²Effect sizes with different superscript letters are significantly different at the p < .05 level using Duncan's Multiple Range Test.
Significant differences were found between the different diagnostic groups (F(4, 231) = 6.443, p < .001), as shown in Table 3. Comparisons using participants with emotional-somatic problems had significantly higher effect sizes than comparisons in the other diagnostic categories. Comparisons in the neurotic category had significantly higher effect sizes than those using normal subjects. No other difference between comparisons was significant.
Table 3. Subject Diagnosis

Diagnosis | Nes¹ | Percentage of Sample | d² | SD |
---|---|---|---|---|
Neurotic | 105 | 44.5 | .989a | 1.022 |
Phobic | 85 | 36.0 | .821ab | .725 |
Normal | 21 | 8.9 | .523b | .287 |
Emotional/Somatic | 16 | 6.8 | 1.922c | 1.260 |
Unclassified | 9 | 3.8 | .953ab | .836 |

¹Number of Effect Sizes. ²Effect sizes with different superscript letters are significantly different at the p < .05 level using Duncan's Multiple Range Test.
Table 4 shows other study characteristics associated with the sample. The degree of similarity of the therapy to RET was found not to be related to d, for both the treatment and comparison groups. Duration of therapy, in terms of both the number of hours and the number of weeks of therapy, was significantly related to effect size.
Table 4. Other Treatment Characteristics

Variable | M | SD | Range | r_d |
---|---|---|---|---|
Experimental Group Similarity to RET Rating | 4.165 | .964 | 2 - 5 | .101 |
Comparison Group Similarity to RET Rating | 0.199 | .743 | 0 - 5.0 | .135 |
Therapy Duration (Hours) | 10.205 | 9.018 | 1 - 45.0 | .335*** |
Therapy Duration (Weeks) | 6.171 | 4.078 | 1 - 18.5 | .188** |
Table 5 presents the effect size estimates broken down by comparison group. To facilitate the analysis and understanding of these results, a Binomial Effect Size Display (BESD; Rosenthal & Rubin, 1982) was also employed.
The BESD displays the change in improvement rate (or success rate, survival rate, etc.) attributable to a given treatment intervention. In other words, the BESD is the estimated difference in the probabilities of improvement between the treatment and control groups, or between pre- and post-intervention. It is defined as BESD = (.50 - r/2) to (.50 + r/2), where r is a point-biserial correlation. For example, an effect size of d = .872 (r = .40), when expressed as a BESD, shows that the improvement rate prior to intervention is 30%, while after the intervention the improvement rate increases to 70%.
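As a rough sketch of this conversion (assuming the equal-n formula r = d/√(d² + 4) for the d-to-r step), the worked example above can be reproduced as:

```python
import math

def besd(d):
    """Convert d to a point-biserial r (equal-n assumption), then to the
    Binomial Effect Size Display's pre/post improvement-rate bounds."""
    r = d / math.sqrt(d ** 2 + 4)
    return 0.50 - r / 2, 0.50 + r / 2

pre, post = besd(0.872)
print(f"{pre:.0%} -> {post:.0%}")   # the paper's d = .872 example: 30% -> 70%
```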
Table 5. Effect Sizes by Comparison Group

RET Vs. | Nes¹ | Percentage of Sample | d² | SD |
---|---|---|---|---|
Baseline | 88 | 37.3 | 1.371a | .874 |
NTC³ | 31 | 13.1 | .975bc | .914 |
WLC | 28 | 11.9 | 1.024bc | .852 |
ACP | 21 | 8.9 | .803bc | .738 |
CBM | 13 | 5.5 | .137d | .347 |
Behavior Therapy | 38 | 16.1 | .298d | .580 |
Unclassified | 17 | 7.2 | .848bc | 1.309 |
Overall d | 236 | 100 | .949 | .933 |

¹Number of Effect Sizes. ²Effect sizes with different superscript letters are significantly different at the p < .05 level using Duncan's Multiple Range Test. ³NTC = No Treatment Control; WLC = Waiting List Control; ACP = Attentional Control/Placebo; CBM = Cognitive Behavior Modification.
The overall effect size was .949. Using the BESD, 27.2% of the sample would have demonstrated significant improvement before therapy. Following RET, 72.8% of the sample demonstrated significant improvement over those subjects not receiving RET.
In comparing RET to all other treatment conditions, a one-way ANOVA indicated significant differences in effect sizes between the various comparisons (F(6, 229) = 9.624, p < .001). A Duncan's Multiple Range test indicated that comparisons with baseline conditions had significantly higher effect sizes than all other comparison groups, except for those comparisons with a waiting list control group. Compared to baseline, the mean effect size was 1.371. Using the BESD indicator, the pretherapy clinical improvement rate was 21.5%. Following RET intervention, the improvement rate was 78.5%.
Comparisons with therapies using CBM or Behavior Therapy demonstrated the lowest mean effect sizes of the sample. A Duncan's Multiple Range test indicated that CBM and Behavior Therapy had significantly lower effect sizes than any of the other treatment conditions.
Table 6 shows the mean effect sizes for individual and group therapy formats. No significant differences were found between comparisons using an individual or a group therapy format (t(234) = 0.940, ns).
Table 6. Treatment Mode

Mode | Nes¹ | Percentage of Sample | d² | SD |
---|---|---|---|---|
Individual | 29 | 12.3 | 1.102a | .826 |
Group | 207 | 87.7 | .927a | .947 |

¹Number of Effect Sizes. ²Effect sizes with different superscript letters are significantly different at the p < .05 level using Duncan's Multiple Range Test.
Tables 7 to 9 present the analyses of the methodological characteristics of the sample. Table 7 presents the subject and therapist assignments to treatment and comparison groups. No significant differences were found among any of the subject assignment categories (F(3, 232) = 0.580, ns) or therapist assignment categories (F(4, 231) = 1.804, ns).
Table 7. Subject and Therapist Assignment to Treatment and Control Groups

Assignment Category | Nes¹ | Percentage of Sample | d² | SD |
---|---|---|---|---|
A) Subject Assignment | | | | |
Random | 189 | 80.1 | .988a | .970 |
Matching | 19 | 8.1 | .746a | .858 |
Non-Random Assignment | 17 | 7.2 | .838a | .806 |
Unknown | 11 | 4.7 | .801a | .478 |
B) Therapist Assignment | | | | |
Random | 31 | 13.1 | 1.250a | 1.420 |
Matching | 33 | 14.0 | 1.062a | 1.236 |
Non-Random | 16 | 6.8 | .816a | .534 |
One Therapist | 77 | 32.6 | .991a | .676 |
Unknown | 79 | 33.5 | .770a | .800 |

¹Number of Effect Sizes. ²Effect sizes with different superscript letters are significantly different at the p < .05 level using Duncan's Multiple Range Test.
Table 8 presents the subject solicitation data. Significant differences between solicitation categories were found (F(4, 226) = 3.020, p < .02). Participants referred by a third party had significantly higher effect sizes than any other solicitation category. No other differences were found to be significant.

Table 8. Subject Solicitation

Solicitation Category | Nes¹ | Percentage of Sample |
---|