The Current State of the Empirical Evidence for Psychoanalysis: A Meta-analytic Approach : Harvard Review of Psychiatry



The Current State of the Empirical Evidence for Psychoanalysis

A Meta-analytic Approach

de Maat, Saskia PhD; de Jonghe, Frans PhD; de Kraker, Ruth MSc; Leichsenring, Falk PhD; Abbass, Allan MD; Luyten, Patrick PhD; Barber, Jacques P. PhD; Van, Rien MD, PhD; Dekker, Jack PhD

Harvard Review of Psychiatry 21(3):p 107-137, May/June 2013. | DOI: 10.1097/HRP.0b013e318294f5fd



As a therapeutic discipline, psychoanalysis encompasses both short- and long-term treatment modalities, as presented schematically in Figure 1.

Figure 1:
Psychoanalytic treatment: Modalities.

The two long-term variants are long-term psychoanalytic psychotherapy (LTPP) and psychoanalysis. The criterion most frequently used to differentiate between the two long-term modalities is the therapeutic setting, with the main features being the frequency of the sessions and the physical positions of the patient and the therapist. It is understood that in LTPP both patient and therapist are sitting on chairs facing each other, whereas in psychoanalysis the patient is lying on a couch, and the therapist is sitting on a chair behind him or her. LTPP sessions usually occur once or twice a week; in psychoanalysis the frequency ranges from two to five sessions a week. In this article we concentrate on psychoanalysis proper.

Research has fairly well established the efficacy of LTPP; for example, Shedler,1 Leichsenring,2 and colleagues have conducted meta-analyses, pooling evidence from multiple studies and calculating pooled effect sizes (ESs). Though considerably less evidence is available concerning the efficacy of psychoanalysis, its effectiveness has been researched repeatedly. Several reviews and overviews have shown large ESs and concluded that between 60% and 90% of the patients for whom psychoanalysis is indicated achieve clinically significant change.3–8 Nevertheless, no meta-analysis has been performed that systematically pooled data specifically on psychoanalysis. Given that psychoanalysis is a long-term, intensive, and expensive treatment, such an analysis of the available empirical data is urgently needed. In this article we present the first meta-analysis of studies examining the effectiveness of psychoanalysis.

Characteristics of Psychoanalysis

Although psychoanalysis is considered a therapy requiring very frequent sessions, there is no universal agreement on the number of sessions. The International Psychoanalytic Association endorses three psychoanalytic training models.9 According to the Eitingon model, a frequency of four to five sessions per week is required; in the French model, the session frequency is decided by the analyst and the patient; and in the Uruguayan model, a minimum of three sessions a week is required. In Germany, a frequency of two to three sessions a week is commonly employed, with the patient lying on the couch. This format is called Analytische Psychotherapie, and face-to-face, long-term psychoanalytic psychotherapy is known as Tiefenpsychologisch fundierte Psychotherapie.10 To respect this international variance we opted for a broad definition, including studies in which (1) the patient is lying on a couch, with (2) two to five sessions a week. We performed separate sub-analyses of studies based on treatment frequency (divided into two groups): two to three sessions per week (on average, less than three per week), and three or more sessions per week (on average, three or more per week).

All psychoanalytic therapies, including psychoanalysis, are rooted in the psychoanalytic theories. Gabbard11 outlines the basic principles as follows: much of mental life is unconscious; childhood experiences, in concert with genetic factors, shape the adult; the patient’s transference to the therapist is a primary source of understanding the patient’s character and pathology; the therapist’s countertransference provides valuable information about what the patient induces in others; the patient’s resistance to the therapy process is a major focus of the therapy; symptoms and behaviors serve multiple functions and are determined by complex and often unconscious forces; and the therapist assists the patient in achieving a sense of authenticity.

Despite these “common grounds,” there is presently no single, all-encompassing psychoanalytic theory, only many partial theories. These theories can be roughly classified into “classical” and “post-classical” views. The classical views (Sigmund Freud and the “Freudians,” Melanie Klein and the “Kleinians,” and the “British Independents”) see intrapersonal conflict as central. Whether referred to as ego psychology, a structural model, a drive-defense model, or a one-person psychology, these approaches concentrate on the triadic relationships of the “oedipal situation,” characterized by sexual and aggressive needs. The post-classical views (with such forerunners as Ferenczi, Balint, and Sullivan, the leaders of relational, interpersonal, and intersubjective psychoanalysis, respectively) are developmental theories that focus on “developmental needs,” including the needs to feel connected, seen, understood, loved, appreciated, and protected. Also referred to as a two-person psychology, these approaches concentrate on the dyadic relationships of infancy. In present-day psychoanalysis, the classical and post-classical views coexist. They are not only compatible with, but also complementary to, each other.

Personality pathology is a crucial concept in psychoanalytic thinking.12 Psychoanalytic diagnostics basically differentiate between two main forms of personality problems: developmental pathology and conflict pathology (see, for example, Fonagy & Moran).13 Broadly speaking, these two types of problems differ in two ways. The first difference concerns the dating of the origins of the pathology: developmental pathology relates to problems stemming from early childhood (before the fifth year), whereas conflict pathology relates to problems originating in childhood (around the fifth year and later). The second difference concerns the sort of innate human needs that the pathology mainly pertains to: developmental pathology focuses on developmental needs such as attachment needs, the need to be valued, seen, and loved, whereas conflict pathology considers the needs of sexuality and aggression. The two kinds of personality pathology do not exclude one another. Most patients present with both developmental pathology and conflict pathology.

Fundamental personality change is considered the goal of psychoanalysis, although its conceptualization depends on the theoretical approach used. It can be summed up as personality growth leading to more differentiation (e.g., of self vs. other, or fantasy vs. reality) and greater integration (of aspects of the self). In psychoanalytic terms, the changes in personality are described as “structural change,” “personality change,” “personality reconstruction or construction,” or the development of a “cohesive,” “adult,” “integrated” self, resulting, among other things, in a greater sense of inner freedom. The purpose of this fundamental change is ultimately for the patient to achieve symptom reduction, prevention of recurrence, better social functioning, and higher quality of life (all persisting after treatment termination).

Psychoanalysis is indicated for patients with “complex mental disorders”14—usually a combination of long-standing, often unsuccessfully treated DSM-defined Axis I disorders (most often, mood disorders) and Axis II personality disorders.15 Several studies show that patients for whom psychoanalysis is indicated suffer from these complex mental disorders.14,16–19

For psychoanalytically trained clinicians, a DSM diagnostic classification is insufficient for a complete diagnosis and treatment choice. These clinicians aim to describe the personality structure of patients in terms of essential psychoanalytical concepts such as defense mechanisms, conflicts, internal object relations, and intrapsychic functioning. In addition, an attempt is made to offer hypotheses explaining the development, maintenance, and recurrence of pathology. In broad psychoanalytic terms, psychoanalysis is useful for moderate to severe conflict pathology and mild developmental pathology.


Research in psychoanalysis is complex to conduct (see de Jonghe et al.).20 The treatments are of considerable length, making it difficult to randomize patients to control conditions that are substantially different from psychoanalysis (see also the discussion section). Study periods that include follow-up are long; the research requires significant funding; and the number of patients is limited. In addition, it is difficult to capture, whether in questionnaires, self-reports, or interviews, the process and outcomes that are considered relevant by psychoanalysts. Some analysts would even argue that doing so is impossible and consider the researcher an “unwanted third” in the treatment. Especially due to the problems of randomization, almost no randomized, controlled trials (RCTs) have been conducted in the field of psychoanalysis. Most studies on psychoanalysis follow a cohort of patients for whom psychoanalysis is indicated, and present pre/post changes. Early research in this field often defined outcome in terms of the therapist’s clinical judgment, reflecting a judgment concerning improvements in personality structure and growth. More recently, measurement instruments have become more common, as have RCTs.


Search Strategy

An extensive literature search was conducted using different search methods. First, we searched the electronic databases PubMed, PsycInfo, Embase, Cochrane Database of Systematic Reviews, and the Cochrane Central Register of Controlled Trials. The time frame was between January 1970 and December 2009. The following search terms were used: psychoanalysis (OR psychoanalytic OR analytic), psychodynamic (OR dynamic OR interpretive OR insight-oriented), therapy (OR psychotherap* OR counseling), long-term (OR open-ended OR LTPP) and treatment outcome (OR outcome OR effective* OR efficacy). The complete search terms are available on request. No limits were set on language. Second, an Internet database of controlled and comparative outcome studies on psychological treatments of depression was searched.21 Third, a manual search was performed on the Open Door Review6 and other reviews and meta-analyses.1–5,7,14 Cross-references in the retrieved publications were tracked down. For the time period of 2010–11, we did not perform the literature search again; instead, we contacted authors of studies that were known to us but whose findings had not yet been published. This third process resulted in two extra studies.

Selection of Studies

The following inclusion criteria were applied:

  • The studies were “outcome-intervention studies.” The outcomes had to be measured in terms of symptom reduction or personality change. Issues such as process variables were excluded from this review. Outcome measures had to be reliable and valid, as supported by at least one study on their reliability.
  • Studies had to report on completed treatments; studies in which large proportions (more than 25%) of treatments were still ongoing were excluded.
  • Studies had to provide ESs; means and standard deviations on measurements; or percentages of patients achieving clinically significant change.
  • The studies were required to be RCTs; prospective, pre/post cohort studies (with or without comparison groups); or cross-sectional studies that included a minimum of ten subjects. Case studies or case series were excluded, as were retrospective studies such as surveys.
  • The studies were required to include adult patients (18 to 65 years of age).
  • The studies had to include only patients with the most “common” (i.e., the most frequently seen in clinical practice) indications for psychoanalysis (i.e., DSM diagnoses [Axis I or II] or psychoanalytically specified symptoms or personality problems). Studies focusing on purely somatic or psychotic disorders were excluded.
  • The treatment was psychoanalysis, characterized as follows: (1) patients were lying on the couch, with (2) two to five therapy sessions a week. Whenever it was uncertain whether the treatment was psychoanalysis (so defined), we contacted the authors of studies to determine the type of treatment.

Identification of Relevant Publications and Quality Assessment

Using the selection criteria, two independent judges (SdM and FdJ) reviewed the titles and abstracts generated from the searches. Disagreement was discussed and resolved by consensus. In case of continued disagreement, a third reviewer was consulted (JD). Titles and abstracts identified as potentially relevant were retrieved for full-text review. Two independent raters then examined whether the full-text articles met the inclusion criteria. Disagreement was discussed and resolved by consensus. Studies with unresolved disagreement were reviewed by a third rater.

Two reviewers (SdM and FdJ) evaluated the quality of the studies independently using a Research Quality Score rating system (see Appendix 1). This rating system (developed by the reviewers SdM and FdJ) follows the research criteria postulated by the Cochrane Collaboration and other researchers.22,23 This system assesses aspects of the study design, patients included, interventions, outcome data, statistics, dropout, and follow-up, and reflects the current standards of evidence-based medicine. Maximum scores and cutoff scores are mentioned in Appendix 1. Studies with unresolved disagreement were reviewed by a third rater (JD). We did not calculate interrater reliability.


We performed different meta-analyses, assessing pre- to post-treatment change and pre-treatment to follow-up change, applying these analyses to measurements of overall functioning, symptoms, and personality and psychosocial functioning. The pre- to post-treatment ES was calculated by subtracting the average post-treatment score from the average pre-treatment score and then dividing the result by the pooled standard deviation of the two sets of scores. The pre-treatment to follow-up ES was calculated in the same way, by subtracting the average follow-up score from the average pre-treatment score and then dividing the result by the pooled standard deviation of the two sets of scores. ESs of ≥0.20 are considered small; ≥0.50, medium; and ≥0.80, large (see Cohen),24 but these qualifiers applied originally to between-group ESs. Furthermore, Cohen also stated explicitly that the qualifiers were based on his experience and were not empirically defined. Finally, pre-to-post ESs are usually larger than between-group ESs. For these reasons, we avoid using the qualifiers.
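The pre/post calculation described above can be illustrated with a minimal sketch. It assumes that the pooled standard deviation is the root mean square of the pre- and post-treatment standard deviations (equal sample sizes at both time points); the text does not spell out the pooling formula, and the function names and scores below are hypothetical, not data from any included study.

```python
import math


def pooled_sd(sd_pre: float, sd_post: float) -> float:
    """Pooled SD of two time points, assuming equal n at both (an assumption)."""
    return math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2.0)


def pre_post_effect_size(mean_pre: float, sd_pre: float,
                         mean_post: float, sd_post: float) -> float:
    """(pre-treatment mean - post-treatment mean) / pooled SD."""
    return (mean_pre - mean_post) / pooled_sd(sd_pre, sd_post)


# Hypothetical symptom scores (lower = fewer symptoms).
d = pre_post_effect_size(mean_pre=20.0, sd_pre=8.0, mean_post=12.0, sd_post=6.0)
print(round(d, 2))  # 8 / sqrt(50) -> 1.13
```

The same function applies to the pre-treatment to follow-up ES by passing the follow-up mean and standard deviation in place of the post-treatment values.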

An overall mean ES was calculated on the basis of all outcome measures used in a study. ESs for symptom measures and for personality and psychosocial functioning measures were calculated separately. For calculating the overall ES, a mean ES was calculated for each study that presented more than one ES. The ESs or the means and standard deviations (whichever was presented) of individual studies were used, in turn, as the basis for calculating an overall mean ES, with the individual study ESs weighted to reflect the study’s sample size. To calculate the pooled mean ESs, we used the statistical computer program Comprehensive Meta-analysis.25 We computed the pooled mean ESs using the random-effects model because considerable heterogeneity of the included studies was expected.26 In the random-effects model, the included studies are seen as a sample drawn from a population of studies, rather than replications of each other, so that not only the random errors within the studies, but also the true variations of ESs from one study to the next, are taken into account. The random-effects model therefore results in broader 95% confidence intervals (CIs) and more conservative results. Most studies did not report within-group correlations (correlations across time points). Therefore we used Cohen’s d for the repeated-measures comparisons, as recommended by Dunlap and colleagues.27
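To make the random-effects idea concrete, the following sketch pools study ESs under the DerSimonian-Laird estimator of between-study variance. This estimator is an assumption: the text states only that a random-effects model was fitted in Comprehensive Meta-analysis. The function name and all input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Return (pooled ES, 95% CI, tau^2) using the DerSimonian-Laird estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]               # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance: true variation of ESs from study to study.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Two hypothetical studies with equal precision but different ESs.
pooled, ci, tau2 = random_effects_pool([0.5, 1.5], [0.01, 0.01])
print(round(pooled, 2), tuple(round(x, 2) for x in ci), round(tau2, 2))
```

In this example the fixed-effect standard error would be about 0.07, whereas the random-effects standard error is 0.5, which shows how adding the between-study variance widens the confidence interval.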

Finally, we calculated between-group ESs, comparing posttest means of psychoanalysis groups with means of nonclinical norm groups (when the latter were available).
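A between-group ES of this kind can be sketched as follows, assuming the conventional sample-size-weighted pooled standard deviation for two independent groups. The text does not give the formula, and the function name and values are hypothetical.

```python
import math


def between_group_d(mean_tx: float, sd_tx: float, n_tx: int,
                    mean_norm: float, sd_norm: float, n_norm: int) -> float:
    """Standardized difference between a treated group's posttest mean and a norm-group mean."""
    pooled_var = (((n_tx - 1) * sd_tx ** 2 + (n_norm - 1) * sd_norm ** 2)
                  / (n_tx + n_norm - 2))
    return (mean_tx - mean_norm) / math.sqrt(pooled_var)


# Hypothetical posttest scores for a treated group and a nonclinical norm group.
print(round(between_group_d(12.0, 2.0, 50, 10.0, 2.0, 50), 2))  # -> 1.0
```

An ES near zero would indicate that posttest scores are close to those of the nonclinical norm group.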

Tests for heterogeneity were calculated by using the Q-statistic.28 A significant Q-value rejects the null hypothesis of homogeneity. We also calculated the degree of heterogeneity in percentages, using the I²-statistic.29 A value of 0% indicates no observed heterogeneity; a value of 25%, low heterogeneity; and values of 50% and 75%, moderate and high heterogeneity, respectively.30
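The two heterogeneity measures can be computed from study ESs and their variances as in the sketch below. It uses the standard definitions (Q as the weighted sum of squared deviations from the fixed-effect mean, and the heterogeneity percentage as (Q − df)/Q, floored at zero); the function name and inputs are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (Q, degrees of freedom, heterogeneity percentage)."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # Share of total variation attributed to between-study differences.
    pct = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, pct


# Two hypothetical studies whose ESs differ far more than sampling error allows.
q, df, pct = heterogeneity([0.5, 1.5], [0.01, 0.01])
print(round(q, 1), df, round(pct, 1))
```

The significance of Q is judged against a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies; that step is omitted here because the Python standard library has no chi-square function.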

Publication bias was tested according to Duval and Tweedie’s trim-and-fill procedure31 using Comprehensive Meta-analysis. This procedure uses funnel plots (a distribution of the expected studies in a field, based on study sizes and their expected ESs) to estimate the number of “missing studies” in a meta-analysis and the effect that these studies may have had on its outcome. The method yields an estimate of the ES after publication bias has been taken into account, meaning that the ESs expected to belong to the “missing studies” are taken into account. Adjusted values of the pooled mean ESs and 95% CIs are then calculated and compared to the original findings of the meta-analysis. In this procedure, we also used the random-effects model.

A secondary outcome measure for the meta-analysis was clinically significant change measured at treatment termination (pre/post treatment) and at follow-up (pre/follow-up). The definitions of clinically significant change are mentioned in Table 8.

Subgroup Analyses

Subgroup analyses were carried out using Comprehensive Meta-analysis.25 Studies were divided into two or more subgroups. Initially, a pooled mean ES was calculated for each subgroup. It was then determined whether the pooled mean ESs differed significantly between subgroups. The mean pooled ESs were computed using the mixed-effects method of subgroup analyses, which pools studies within subgroups according to the random-effects model, but tests for significant differences between subgroups according to the fixed-effects model.
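The between-subgroup test can be sketched for the simplest case of two subgroups. The sketch assumes that each subgroup has already been pooled under the random-effects model, yielding a mean ES and a standard error, and that the subgroup means are then compared with a fixed-effect Q-between statistic. The function names and numbers are hypothetical, and the p-value shortcut holds only for two subgroups (one degree of freedom).

```python
import math


def q_between(subgroup_means, subgroup_ses):
    """Weighted sum of squared deviations of subgroup means from their grand mean."""
    w = [1.0 / se ** 2 for se in subgroup_ses]
    grand = sum(wi * mi for wi, mi in zip(w, subgroup_means)) / sum(w)
    return sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, subgroup_means))


def p_value_two_subgroups(q: float) -> float:
    """Upper-tail chi-square probability with 1 df: P(X > q) = erfc(sqrt(q / 2))."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical pooled ESs and standard errors for two subgroups.
q = q_between([1.0, 2.0], [0.5, 0.5])
print(round(q, 2), round(p_value_two_subgroups(q), 3))  # -> 2.0 0.157
```

With more than two subgroups, Q-between would be referred to a chi-square distribution with (number of subgroups − 1) degrees of freedom.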

The following subgroup analyses were conducted, based on the following:

  • Study quality: studies with higher quality scores versus studies with lower quality scores
  • Study design: prospective studies versus studies that had a cross-sectional design (the latter included different patient groups at the beginning and at the end of treatment)
  • Continent of study: Europe versus North America
  • Frequency of sessions: studies with two to three sessions (with an average below three) per week and studies with three or more sessions per week (with an average of three or more)
  • Duration of follow-up: studies with follow-up of up to one year versus studies with follow-up of more than one year
  • Symptom-specific sub-analyses: for all studies, only instruments for measuring depression
  • Across all studies, patient ratings versus therapist ratings versus observer ratings


Results of the Literature Search: Trial Flow

A flow chart showing the process of study selection is given in Figure 2. After screening titles and abstracts, 164 titles were requested in full text and screened by two raters. Three studies were excluded based on language barriers, and 134 more based on full-text screening. The most important reasons for exclusion were that the studies addressed theoretical issues or presented case descriptions. Twenty-seven studies remained, of which 13 were excluded for methodological or other reasons. The remaining 14 studies16,32–70 were included in the meta-analysis. Table 1 presents the study characteristics of the included studies. Ten studies presented data to calculate mean ESs. Table 2 presents the characteristics of, and reasons for exclusion of, the 13 excluded studies.48(retrospective part of study),71–83 We contacted some authors for additional data and received data from the following: Caspar Berghout and Jolien Zevalkink; Dorothea Huber and Günther Klug; Henriette Löffler-Stastka; Rolf Sandell; and Paul Knekt. Nine studies presented percentages of clinically improved patients and were therefore included in the secondary outcome measures.

Figure 2:
Flow chart of studies identified for review.
Table 1:
Included Studies on Psychoanalysis
Table 2:
Excluded Studies on Psychoanalysis

Study Characteristics

Of the 14 studies (total n = 603) included in our meta-analysis, 13 were prospective cohort studies, and 1 an RCT.51,52 The study of Knekt and colleagues60,84 included an RCT for three types of psychotherapy and a prospective cohort design for psychoanalysis. The number of patients in the studies varied from 17 to 92. The number of sessions (for any particular patient) ranged from 234 to 971, and the duration of analysis from 2.5 to 6.5 years. Five studies were conducted in the United States, the remainder in Europe; 5 of the 9 European studies were conducted in Germany. Three studies16,32–47 applied a cross-sectional design, assessing different patient groups at pre-treatment, post-treatment, and follow-up.

The quality of the studies varied. First, the sample included only one RCT,51,52 and although some studies followed both a psychoanalysis group and a psychotherapy group, these groups were not controlled against other treatment groups. Second, the measurement instruments varied considerably. Appendix 2 contains a list of all the instruments used. Third, outcome measures varied. In one study, only the therapist rated post-treatment outcome.47 In another study, both patients and therapists rated post-treatment outcome.67 In all other studies, including all follow-up measurements, only patient and independent ratings were included. Analyses were also performed without the studies that did not include independent raters (see section “Pre/Post Effectiveness of Psychoanalysis” below). In a separate sub-analysis we compared patient ratings with independent ratings and therapist ratings (Table 6). Fourth, six studies did not present follow-up results, and of those studies that did, the follow-up periods were relatively short (between 1 and 3.5 years). Fifth, treatment was not manualized in any form, and treatment adherence was not monitored in any study. Systematic descriptions of treatments (mean number of sessions, plus duration) were missing in three studies and had to be estimated. Finally, five studies did not report on dropouts systematically, and all studies but one provided completers-only outcome analyses.60 Overall intention-to-treat analyses were therefore not possible to calculate.

Diagnostic Characteristics

Ten studies presented DSM-III/IV or ICD-9/10 diagnoses. Two studies applied a form of psychoanalytic diagnostic criteria such as the Structural Interview of Kernberg68 or neurotic or non-neurotic personality organization.53 Two studies mentioned only that the patients were “suitable” for psychoanalysis48,63—meaning (at least in general) that a psychoanalyst, after careful clinical evaluation of a patient, believed that the strengths and weaknesses of a patient’s personality structure warranted psychoanalysis.

The diagnostic characteristics of the patients in the included studies matched those found in the research of Doidge,17,18 Caligor,19 and their colleagues. Patients suffered from comorbid Axis I and Axis II disorders. Depressive disorders (range, 27%–100%) and anxiety disorders (range, 39%–100%) were found most frequently. On average, 77% of the patients in this meta-analysis suffered from a depressive disorder, and 50% from an anxiety disorder. Between 20% and 100% of all patients met criteria for a personality disorder, with an average of 47%. Other diagnoses included eating disorders, sexual and relational disorders, work problems, obsessive-compulsive disorders, psychosomatic complaints, and substance abuse. Four studies reported on earlier treatments, if any, of patients.16,32–46,48,60 On average, 73% of those patients had tried previous treatments. Two studies defined a “clinical case” as a patient who scored in the worst 10% clinical range on several measurement instruments.16,32–46 These two studies found that at baseline, 91% and 88%, respectively, of all psychoanalysis patients met clinical case criteria.

Refusal and Dropout Rates

Nine studies reported data on the number of patients that refused to participate in the study, did not start treatment, or dropped out of treatment (Table 3). Four studies reported how many patients refused to participate in the study (range, 13%–40%). Dropout rates ranged from 3% to 33%.16,32–46,53,60

Table 3:
Refusal and Dropout Rates with Psychoanalysis

Pre/Post Effectiveness of Psychoanalysis

Ten studies provided data for pre/post analyses. Four studies49,51,52,61,67 used a frequency of two to three sessions a week, and six studies16,32–47,60,62,63 a frequency of three or more sessions a week. The ESs and 95% CIs of the studies are plotted in Figure 3. The mean pre/post ES (Cohen’s d)24 of psychoanalysis across all studies and all measurement instruments (Table 4) is 1.27 (95% CI, 1.03–1.50; p < .01), indicating a robust effect. This effect remains fairly stable when the one-study-removed method is followed. Heterogeneity of the overall analysis is moderate and not statistically significant (I² = 38.80%). The study of Huber and Klug51,52 seems an outlier, with larger ESs than the other studies. Removing this study lowers the heterogeneity (I² = 20.20%) and also the overall ES (1.20; 95% CI, 0.98–1.41; p < .01). The study by Rudolf and colleagues67 is an outlier on the lower end of the range, with smaller ESs than the other studies. Removing this study raises the mean ES to 1.34 (95% CI, 1.12–1.56; p < .01). Removing both studies that did not use independent ratings (Cogan & Porcerelli47 and Rudolf et al.67) yields a mean ES of 1.36 (95% CI, 1.11–1.60; p < .01; I² = 29.38%).
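The one-study-removed method mentioned above re-estimates the pooled ES once per study, each time leaving that study out. The sketch below shows the loop with a plain inverse-variance weighted mean as the pooling step; this is a simplification, since the analyses reported here used the random-effects model, and any pooling function with the same signature could be passed in instead. Study labels and values are hypothetical.

```python
def inverse_variance_mean(effects, variances):
    """Simplified pooling step: inverse-variance weighted mean of the ESs."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def one_study_removed(studies, pool=inverse_variance_mean):
    """Map each study label to the pooled ES obtained without that study.

    `studies` maps a label to an (effect size, variance) pair.
    """
    results = {}
    for left_out in studies:
        rest = [pair for label, pair in studies.items() if label != left_out]
        results[left_out] = pool([es for es, _ in rest], [var for _, var in rest])
    return results


# Hypothetical studies; "Study C" is a deliberately extreme value.
studies = {"Study A": (1.0, 0.1), "Study B": (1.0, 0.1), "Study C": (4.0, 0.1)}
for label, pooled in one_study_removed(studies).items():
    print(f"without {label}: pooled ES = {pooled:.2f}")
```

A pooled ES that shifts markedly when one study is dropped flags that study as influential, which is how the outliers in this section were identified.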

Figure 3:
Meta-analysis pre/post effect sizes, overall.
Table 4:
Meta-analyses of Studies Examining the Pre- to Post-treatment Change with Psychoanalysis

The mean pre/post ES of psychoanalysis across all studies that included only symptom instruments is 1.52 (95% CI, 1.20–1.84; p < .01), indicating a robust effect. This effect remains stable when the one-study-removed method is followed. Heterogeneity of the overall analysis is moderate to large and statistically significant (I² = 65.57%), and remains so when the one-study-removed method is applied. The two outliers in this analysis are the studies of Huber and Klug (mean symptom ES = 2.27)51,52 and Rudolf (mean symptom ES = 0.87).67 Removing these two studies does not change the mean ES across the remaining studies but lowers heterogeneity to I² = 41.74% (not significant).

The mean pre/post ES of psychoanalysis across all studies that included only personality and psychosocial functioning instruments is 1.08 (95% CI, 0.89–1.26; p < .01), also indicating a robust effect, albeit somewhat lower than the ES of studies using only symptom measures. The ESs remain similar when the one-study-removed method is applied. Heterogeneity of the overall analysis is very low (I² = 8.76%), suggesting a very similar outcome across studies.

Sub-analyses showed that the difference between higher-quality studies and lower-quality studies was statistically significant (p = .01), with the former showing higher ESs (the quality score of studies was not determined by the magnitude of the ESs). No difference was found between studies using a cross-sectional design and prospective cohort studies (p = .14). We also found no significant differences in effects between studies from Europe and studies performed in the United States (p = .53). Finally, we found no differences between the four studies49,51,52,61,67 with a lower session frequency (two to three sessions a week; mean number of sessions across studies = 266) and the six studies16,32–47,60,62,63 with a higher session frequency (three to five sessions a week; mean number of sessions = 793) (p = .41 for all instruments; p = .52 for symptom instruments; p = .73 for personality and psychosocial functioning instruments).

Three studies used specific depression instruments.32–46,51,52,60 The mean ES was 1.85 (95% CI, 1.13–2.58; p < .01). Heterogeneity in this sub-analysis was high (I² = 78.97%).

Heterogeneity in the sub-analyses seemed the highest among the group of studies using lower session frequency (all German studies). This finding could be explained by the contrast between the studies of Leichsenring,61 Huber and Klug,51,52 and Grande and colleagues,49 on the one hand, and the study of Rudolf,67 on the other. Mean ESs of these studies (using all instruments) were 1.65, 1.86, 1.38, and 0.87, respectively. The first three studies mentioned, which were performed more recently and use more diverse, international measurement instruments (by contrast, Rudolf’s study uses only a German measurement instrument), consistently present higher ESs. It is not clear, however, how these differences in time and instruments affect ES.

Pre/Follow-Up Effectiveness of Psychoanalysis

Only five studies provided data regarding follow-up analyses (Table 5). The mean pre/follow-up ES (Cohen’s d)24 of psychoanalysis across all these studies and all measurement instruments is 1.46 (95% CI, 1.08–1.83; p < .01; see Figure 4), indicating that the effect of psychoanalysis at follow-up remains stable. This effect remains fairly stable when the one-study-removed method is followed. Heterogeneity of the overall analysis is moderate but not statistically significant (I² = 50.56%). Removing studies does somewhat lower the heterogeneity, with the lowest heterogeneity (I² = 25.75%) resulting from the removal of the study by Berghout, Zevalkink, and colleagues.32–46 This study has the lowest mean ES (0.90) of the follow-up studies; the other studies’ ESs were 1.20 (Sandell et al.),16 1.43 (Grande et al.),49 1.79 (Leichsenring et al.),61 and 1.97 (Huber/Klug et al.).51,52 Removing the Berghout and Zevalkink study elevates the mean ES across studies to 1.59 (95% CI, 1.25–1.93; p < .01).

Table 5:
Meta-analyses of Studies Examining the Pre- to Follow-Up Treatment Change with Psychoanalysis
Figure 4:
Meta-analysis pre/follow-up effect sizes, overall.

The mean pre/follow-up ES of psychoanalysis across all studies that included only symptom instruments is 1.65 (95% CI, 1.24–2.06; p < .01), indicating that the effect of psychoanalysis at symptom level is stable or even enlarged at follow-up. The mean pre/follow-up ES remains similar when the one-study-removed method is followed. Heterogeneity of the overall analysis is moderate (I² = 56.89%) but not statistically significant. Removing the Huber and Klug study51,52 lowers heterogeneity considerably (I² = 33.65%) and leaves the mean ES at 1.50 (95% CI, 1.12–1.87; p < .01). The Huber and Klug study is an outlier with the highest mean ES for symptom instruments (2.24); the other studies in this category have ESs of 1.25 (Berghout/Zevalkink et al.),32–46 1.58 (Grande et al.),49 2.03 (Leichsenring et al.),61 and 1.17 (Sandell et al.).16

The mean pre/follow-up ES of psychoanalysis across all studies that include only personality and psychosocial functioning instruments is 1.31 (95% CI, 1.00–1.62; p < .01), again indicating that the effects of psychoanalysis are stable at follow-up. The mean ES remains similar with the one-study-removed method. Heterogeneity of the overall analysis is low (I² = 29.55%). The study of Berghout and Zevalkink32–46 seems an outlier; removing this study lowers heterogeneity to zero and raises the mean ES to 1.43 (95% CI, 1.15–1.72; p < .01). This study has the lowest mean ES across personality and psychosocial functioning instruments (0.75); the other studies in this category had ESs of 1.29 (Grande et al.),49 1.69 (Huber/Klug et al.),51,52 1.54 (Leichsenring et al.),61 and 1.21 (Sandell et al.).16

Sub-analyses showed that at follow-up there were no differences in effects between studies that were considered higher in quality and studies lower in quality (p = .39). There was a significant difference at follow-up, however, between the studies with cross-sectional design and the other studies, with the former reporting lower mean ESs (p = .02). Since all studies reporting follow-up were conducted in Europe, no comparison could be made between European and American studies in this respect. We also found two significant differences (overall effect and symptom change) between the (German) studies with a lower mean number of sessions (mean number of sessions across studies = 266) and those with a higher mean number of sessions (mean number of sessions across studies = 810), with the latter reporting lower ESs. The difference between these studies for personality and psychosocial functioning change was a trend finding in the same direction (p = .07).

Finally, we found a significant difference between studies with follow-up periods up to one year and studies with longer follow-up periods (p < .01), indicating lower ESs with studies that included longer follow-up periods. Heterogeneity in the statistically significant sub-analyses was very low to zero, indicating that these follow-up periods were relevant sources of heterogeneity in the main analyses. The two studies using depression instruments showed a large mean ES at follow-up (1.81; 95% CI, 0.33–3.28), again presenting high heterogeneity, which was discussed earlier.32–46,51,52
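A subgroup comparison of the kind reported above (for example, shorter versus longer follow-up periods) can be sketched as a between-groups Q test with one degree of freedom. The sketch below is a simplification under stated assumptions: it pools each subgroup with fixed-effect weights, and every effect size and variance in it is a hypothetical placeholder, not a value taken from the included studies.

```python
import math

def fixed_pool(effects, variances):
    """Inverse-variance pooled mean and the variance of that mean."""
    w = [1.0 / v for v in variances]
    mean = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    return mean, 1.0 / sum(w)

def q_between_two_groups(group_a, group_b):
    """Each group is (effects, variances). Returns (Q_between, p) for df = 1."""
    mean_a, var_a = fixed_pool(*group_a)
    mean_b, var_b = fixed_pool(*group_b)
    z = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    q = z * z                                 # with two groups, Q_between equals z squared
    p = math.erfc(abs(z) / math.sqrt(2.0))    # chi-square (df = 1) upper tail
    return q, p

# HYPOTHETICAL subgroup data, chosen only to illustrate the calculation.
up_to_one_year = ([1.8, 2.0], [0.04, 0.04])
longer_follow_up = ([0.9, 1.2], [0.04, 0.04])

q_stat, p_value = q_between_two_groups(up_to_one_year, longer_follow_up)
print(f"Q_between = {q_stat:.2f}, p = {p_value:.4f}")
```

When a split of this kind is significant and the heterogeneity remaining within each subgroup is near zero, the moderator plausibly accounts for much of the heterogeneity in the main analysis.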

Comparison of Psychoanalysis Posttest Means and Means of Nonclinical Norm Groups

Seven studies16,32–46,51,52,60–62 could be used to compare posttest means of psychoanalysis against means of nonclinical groups. Table 6 presents between-group ESs.

Table 6:
Comparison of Posttest Means of Psychoanalysis and Means of Nonclinical Norm Groups

Generally, the posttest means of psychoanalysis do not differ from the means presented by nonclinical groups. Between-group ESs are small and not statistically significant. Three subscales of the Minnesota Multiphasic Personality Inventory in the Berghout and Zevalkink study show that the posttest means of patients who underwent psychoanalysis are still more elevated than those of nonclinical groups.
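The between-group ESs in this comparison can be thought of as a standardized difference between the posttest mean of the treated patients and the mean of a nonclinical norm group. The following sketch assumes the common pooled-standard-deviation form of Cohen's d with a large-sample standard error; the function name and all input numbers are hypothetical and serve only to show the calculation.

```python
import math

def cohens_d_between(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d for two independent groups, with an approximate 95% CI."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    d = (mean_1 - mean_2) / math.sqrt(pooled_var)
    # Large-sample approximation of the standard error of d.
    se = math.sqrt((n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2)))
    return d, (d - 1.96 * se, d + 1.96 * se)

# HYPOTHETICAL posttest patient scores versus a nonclinical norm group.
d, ci = cohens_d_between(10.0, 4.0, 50, 8.0, 4.0, 50)
print(f"d = {d:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

A small d whose interval includes zero corresponds to the pattern described above, in which posttest means do not differ from those of nonclinical groups.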

Ratings of Therapists Versus Patients Versus Observers

We compared all patient-rated outcomes with therapist-rated outcomes and with observer-rated outcomes (Table 7). Post-treatment and follow-up measurements were taken together. We found that therapist-rated instruments yielded the lowest ESs and that observer-rated instruments yielded the highest, with patient ratings falling in between. Only the difference between the ratings of therapists (lowest ratings) and observers (highest ratings) was statistically significant.

Table 7:
Therapist Versus Patient Versus Observer Ratings

Clinically Significant Change

Our secondary outcome was clinically significant change, indicating how many patients underwent a change that was considered clinically relevant. The criteria are presented in Tables 8a and 8b. The former presents the results measured with symptom or general instruments, and the latter shows the results measured with personality instruments.

Table 8a:
Significant Clinical Change on Symptom and General Instruments in Psychoanalysis Studies
Table 8b:
Clinically Significant Change in Personality and Psychosocial Functioning Instruments in Psychoanalysis Studies

At treatment termination an average of 77% of the patients achieved scores under a clinically defined cutoff score or criterion (indicating they were falling in the range of a nonclinical population) of a symptom or general instrument, 48% more than the number of patients scoring under those cutoff scores at baseline. At follow-up an average of 75% achieved that status. For personality and psychosocial functioning instruments, the results indicate that an average of 62% of the patients achieved scores under a clinically defined cutoff score or criterion (indicating that they fell in the range of a nonclinical population), 34% more than the number of patients scoring under those cutoff scores at baseline. At follow-up, an average of 65% achieved such a status.
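The arithmetic behind these percentages can be sketched as follows. The example assumes an instrument on which lower scores are healthier and a single clinical cutoff, and it reads the reported gain as a difference in percentage points, which is an interpretation rather than something the text states. The cutoff and all scores are hypothetical.

```python
def share_below_cutoff(scores, cutoff):
    """Percentage of patients scoring below the clinical cutoff."""
    return 100.0 * sum(1 for s in scores if s < cutoff) / len(scores)

# HYPOTHETICAL cutoff and scores for ten patients (lower = healthier).
cutoff = 10
baseline_scores = [22, 18, 9, 25, 14, 8, 30, 17, 12, 20]
post_scores = [7, 9, 5, 12, 6, 4, 15, 8, 9, 3]

baseline_share = share_below_cutoff(baseline_scores, cutoff)
post_share = share_below_cutoff(post_scores, cutoff)
gain = post_share - baseline_share   # percentage-point gain over baseline

print(f"baseline: {baseline_share:.0f}%, post-treatment: {post_share:.0f}%, "
      f"gain: {gain:.0f} percentage points")
```

Reporting the baseline share alongside the post-treatment share matters because some patients already score in the nonclinical range before treatment begins.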

Publication Bias

Because the adjusted mean ESs (and their 95% CIs) did not differ significantly from the observed values for any of the main comparisons, we found no clear indication of publication bias in this meta-analysis (Table 9). The number of trimmed studies, however, did provide some evidence of publication bias. For studies using only symptom instruments, the mean post-treatment ES was lower after adjusting for publication bias (Cohen’s d = 1.36; 95% CI, 1.03–1.65). Two studies were trimmed, indicating (based on the funnel plot, which shows the spread of studies and their ESs) that two studies in the field of psychoanalysis might be missing because of publication bias. Publication bias here refers to the possibility that studies were not published (perhaps because of study quality or minor results). The adjusted value nevertheless differs only slightly from the 1.52 that we found in this meta-analysis.

Table 9:
Publication Biases of All Studies Examining the Pre/Post and Pre/Follow-up Treatment Change with Psychoanalysis


We found that psychoanalysis yields substantial pre/post and pre/follow-up change for patients presenting with long-standing, complex mental disorders, most often a combination of DSM-IV mood or anxiety disorders and personality disorders. At treatment termination, the mean pre/post ES was 1.27 for all outcome instruments taken together, 1.52 for symptom instruments, and 1.08 for personality and social functioning outcomes, all indicating substantial pre/post change. At follow-up the mean pre/follow-up ESs were 1.46, 1.65, and 1.31, respectively, indicating a stable effect. The majority of patients (62%–76%) achieved a clinically significant change, and these figures seemed stable at follow-up. Posttest means showed that after their treatment, psychoanalysis patients mostly fall in the range of nonclinical groups.

As our findings are based on pre/post studies, the effects of psychoanalysis cannot be compared to the effects of possible alternative treatments; consequently, firm conclusions about effectiveness are not possible here.

The dropout rate (between 3% and 33%) did not seem higher in psychoanalysis than in short-term psychotherapies (e.g., 47% in Pampallona et al.85 and 37%–54% in Casacalenda et al.),86 which is notable in view of the length of treatment. Two of the three studies with the highest dropout rates involved more severe pathology, with 100% of the patients presenting with a personality disorder.62,64–66,87

Overall, the heterogeneity in the analyses was moderate, indicating that there are probably systematic differences between the outcomes. The heterogeneity might be influenced by the different measurement instruments used and by differences in patient populations and the treatments used. For instance, 72% of the patients in the Berghout and Zevalkink study32–46 met criteria for personality disorders, and these patients showed lower ESs on depression instruments. By contrast, 34% of the patients in the Huber and Klug study51,52 and 19.50% in the Knekt60 study had personality disorders, and both groups of patients showed higher ESs on depression instruments. Although we can reach no definitive conclusions regarding the relationship between personality disorders and depression outcomes, Newton-Howes and colleagues88 have shown in a meta-analysis that the presence of personality disorders reduces the effect of treatment outcomes for depression.

It could also be suggested, however, that heterogeneity was mainly influenced by the differences between the studies with lower session frequency—all performed in Germany—and those with higher session frequency. The German studies were characterized by better study quality, lower prevalence of patients with personality disorders, and, on average, fewer sessions and higher ESs. In Germany, insurance coverage for psychoanalysis is limited to 300 sessions. How this influences treatment results or indications remains unclear. More research is needed to shed further light on our findings; for example, dose-response studies would be especially useful.

Sub-analyses at treatment termination indicate that some heterogeneity is present even among the German studies. Rudolf’s study,67 for example, seems to be an outlier within that group; it has considerably lower ESs than the other, more recent studies. A partial explanation could be that the studies used different measurement instruments: the Rudolf study used only one (German) questionnaire (Psychischer und sozial-kommunikativer Befund), whereas the other studies used various, more internationally employed instruments such as the Beck Depression Inventory, Hamilton Depression Rating Scale, Inventory of Interpersonal Problems, and Symptom Checklist–90. In addition, the Rudolf study, dating from 1994, is the oldest of the German studies. Advances in the discipline could potentially have contributed to the differences seen in the more recent studies. That said, the differences remain, without further investigation, largely unexplained.

For personality measurements at treatment termination, heterogeneity was almost zero, indicating that heterogeneity resulted from differences in the effects of symptom change across studies. At follow-up, heterogeneity was also very low to zero in the statistically significant sub-analyses.

Publication bias seems fairly low in our study. ESs computed after the trim-and-fill method did not differ significantly from the mean ESs found in the meta-analysis. Due to the small number of studies, however, calculations of publication bias must be interpreted cautiously.

Nine of the 14 studies encompassed a long-term psychoanalytic psychotherapy condition in addition to psychoanalysis. In this article we restricted ourselves to the pre/post findings of psychoanalysis studies. The question of whether the results of psychoanalysis and LTPP in nonrandomized studies can be compared is a complicated one. One study51,52 did randomize patients to psychoanalysis or LTPP. The authors found that at follow-up, psychoanalysis performed better than LTPP on personality measures (Inventory of Interpersonal Problems and Scale of Psychological Capacities) and on a goal attainment scale.

Finally, we found that in this meta-analysis, therapist ratings were the lowest, that observer ratings were the highest, and that patient ratings fell in between (a possibly counterintuitive result in that one might expect therapists to rate their own work higher than independent observers). There are pros and cons, of course, for utilizing the ratings provided by these three different groups. On the one hand, independent observers have less vested interest in the treatment and might therefore be less biased in judging results. On the other hand, patients and therapists have much more exposure to the actual evidence than independent observers. The literature is not in agreement on the question of whether patients and therapists might overestimate therapy success. In analyzing the findings of the Menninger Foundation’s psychotherapy research project, Harty and Horwitz70 found that both therapists (65%) and patients (54%) rated therapy success higher than independent judges (38%). Other studies have found, though, that self-reports present more modest results than observer ratings.76,78,89,90


Several limitations of our meta-analysis caution against overinterpreting the results. The most important limitation is the use of pretest/posttest analyses; all studies, except for one, were pre/post cohort studies, lacking (randomized) control groups. In evidence-based medicine’s hierarchy of evidence, RCTs present strong scientific evidence, whereas the evidence from pre/post cohort studies is only moderate. The importance of control groups is made clear by Smit and colleagues91 in their recent meta-analysis of LTPP. Their subgroup analysis of the domain’s “target problems showed that LTPP did significantly better when compared to control treatments without a specialized psychotherapy component, but not when compared to various specialized psychotherapy control treatments.” Considered from this point of view, the evidence for the effects of psychoanalysis cannot be more than of moderate strength.

Several researchers have pointed to the difficulties and limitations of RCTs in the field of intensive, long-term treatments, of which psychoanalysis is paradigmatic.64,92,93 de Jonghe and colleagues20 brought attention to the limited feasibility of RCTs because of the restricted acceptability of the control conditions—especially, but not exclusively, in psychoanalysis. They argue that randomization to the most informative control conditions (waiting list, placebo, and no treatment), coupled with the extended length of the treatment period, renders RCTs unacceptable for patients. Most patients considering a psychoanalytic treatment have previously tried therapies with a much lower frequency or duration with no success, and no evidence-based therapies with frequency of sessions and duration comparable to psychoanalysis are available yet to serve as additional conditions in an RCT. Patients are not likely to accept the risk of being allocated by chance to a control condition that they know all too well.

Notwithstanding such concerns, some RCTs have been undertaken. Huber and Klug51,52 succeeded in randomizing patients with depressive disorders in two fairly complicated randomization rounds (G. Klug, written communication). In the first phase, patients were randomized between psychoanalysis and psychodynamic psychotherapy. A few years later a third group was added for cognitive-behavioral therapy (CBT). In this second phase the randomization board considered but ultimately rejected the possibility of randomly allocating new patients to the three experimental groups; instead, most patients were allocated to the cognitive-behavioral condition, bringing it up to the same number as the other two groups. As psychoanalysis in this RCT averaged two sessions a week, the relatively small difference between this treatment and the other condition (psychodynamic therapy of one session/week) might have contributed to the acceptability of the RCT. Likewise, a pilot study by Steven Roose and colleagues94 succeeded in randomizing ten patients to psychoanalysis or CBT. Another ongoing German study by Marianne Leuzinger-Bohleber and Manfred Beutel95 includes an RCT in which patients are randomized to psychoanalysis (two or three times weekly) or CBT. Although the results of these latter two studies are not yet available, the RCTs discussed here demonstrate that randomization is not impossible; we recommend that further RCTs be conducted in this field.

In the meantime, psychoanalysis has to rely mainly on pre/post cohort studies, and it is often argued that such studies might overestimate the ES of a treatment. This drawback of the cohort study design and the related possibility of biased outcomes96,97 cannot be denied, but several extended reviews demonstrate that, in practice, no systematic differences have been found in the results of RCTs versus those of cohort studies and pre/post studies.14,98–102 In a meta-analysis comparing nonrandomized effectiveness studies with randomized efficacy studies of anxiety disorders, Stewart and Chambless103 found a very small difference (Cohen’s d = −0.08 [significant]) between the ESs of the two types of studies. In addition, other studies show that patients receiving no treatment improve minimally. Norton and Price104 found an ES of 0.25 for placebo groups in studies of anxiety disorders, and Leichsenring and Rabung (unpublished data) an ES of 0.12 in control groups of psychoanalytic therapies.

Knowledge of the “natural, untreated” course of the personality pathology of this target group would be helpful in interpreting the results of pre/post studies. For obvious reasons, such knowledge is scarce. Most people who suffer do seek, and fortunately often find, help. Some research suggests that the symptoms of personality disorders somewhat lessen over time, but this research is based almost exclusively on individuals who have been exposed to treatment105–108 or young children or adolescents, in whom personality change is more expected.105,109 Several longitudinal studies, however, have investigated natural changes in personality of adults. Franz and colleagues110 investigated the spontaneous, long-term course of neurotic spectrum disorders, personality disorders, stress reactions, and somatoform disorders in a representative sample of the normal adult population of Mannheim over a period of 11 years. They found a high correlation between the first and last measurements 11 years later (r = .55) and strong evidence for a long-term course of psychological impairment. Roberts and DelVecchio111 meta-analyzed 152 longitudinal studies (including 55,000 individuals) and compiled 3,217 test/retest correlations. They found that personality traits were increasingly stable in adulthood (r = 0.31 in childhood; r = 0.64 at 30 years of age; r = 0.74 between 50 and 70 years of age). Terracciano and colleagues112 presented a longitudinal study measuring intra-individual personality change of 684 subjects who were tested at regular intervals of first 6 and then 12 years. Individual stability on ten scales of personality dimensions was high (r = 0.75), and the stability increased slightly when people were over 30 years of age. This research indicates that personality traits and pathology seem, when untreated, fairly stable in adult populations. More research in this area is necessary, and it could serve as a control for otherwise uncontrolled studies of long duration.

Finally, it seems that more and more researchers value uncontrolled effectiveness studies that parallel controlled ones. As Stewart and Chambless103 concluded in their recent meta-analysis of CBT, “One of the most contentious issues in evidence-based practice is the extent to which results from randomized controlled trials can be generalized to routine clinical practice. Uncontrolled effectiveness research permits the researcher to maximize external validity by testing treatments (with prior supporting efficacy research) in all types of naturalistic circumstances to evaluate whether these treatments translate well to the clinical setting.”

In the present meta-analysis, the number of studies is small; the studies are of varying quality; and they each contain small samples of patients. The results therefore rest on a relatively narrow foundation. The treatment and patient groups also vary considerably, and outcomes are not differentiated by DSM disorder. A further limitation of most studies reviewed is that they report only on completers and do not perform intent-to-treat analyses. Completers analysis may exaggerate results. There were only five studies that used follow-up periods, and their lengths were short (with a maximum of 3.5 years). These brief follow-up periods may be important, as our results suggest that the effects after a longer follow-up period are smaller than after a shorter one.

Finally, many psychoanalysts believe that the concept of scientific research (with its measurements, randomization, and strict criteria and procedures) is alien to psychoanalysis. Many would argue that the criteria used in such research—such as the frequency of sessions, the use of a couch, or the presence of particular diagnoses—fail to capture, or even correlate with, the core elements of psychoanalysis. They would see the researcher as an unwanted “third party.” And they would argue that the process of psychoanalysis and the changes in patients cannot be reliably caught in simple, oversimplifying measurement instruments. In this context, it is worth noting that the measurements of personality change in this meta-analysis were mostly done by self-report scales such as the Inventory of Interpersonal Problems, Sense of Coherence Scale, and Social Adjustment Scale. We believe that these outcomes should be subjected to more psychoanalytically relevant personality measurements or factors such as the Adult Attachment Interview, the Minnesota Multiphasic Personality Inventory, projective tests, quality of object relations, and defense styles.


We found evidence that psychoanalysis yields substantial pre/post and pre/follow-up change in patients presenting with complex mental disorders for whom this type of treatment is indicated. These results are almost exclusively based on a small number of pre/post cohort studies, which, from the perspective of evidence-based medicine, are of only moderate scientific strength, as they lack control groups. Therefore, we cannot draw firm conclusions regarding the effectiveness of psychoanalysis. Controlled studies are urgently needed that (1) describe patient samples in both DSM and psychoanalytic diagnostic terms, (2) describe the treatment in more detail, (3) use intention-to-treat analyses, (4) apply in-depth, psychoanalytic personality outcome measures, (5) use long-term follow-up, (6) monitor dropout, (7) ensure treatment integrity, and (8) include cost-effectiveness measures.

Declaration of interest

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the article.


1. Shedler J. The efficacy of psychodynamic psychotherapy. Am Psychol 2010;65:98–109.
2. Leichsenring F, Rabung S. Long-term psychodynamic psychotherapy in complex mental disorders: update of a meta-analysis. Br J Psychiatry 2011;199:15–22.
3. Bachrach H, Galatzer-Levy R, Skolnikoff A, Waldron S. On the efficacy of psychoanalysis. J Am Psychoanal Assoc 1991;39:871–916.
4. Galatzer-Levy R, Bachrach H, Skolnikoff A, Waldron S. Does psychoanalysis work? New Haven, CT: Yale University Press, 2000.
5. Doidge N. Empirical evidence for the efficacy of psychoanalytic psychotherapies and psychoanalysis: an overview. Psychoanal Inq 1997;suppl:102–50.
6. Fonagy P. An open door review of outcome studies in psychoanalysis. London: International Psychoanalytical Association, 2002.
7. Doidge N. Is psychoanalysis effective? Econ Neurosci 2001;3:41–7.
8. de Maat S, de Jonghe F, Schoevers R, Dekker J. The effectiveness of long-term psychoanalytic therapy: a systematic review of empirical studies. Harv Rev Psychiatry 2009;17:1–23.
9. Jacobs DH. Three models of training. Am Psychoanalyst 2007;41:11–9.
10. Mertens W. Einführung in die psychoanalytische therapie [Introduction to psychoanalytic therapy], vol. 2. Stuttgart: Kohlhammer, 1990.
11. Gabbard GO. Long-term psychodynamic psychotherapy: a basic text. Arlington, VA: American Psychiatric Publishing, 2004.
12. Kernberg OF. A severe sexual inhibition in the course of the psychoanalytic treatment of a patient with a narcissistic personality disorder. Int J Psychoanal 1999;80:899–908.
13. Fonagy P, Moran GS. Understanding psychic change in child psychoanalysis. Int J Psychoanal 1991;72:15–22.
14. Leichsenring F, Rabung S. Effectiveness of long-term psychodynamic psychotherapy: a meta-analysis. JAMA 2008;300:1551–65.
15. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 4th ed. Washington, DC: APA, 1994.
16. Sandell R, Blomberg J, Lazar A, Carlsson J, Broberg J, Schubert J. Varieties of long-term outcome in psychoanalysis and long-term psychotherapy: a review of findings in the Stockholm Outcome of Psychoanalysis and Psychotherapy Project (STOPP). Int J Psychoanal 2000;81:921–42.
17. Doidge N, Simon B, Brauer L, Grant DC, et al. Psychoanalytic patients in the U.S., Canada, and Australia: I. DSM-III-R disorders, indications, previous treatment, medications, and length of treatment. J Am Psychoanal Assoc 2002;50:575–614.
18. Doidge WJ, et al. Psychoanalytic patients in the U.S., Canada, and Australia: II. A DSM-III-R validation study. J Am Psychoanal Assoc 2002;50:615–27.
19. Caligor E, Stern BL, Hamilton M, et al. Why we recommend analytic treatment for some patients and not for others. J Am Psychoanal Assoc 2009;57:677–94.
20. de Jonghe F, de Maat S, Barber JP, et al. Designs for studying effectiveness of long-term psychoanalytic treatments: balancing level of evidence and acceptability for patients. J Am Psychoanal Assoc 2012;60:361–87.
21. Cuijpers P, van Straten A, Warmerdam L, Andersson G. Psychological treatment of depression: a meta-analytic database of randomized studies. BMC Psychiatry 2008;8:36.
22. Alderson P, Green S, Higgins JPT, eds. Cochrane reviewers’ handbook: 4.2.2. Chichester, UK: John Wiley & Sons.
23. Leichsenring F. Randomised controlled versus naturalistic studies: a new research agenda. Bull Menninger Clin 2004;68:137–51.
24. Cohen J. Statistical power analysis for behavioural sciences. Hillsdale, NJ: Lawrence Erlbaum, 1988.
25. Comprehensive meta-analysis (version 2.2.021). Englewood, NJ: Biostat.
26. Hedges LV, Vevea JL. Fixed- and random-effects models in meta-analysis. Psychol Methods 1998;3:486–504.
27. Dunlap WP, Cortina JM, Vaslow JB, Burke MJ. Meta-analysis of experiments with matched groups or repeated measures designs. Psychol Methods 1996;1:170–7.
28. Hedges LV, Olkin I. Statistical methods for meta-analysis. New York: Academic, 1985.
29. Huedo-Medina TB, Sanchez-Meca J, Botella J, Marin-Martinez F. Assessing heterogeneity in meta-analysis: Q statistic or I2 index. Psychol Methods 2006;11:193–206.
30. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003;327:557–60.
31. Duval S, Tweedie R. Trim and fill: a simple funnel-plot based method of testing and adjusting for publication bias in meta-analysis. Biometrics 2000;56:455–63.
32. Berghout CC, Zevalkink J. Klinisch geval of worried well? Een beschrijving van patiënten voorafgaand aan psychoanalytische behandeling. Psychologie en Gezondheid 2007;35:76–84.
33. Berghout CC, Zevalkink J. Differential selectivity of patients assigned to long-term psychoanalytic treatment. J Am Psychoanal Assoc 2007;55:294–9.
34. Berghout CC, Zevalkink J. Identifying clinical cases among patients assigned to psychoanalytic treatment. Bull Menninger Clin 2008;72:163–78.
35. Berghout CC, Zevalkink J, de Jong JTVM. Symptomen en persoonlijkheidsproblemen voor, tijdens en na langdurige psychoanalytische behandelingen: een multiple-cohort studie. Psychologie en Gezondheid 2010;38:66–75.
36. Berghout CC, Zevalkink J, de Wolf MHM. Psychoanalyse en psychoanalytische psychotherapie: Nederlands onderzoek naar de effectiviteit van beide behandelvormen. Tijdschrift voor Psychotherapie 2010;36:22–39.
37. Berghout CC, Zevalkink J. Clinical significance of long-term psychoanalytic treatment. Bull Menninger Clin 2009;73:7–33.
38. Berghout CC, Zevalkink J, de Jong JTVM. 2010. Effectiveness of long-term psychoanalytic treatment: measuring personality functioning and symptomatic distress in a multiple-cohort design. PhD diss., VU University, Amsterdam, Netherlands.
39. Berghout CC, Zevalkink J, Hakkaart-van Roijen L. A cost-utility analysis of psychoanalysis versus psychoanalytic psychotherapy. Int J Technol Assess Health Care 2010;26:3–10.
40. Berghout CC, Zevalkink J, Hakkaart-van Roijen L. The effects of long-term psychoanalytic treatment on health care utilization and work productivity and their associated costs. J Psychiatr Pract 2010;16:209–16.
41. Berghout CC, Zevalkink J, Katzko MW, de Jong JTVM. Changes in symptoms and interpersonal problems during the first two years of long-term psychoanalytic psychotherapy and psychoanalysis. Psychol Psychother 2012;85:203–19.
42. Berghout CC, Zevalkink J, de Wolf MHM. Psychoanalyse en psychoanalytische psychotherapie: effect groottes en klinische significantie. Tijdschrift voor Psychotherapie 2010;36:22–39.
43. Zevalkink J, Berghout CC. Expanding the evidence base for the cost-effectiveness of long-term psychoanalytic treatment. J Am Psychoanal Assoc 2006;54:1313–9.
44. Zevalkink J, Berghout CC. Door de bank genomen. Hoe effectief zijn psychoanalytische behandelingen? Amsterdam: Nederlands Psychoanalytisch Instituut, 2008.
45. Zevalkink J, Berghout CC. Klinische besluitvorming ten aanzien van indicatie voor psychoanalyse en psychoanalytische psychotherapie: een empirische invalshoek. Tijdschrift voor Psychotherapie 2008;34:151–68.
46. Zevalkink J, Berghout CC. Mental health characteristics of patients assigned to long-term ambulatory psychoanalytic treatment: psychoanalysis versus psychoanalytic psychotherapy. Psychother Res 2008;18:316–25.
47. Cogan R, Porcerelli JH. Clinician reports of personality pathology of patients beginning and patients ending psychoanalysis. Psychol Psychother 2005;78:235–48.
48. Erle JB, Goldberg DA. The course of 253 analyses from selection to outcome. J Am Psychoanal Assoc 2003;51:257–93.
49. Grande T, Dilg R, Jakobsen T, et al. Differential effects of two forms of psychoanalytic therapy: results of the Heidelberg-Berlin study. Psychother Res 2006;16:470–85.
50. Rudolf G, Dilg R, Grande T, Jakobsen T, et al. Effektivität und Effizienz psychoanalytischer Langzeittherapie: die Praxisstudie Analytische Langzeitpsychotherapie [Effectiveness and efficiency of long-term psychoanalytic psychotherapy: the practice study of long-term psychoanalytic psychotherapy]. In: , , , eds. Psychoanalyse des Glaubens [Psychoanalysis of religious belief]. Giessen: Psychosozial-Verlag, 2004.
51. Huber D, Klug G. Munich Psychotherapy Study (MPS): the effectiveness of psychoanalytic longterm psychotherapy for depression. In: Society for Psychotherapy Research, ed. Book of abstracts: from research to practice. Ulm, Germany: Ulmer Textbank, 2006:154.
52. Huber D, Henrich G, Gastner J, Klug G. Must all have prices? The Munich Psychotherapy Study. In: , , , eds. Current clinical psychiatry: psychodynamic psychotherapy research: evidence-based practice and practice-based evidence. New York: Humana, 2012.
53. Kantrowitz J, Singer J, Knapp PH. Methodology for a prospective study of suitability for psychoanalysis: the role of psychological tests. Psychoanal Q 1975;44:371–91.
54. Kantrowitz J, Paolitto F, Sashin J, Solomon L, Katz A. Affect availability, tolerance, complexity and modulation in psychoanalysis: a follow-up of a longitudinal, prospective study. J Am Psychoanal Assoc 1986;34:529–59.
55. Kantrowitz J. Suitability for psychoanalysis. In: , ed. The yearbook of psychoanalysis and psychotherapy, vol. 2. New York: Guilford, 1987:403–15.
56. Kantrowitz J, Katz A, Paolitto F, Sashin J, Solomon L. Changes in the level and quality of object relations in psychoanalysis: follow-up of a longitudinal prospective study. J Am Psychoanal Assoc 1987;35:529–60.
57. Kantrowitz J, Katz A, Paolitto F, Sashin J, Solomon L. The role of reality testing in psychoanalysis: follow-up of 22 cases. J Am Psychoanal Assoc 1987;35:367–86.
58. Kantrowitz J, Katz AL, Paolitto F. Follow-up of psychoanalysis five to ten years after termination: 1. Stability of change. J Am Psychoanal Assoc 1990;38:471–96.
59. Kantrowitz J, Katz AL, Paolitto F. Follow-up of psychoanalysis five to ten years after termination: 2. Development of the self-analytic function. J Am Psychoanal Assoc 1990;38:637–54.
60. Knekt P, Lindfors O, Laaksonen MA, et al. Helsinki Psychotherapy Study Group. Quasi-experimental study on the effectiveness of psychoanalysis, long-term and short-term psychotherapy on psychiatric symptoms, work ability and functional capacity during a 5-year follow-up. J Affect Disord 2011;132:37–47.
61. Leichsenring F, Biskup J, Kreische R, Staats H. The Göttingen study of psychoanalytic therapy: first results. Int J Psychoanal 2005;86:433–55.
62. Löffler-Stastka H, Rössler-Schülein H, Skale E. Prädiktoren des Therapieabbruchs in psychoanalytischen Behandlungen von Patienten mit Persönlichkeitsstörungen. Zeitschrift für Psychosomatische Medizin und Psychotherapie 2005;54:63–76.
63. Luborsky L, Stuart J, Friedman S, et al. The Penn Psychoanalytic treatment collection: a set of complete and recorded psychoanalyses as a research resource. J Am Psychoanal Assoc 2001;49:217–34.
64. von Rad M, Senf W, Bräutigam W. Psychotherapie und Psychoanalyse in der Krankenversorgung: Ergebnisse des Heidelberger Katamnese-Projektes. Psychother Psychosom Med Psychol 1998;48:88–100.
65. Heuft G, Seibuchler-Engec H, Taschke M, Senf W. Langzeitoutcome ambulanter psychoanalytischer Psychotherapien und Psychoanalysen. Eine textinhaltsanalytische Untersuchung von 53 Katamneseinterviews. Forum der Psychoanalyse Zeitschrift 1996;12:342–55.
66. Kordy H, von Rad M, Senf W. Time and its relevance for successful psychotherapy. Psychother Psychosom 1988;49:212–22.
67. Rudolf G, Manz R, Öri C. Ergebnisse psychoanalytischer Therapien. Z Psychosom Med Psychoanal 1994;40:25–40.
68. Wallerstein R. Forty-two lives in treatment: a study of psychoanalysis and psychotherapy. New York: Guilford, 1986.
69. Kernberg O, Burstein ED, Coyne L, Appelbaum A, Horwitz L, Voth H. Psychotherapy and psychoanalysis: the final report of the Menninger Foundation’s Psychotherapy Research Project. Bull Menninger Clin 1972;36:1–275.
70. Harty M, Horwitz L. Therapeutic outcome as rated by patients, therapists, and judges. Arch Gen Psychiatry 1976;33:957–61.
71. Sashin JI, Eldred S, van Amerongen ST. A search for predictive factors in institute supervised cases: a retrospective study of 183 cases from 1959–1966 at the Boston Psychoanalytical Society and Institute. Int J Psychoanal 1975;56:343–59.
72. Weber JJ, Bachrach HM, Solomon M. Factors associated with outcome of psychoanalysis: a report of the Columbia Psychoanalytic Center Research Project (II). Int Rev Psychoanal 1985;12:127–41.
73. Bachrach HM, Weber JJ, Solomon M. Factors associated with the outcome of psychoanalysis (clinical and methodological considerations): report of the Columbia Psychoanalytic Center Research Project (IV). Int Rev Psychoanal 1985;12:379–89.
74. Dührssen A. Dynamische Psychotherapie, Psychoanalyse und analytische Gruppenpsychotherapie im Vergleich. Z Psychosom Med Psychoanal 1986;32:161–80.
75. Grossarth-Maticek R, Eysenck HJ. Prophylactic effect of psychoanalysis on cancer prone and heart disease prone probands as compared with control groups and behaviour therapy groups. J Behav Ther Exp Psychiatry 1990;21:91–9.
76. Keller W, Westhoff G, Dilg R, Rohner R, Studt HH; Study Group on Empirical Psychotherapy Research in Analytical Psychology. Efficacy and cost effectiveness aspects of outpatient (Jungian) psychoanalysis and psychotherapy—a catamnestic study. Berlin: Department of Psychosomatics and Psychotherapy, University Medical Center Benjamin Franklin, Free University of Berlin, [1998].
77. Leuzinger-Bohleber M, Target M, eds. Outcomes of psychoanalytic treatment: perspectives for therapists and researchers. London, Philadelphia: Whurr, 2001.
78. Leuzinger-Bohleber M, Stuhr U, Rüger B, Beutel M. How to study the ‘quality of psychoanalytic treatments’ and their long-term effects on patients’ well-being: a representative, multi-perspective follow-up study. Int J Psychoanal 2003;84:263–90.
79. Hartmann S, Zepf S. Einflüsse auf die Symptombesserung in der Psychotherapie bei Patienten mit unterschiedlichen Beschwerdebildern. Psychother Psychosom Med Psychol 2004;54:445–56.
80. Stehle S. Psychotherapeutische Berufstätigkeit. Ergebnisse der DGPT-Therapeutenerhebung. In: , eds. Psychoanalyse des Glaubens [Psychoanalysis of religious belief]. Giessen: Psychosozial-Verlag, 2004.
81. Rascon SR, Corona PC, Lartugue T, Rios JM, Garza DL. A successful trial utilizing the Leuzinger-Bohleber methodology for evaluation of psychoanalytic treatment: preliminary report. Int J Psychoanal 2005;86:1425–40.
82. Brockmann J, Schlüter T, Eckert J. Langzeitwirkungen psychoanalytischer und verhaltenstherapeutischer Langzeittherapien [Long-term outcome of long-term psychoanalytic and behavioral long-term therapy]. Psychotherapeut 2006;51:15–25.
83. Puschner B, Kraft S, Kächele H, Kordy H. Course of improvement over 2 years in psychoanalytic and psychodynamic outpatient psychotherapy. Psychol Psychother 2007;80:51–68.
84. Knekt P, Lindfors O, Välikoski M, Laaksonen M; the Helsinki Psychotherapy Study Group. Quasi-experimental study on the effectiveness of psychoanalysis, long-term and short-term psychodynamic psychotherapy and solution focused therapy on psychiatric symptoms during a 5-year follow-up. Presentation at the annual meeting of the Society for Psychotherapy Research, Madison, WI, June 2007.
85. Pampallona S, Bollini P, Tibaldi G, Kupelnick B, Munizza C. Combined pharmacotherapy and psychological treatment for depression: a systematic review. Arch Gen Psychiatry 2004;61:714–9.
86. Casacalenda N, Perry JC, Looper K. Remission in major depressive disorder: a comparison of pharmacotherapy, psychotherapy, and control conditions. Am J Psychiatry 2002;159:1354–60.
87. Löffler-Stastka H, Steinmair D. Psychoanalytical theory of affects and its applicability on the Affect Regulation and Affect Experience Q-Sort Test (AREQ). J Med Psychol 2009;1 (1):10–20.
88. Newton-Howes G, Tyrer P, Johnson T. Personality disorder and the outcome of depression: a meta-analysis of published studies. Br J Psychiatry 2006;188:13–20.
89. Leichsenring F, Leibing E. The effectiveness of psychodynamic therapy and cognitive behavior therapy in the treatment of personality disorders: a meta-analysis. Am J Psychiatry 2003;160:1223–32.
90. Lambert MJ. Psychotherapy outcome research. In: , ed. Handbook of psychotherapy. 3rd ed. New York: Wiley, 1986:94–129.
91. Smit Y, Huibers MJH, Ioannidis JPA, van Dyck R, van Tilburg W, Arntz A. The effectiveness of long-term psychoanalytic psychotherapy—a meta-analysis of randomized controlled trials. Clin Psychol Rev 2012;32:81–92.
92. Leichsenring F. Are psychodynamic and psychoanalytic therapies effective? A review of empirical data. Int J Psychoanal 2005;86:841–68.
93. Seligman ME. The effectiveness of psychotherapy: the Consumer Reports Study. Am Psychol 1995;50:965–74.
94. Caligor E, Hilsenroth MJ, Devlin M, Rutherford BR, Terry M, Roose SP. Will patients accept randomization to psychoanalysis? A feasibility study. J Am Psychoanal Assoc 2012;60:337–60.
95. Leuzinger-Bohleber M, Beutel M. Psychoanalytic and cognitive-behavioral treatment: first data on the LARC depression study. Presentation at the Psychoanalytic Process Research Strategies conference, Ulm, Germany, June 2009.
96. Sacks H, Chalmers TC, Smith H Jr. Randomized versus historical controls for clinical trials. Am J Med 1982;72:233–40.
97. Kunz R, Oxman AD. The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials. Br Med J 1998;317:1185–90.
98. Concato J, Shah N, Horwitz RI. Randomised controlled trials, observational studies and the hierarchy of research designs. N Engl J Med 2000;342:1887–92.
99. Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med 2000;342:1878–86.
100. Shadish WR, Matt G, Navarro A, Phillips G. The effects of psychological therapies under clinically representative conditions: a meta-analysis. Psychol Bull 2000;126:512–29.
101. Driessen E, Cuijpers P, de Maat S, Abbass A, de Jonghe F, Dekker JJM. The efficacy of short-term psychodynamic psychotherapy for depression: a meta-analysis. Clin Psychol Rev 2010;30:25–36.
102. Van HL, Dekker J, Koelen J. Patient preference compared with random allocation in short-term psychodynamic supportive psychotherapy with indicated addition of pharmacotherapy for depression. Psychother Res 2009;19:205–12.
103. Stewart RE, Chambless DL. Cognitive behaviour therapy for adult anxiety disorder in clinical practice: a meta-analysis of effectiveness studies. J Consult Clin Psychol 2009;77:595–606.
104. Norton PJ, Price EC. A meta-analytic review of adult cognitive-behavioral treatment outcome across the anxiety disorders. J Nerv Ment Dis 2007;195:521–31.
105. Lenzenweger MF, Desantis Castro D. Predicting change in borderline personality: using neurobehavioral systems indicators within an individual growth curve framework. Dev Psychopathol 2005;17:1207–37.
106. Zanarini MC, Frankenburg FR, Henne J, Silk KR. The longitudinal course of borderline psychopathology. Am J Psychiatry 2003;160:274–83.
107. McGlashan TH, Grilo CM, Sanislow CA, et al. Two-year prevalence and stability of individual DSM-IV criteria for schizotypal, borderline, avoidant and obsessive-compulsive personality disorders: toward a hybrid model of Axis II disorders. Am J Psychiatry 2005;162:883–9.
108. Stone M. Long-term outcome in personality disorders. Br J Psychiatry 1993;162:299–313.
109. de Clercq B, van Leeuwen K, van den Noortgate W, de Bolle M, de Fruyt F. Childhood personality pathology: dimensional stability and change. Dev Psychopathol 2009;21:853–69.
110. Franz M, Lieberz K, Schmitz N, Schepank H. A decade of spontaneous long-term course of psychogenic impairment in a community population sample. Soc Psychiatry Psychiatr Epidemiol 1999;34:651–6.
111. Roberts BW, DelVecchio WF. The rank-order consistency of personality traits from childhood to old age: a quantitative review of longitudinal studies. Psychol Bull 2000;126:3–25.
112. Terracciano A, McCrae RR, Costa PT Jr. Intra-individual change in personality stability and age. J Res Pers 2010;44:31–7.
113. Derogatis LR, Lazarus L. SCL-90-R, Brief Symptom Inventory, and matching clinical rating scales. In: , ed. The use of psychological testing for treatment planning and outcome assessment. New York: Hillsdale, 1994:217–48.
114. Beck AT, Ward CH, Mendelson MM, Mock J, Erbaugh J. An inventory for measuring depression. Arch Gen Psychiatry 1961;4:561–71.
115. Spielberger CD, Gorsuch RL, Lushene RE. Manual for the State-Trait Anxiety Inventory. Palo Alto, CA: Consulting Psychologists, 1970.
116. Kiresuk TJ, Lund SH. Goal attainment scaling: research, evaluation and utilization. In: , , eds. Program evaluation in health fields, vol. 2. New York: Human Science, 1979.
117. Rudolf G. Ein psychoanalytisch fundiertes Instrument zur Patient Selbstschätzung. Zeitschrift für Psychosomatischen Medizin 1991;37:350–60.
118. Luborsky L. Clinician’s judgments of mental health. Arch Gen Psychiatry 1962;7:407–17.
119. Hamilton M. A rating scale for depression. J Neurol Neurosurg Psychiatry 1960;23:56–62.
120. Hamilton M. The assessment of anxiety states by rating. Br J Med Psychol 1959;32:50–5.
121. Guy W. ECDEU assessment manual for psychopharmacology. Rockville, MD: U.S. Department of Health, Education, and Welfare, Public Health Service, Alcohol, Drug Abuse, and Mental Health Administration, National Institute of Mental Health, Psychopharmacology Research Branch, Division of Extramural Research Programs, 1976.
122. Horowitz LM, Strauss B, Kordy H. Inventory of Interpersonal Problems. 2nd ed. Göttingen: Beltz, 2000.
123. Groth-Marnat G. Handbook of psychological assessment. Hoboken, NJ: Wiley, 1997.
124. Jacobson NS, Follette WC, Revenstorf D. Psychotherapy outcome research: methods for reporting variability and evaluating clinical significance. Behav Ther 1984;15:336–52.
125. Jacobson NS, Roberts LJ, Berns SB, McGlinchey JB. Methods of defining and determining the clinical significance of treatment effects. Description, application, and alternatives. J Consult Clin Psychol 1999;67:300–7.
126. Jacobson NS, Truax P. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 1991;59:12–9.
127. DeWitt KN, Hartley DE, Rosenberg SE, Zilberg NJ, Wallerstein RS. Scales of Psychological Capacities: development of an assessment approach. Psychoanal Contemp Thought 1991;14:343–61.
128. Westen D, Shedler J. Revising and assessing Axis II, Part 1: developing a clinically and empirically valid assessment method. Am J Psychiatry 1999;156:258–72.
129. Westen D, Shedler J. Revising and assessing Axis II, Part 2: toward an empirically based and clinically useful classification of personality disorders. Am J Psychiatry 1999;156:273–85.
130. Antonovsky A. Unravelling the mystery of health. San Francisco: Jossey-Bass, 1987.
131. Weissman M, Bothwell S. Assessment of social adjustment by patient self-report. Arch Gen Psychiatry 1976;33:1111–5.
132. Ilmarinen J, Tuomi K, Klockars M. Changes in the work ability of active employees over an 11-year period. Scand J Work Environ Health 1997;23(suppl 1):49–57.
133. Lehtinen V, Joukamaa M, Jyrkinen T, et al. Mental health and mental disorders in the Finnish adult population [in Finnish, with English summary]. Turku/Helsinki: Publications of the Social Insurance Institution, 1991.
134. George C, Kaplan M, Main M. Adult Attachment Interview. Berkeley: University of California Press, 1996.

Appendix 1 Research Quality Score


Appendix 2 Instruments Used in Studies

[1] Symptoms

With regard to measuring symptoms, the following instruments were used: Symptom Check List-90 (Derogatis & Lazarus [1994]),113 Beck Depression Inventory (Beck et al. [1961]),114 State-Trait Anxiety Inventory (Spielberger et al. [1970]),115 (moderated) Goal Attainment Scale (Kiresuk & Lund [1979]),116 Psychischer und sozial-kommunikativer Befund (Rudolf [1991]),117 Health Sickness Rating Scale (Luborsky [1962]),118 Hamilton Depression Rating Scale (Hamilton [1960]),119 Hamilton Anxiety Rating Scale (Hamilton [1959]),120 Global Assessment of Functioning (DSM-IV), Positive Symptom Distress Index (based on the SCL-90), Positive Symptom Total (based on the SCL-90), and Clinical Global Impression–Severity or –Improvement (Guy [1976]).121

[2] Personality and Social Functioning

With regard to measuring changes in personality and psychosocial functioning, the following instruments were used: Inventory of Interpersonal Problems (Horowitz et al. [2000]),122 Minnesota Multiphasic Personality Inventory (Groth-Marnat [1997])123 (using those clinical scales that were, at baseline, clinically elevated relative to a defined cutoff point [Jacobson et al. (1984, 1999),124,125 Jacobson & Truax (1991)126]), Scales of Psychological Capacities (DeWitt et al. [1991]),127 Shedler–Westen Assessment Procedure–200 (Westen & Shedler [1999]),128,129 Sense of Coherence Scale (Antonovsky [1987]),130 Social Adjustment Scale (Weissman & Bothwell [1976]),131 Work Ability Index (Ilmarinen et al. [1997]),132 work subscale of the Social Adjustment Scale (Weissman & Bothwell [1976]),131 and Perceived Psychological Functioning Scale (Lehtinen et al. [1991]).133 More in-depth measurements of personality change, such as the assessment of attachment styles, defense styles, or the quality of object relations, were largely missing or, as in the case of the Knekt study, not yet reported. One study (Berghout/Zevalkink et al. [2006–10, 2012])32–46 used the Adult Attachment Interview (George et al. [1996])134 for assessing outcomes.

© 2013 President and Fellows of Harvard College