
Anti-psychiatry is a pseudoscience that downplays or rejects the existence and severity of psychiatric conditions, denies the efficacy of established treatments and demonizes medical doctors. Not all anti-psychiatry activists hold all three of these positions, but they are common beliefs within the movement. In this respect, anti-psychiatry is very reminiscent of anti-vaccine activism, whose adherents wrongly believe that vaccine-preventable diseases are natural and not very harmful, reject vaccines and demonize pediatricians. In terms of debating tactics, anti-psychiatry activists make use of the same standard denialist toolkit: quoting scientists out of context, cherry-picking data, misunderstanding basic science and so on.
A recent paper by Jakobsen and colleagues (2017) claims to have shown that SSRIs, a class of antidepressants, have questionable clinical efficacy. It turns out that this claim rests on a piece of highly deceptive statistical trickery: the authors erect an arbitrary and evidence-free effect size threshold for clinical significance and then reject any treatment that does not meet it.
Because the threshold they picked is so large, they would be forced to reject psychotherapy as well as a considerable portion of the medications used in general medicine. The researchers cite the National Institute for Health and Care Excellence (NICE) as support for their criterion, but NICE abandoned this criterion as flawed around eight years ago. In the end, SSRIs are an effective and useful treatment for depression (though they do not work perfectly for everyone), and clinical significance is a spectrum rather than a black-and-white issue.
What are antidepressants and how effective are they?
Depression (also called major depression) is a psychiatric condition that involves feelings of emptiness, hopelessness, worthlessness and guilt, loss of interest in things that were previously pleasurable, sleep alterations and other symptoms. It is caused by a complex interaction between biological, psychological and social factors (Passer et al., 2009).
Antidepressants are a class of psychiatric medication used to treat depression. There are many different kinds, and their efficacy has improved and their side effects have decreased as research and development has proceeded. One of the most common forms of antidepressants is the selective serotonin reuptake inhibitors (SSRIs), but there are many others as well.
SSRIs have been the subject of hundreds of efficacy and safety studies and have been approved by regulatory authorities in both the United States and the European Union (e.g. NICE, 2009b). Large-scale randomized controlled trials (RCTs) show that antidepressants are effective against depression, but they are not miracle medications that work perfectly for everyone (NICE, 2009b; Turner et al., 2008). Research has shown that the best available treatment involves a combination of antidepressants and psychotherapy (and perhaps some exercise), which often works better than either component alone (e.g. Nemeroff et al., 2003).
Depression symptoms are typically measured with validated and reliable rating scales such as the Hamilton Rating Scale for Depression (HRSD), and the impact of a treatment is measured as the change in HRSD score from before to after treatment.
Individual studies can be informative, but the best available evidence comes from systematic reviews that use meta-analytic tools and data from many different studies to boost the accuracy of the conclusions, provided that the meta-analysis is carried out without methodological flaws or substantial bias.
The meta-analytic effect size should be the change in HRSD score if all studies use that scale (Horder et al., 2011). If studies use different rating scales, it is suitable to use a standardized effect size such as Cohen’s d. Cohen’s d is calculated as ([average change in experimental group] – [average change in control group]) / [pooled standard deviation]. Using this as a meta-analytic effect size means calculating one Cohen’s d per study and then weighting together the results from the individual studies by, for instance, the sample size or the inverse standard error of each trial. An effect size of, say, d = 0.5 means that the difference between the average improvements in the two groups is half of a pooled standard deviation in favor of the experimental treatment.
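The procedure described above can be sketched in a few lines of code. This is a minimal illustration, not a full meta-analysis: the three trials below are entirely hypothetical, and sample-size weighting is used as the simpler of the two weighting options mentioned above.

```python
# Correct between-group Cohen's d and a simple sample-size-weighted
# meta-analytic average. All study values are hypothetical.

def cohens_d(change_treatment, change_control, pooled_sd):
    """Between-group standardized effect size for one trial."""
    return (change_treatment - change_control) / pooled_sd

def weighted_meta_d(effect_sizes, sample_sizes):
    """Pool per-study effect sizes, weighting each study by its sample size."""
    total_n = sum(sample_sizes)
    return sum(d * n for d, n in zip(effect_sizes, sample_sizes)) / total_n

# Three hypothetical trials: (treatment change, control change, pooled SD, n)
trials = [(10.0, 7.0, 8.0, 120), (9.0, 6.5, 7.5, 80), (11.0, 8.5, 9.0, 200)]
ds = [cohens_d(t, c, sd) for t, c, sd, _ in trials]
meta_d = weighted_meta_d(ds, [n for *_, n in trials])
print(round(meta_d, 3))  # one pooled effect size across the three trials
```

Real meta-analyses typically use inverse-variance weighting and also report a confidence interval around the pooled estimate, but the basic logic is the same: one effect size per study, then a weighted average.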
Some anti-psychiatry activists (such as Kirsch and Sapirstein (1998) and Kirsch et al. (2008)) even calculate Cohen’s d wrongly: they compute two faulty “effect sizes” per study (one per group) as ([before treatment] – [after treatment]) / [pooled standard deviation {using before and after SDs}], then weight all treatment “effect sizes” and all control “effect sizes” separately and subtract the two averages. This has been shown to systematically underestimate the efficacy of antidepressants (Horder et al., 2011).
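The difference between the two calculations is easy to see with a toy example. The numbers below are invented purely for illustration; with these particular made-up values the flawed within-group figure comes out smaller than the correct between-group d, mirroring the underestimation that Horder et al. (2011) describe.

```python
# Contrast the flawed within-group calculation with the correct
# between-group Cohen's d. All numbers are hypothetical.

def within_group_d(before, after, sd):
    # Flawed: standardizes the pre-post change within ONE group.
    return (before - after) / sd

def between_group_d(change_treatment, change_control, pooled_sd):
    # Correct: standardizes the DIFFERENCE in change between the groups.
    return (change_treatment - change_control) / pooled_sd

# One hypothetical trial: both groups improve (placebo response included),
# but the treatment group improves more.
treat_before, treat_after = 24.0, 14.0  # change of 10 points
ctrl_before, ctrl_after = 24.0, 17.0    # change of 7 points
raw_sd = 10.0        # hypothetical SD of the raw before/after scores
change_sd = 6.0      # hypothetical pooled SD of the change scores

flawed = (within_group_d(treat_before, treat_after, raw_sd)
          - within_group_d(ctrl_before, ctrl_after, raw_sd))
correct = between_group_d(treat_before - treat_after,
                          ctrl_before - ctrl_after, change_sd)
print(round(flawed, 3), round(correct, 3))  # the two procedures disagree
```

The two procedures are simply not equivalent: they standardize different quantities by different standard deviations, so treating the subtracted within-group figures as a between-group Cohen’s d is statistically unsound.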
What is the effect size threshold gambit and why is it flawed?
This anti-psychiatry gambit is based on a faulty threshold for clinical significance originally proposed in 2004 by the National Institute for Health and Care Excellence (NICE): d = 0.5. The anti-psychiatry researchers calculate (sometimes wrongly) the effect size for antidepressants and dismiss the treatment as “clinically insignificant” when it does not reach this cut-off. The gambit did not arise out of nowhere; precursor gambits based on erroneously calculated effect sizes came earlier (see next section).
The effect size threshold gambit is a highly deceptive maneuver engineered to load the dice against psychiatric medication from the very start. This is because the tactic has several fatal flaws:
1. Arbitrary: the cut-off is completely arbitrary, since there is no reason to prefer d = 0.5 over some other value (Möller, 2008). This is the same problem that faces the incessant obsession with getting a p value below 0.05.
2. No evidence: there are no scientific studies that have established that d = 0.5 is a generally reliable indicator of clinical significance (Möller, 2008).
3. Black-and-white: it assumes that clinical significance is a black-and-white issue rather than a matter of degree (Hegerl and Mergl, 2010).
4. Ignores context: it assumes that clinical significance can easily be determined from just looking at effect sizes in relatively short-term studies without taking into account the broader scientific context (Hegerl and Mergl, 2010).
5. Would reject psychotherapy: since psychotherapy has an effect size that is below d = 0.5, this gambit would also lead you to reject psychotherapy (Cuijpers et al., 2010; DeRubeis, Siegle and Hollon, 2008). Since most anti-psychiatry activists do not reject psychotherapy (although some do), their position would instantly detonate as self-contradictory.
6. Would reject much of general medicine: the threshold is set so high that it would force us to reject a considerable portion of general medicine treatments. In general, psychiatric medication and general medicine medication are comparable in terms of efficacy (Leucht et al., 2012).
7. NICE rejects the cut-off: the cut-off in question comes from the NICE Clinical guideline [CG23] called “Depression: management of depression in primary and secondary care” from 2004 (NICE, 2004a; NICE, 2004c, p. 41). However, this was replaced in 2009 with the updated NICE Clinical guideline “Depression in adults: recognition and management”, which does not contain this recommendation (NICE, 2004c; NICE, 2009a). Nor is it included in the full recommendations (NICE, 2009b). In the new guidelines, clinical significance is treated as a spectrum, not an arbitrary cut-off. Thus, referencing the NICE guidelines from 2004 means referencing material that is 13 years old and 8 years out of date. The Jakobsen et al. (2017) paper even uses the old name of NICE, the National Institute of Clinical Excellence, which suggests that the authors may not have visited the NICE website to check its current clinical guidelines.
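The arbitrariness and black-and-white nature of a hard cut-off (flaws 1 and 3 above) can be shown with a toy calculation: merely moving the threshold changes which treatments get branded “clinically insignificant”, even though nothing about the treatments themselves has changed. The effect sizes below are hypothetical, invented purely for illustration.

```python
# Toy demonstration: the verdict delivered by a hard effect size cut-off
# depends entirely on which arbitrary threshold is chosen.
# All effect sizes are hypothetical.

hypothetical_treatments = {
    "treatment A": 0.42,
    "treatment B": 0.35,
    "treatment C": 0.55,
}

for cutoff in (0.5, 0.4, 0.3):
    rejected = sorted(name for name, d in hypothetical_treatments.items()
                      if d < cutoff)
    print(f"cut-off d = {cutoff}: rejected as 'clinically insignificant': {rejected}")
```

A continuous quantity is being forced through a binary filter: a treatment with d = 0.49 is condemned while one with d = 0.51 is blessed, even though the two are practically indistinguishable. That is precisely why clinical significance should be treated as a spectrum.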
How the effect size threshold gambit has evolved
Jakobsen et al. (2017) is not the first time the effect size threshold gambit (or its erroneously calculated effect size precursor) has been deployed. It tends to return about once a decade. It seems to have first occurred in Kirsch and Sapirstein (1998), which was refuted by Klein (1998). It returned in Kirsch et al. (2008) and was refuted by data from Turner et al. (2008) and arguments in Turner and Rosenthal (2008).
Conclusion
When looking at systematic reviews and meta-analyses on the efficacy of antidepressants, identify what effect size (and error bars) the researchers found, whether this effect size was calculated correctly and whether any arbitrary standard for clinical significance was used in a deceptive way. Also compare their findings with the broader conclusions of the scientific literature on the subject.
References and further reading:
Nemeroff, C. B. et al. (2003). Differential responses to psychotherapy versus pharmacotherapy in patients with chronic forms of major depression and childhood trauma. PNAS. 100 (24) 14293-14296.
Cuijpers P., van Straten A., Bohlmeijer E., Hollon S.D., Andersson G. (2010). The effects of psychotherapy for adult depression are overestimated: a meta-analysis of study quality and effect size. Psychol Med. 40(2):211-23.
DeRubeis, R. J., Siegle G. J. and Hollon, S. D. (2008). Cognitive therapy versus medication for depression: treatment outcomes and neural mechanisms. Nature Reviews Neuroscience 9, 788-796.
Hegerl, U. and Mergl, R. (2010). The clinical significance of antidepressant treatment effects cannot be derived from placebo-verum response differences. Journal of Psychopharmacology. 24(4) 445–448.
Horder J., Matthews P., Waldmann R. (2011). Placebo, prozac and PLoS: significant lessons for psychopharmacology. J Psychopharmacol. 25(10):1277-88.
Jakobsen, J. C. et al. (2017). Selective serotonin reuptake inhibitors versus placebo in patients with major depressive disorder. A systematic review with meta-analysis and Trial Sequential Analysis. BMC Psychiatry. 17:58.
Kirsch, I. and Sapirstein, G. (1998). Listening to Prozac but hearing placebo: A meta-analysis of antidepressant medication. Prevention & Treatment, Vol 1(2).
Kirsch I., Deacon B.J., Huedo-Medina T.B., Scoboria A., Moore T.J., et al. (2008). Initial Severity and Antidepressant Benefits: A Meta-Analysis of Data Submitted to the Food and Drug Administration. PLoS Med 5(2): e45.
Klein, D. F. (1998). Listening to meta-analysis but hearing bias. Prevention & Treatment, Vol 1(2).
Leucht, S., Hierl, S., Kissling, W., Dold, M., Davis, J. M. (2012). Putting the efficacy of psychiatric and general medicine medication into perspective: review of meta-analyses. The British Journal of Psychiatry. 200 (2) 97-106.
Möller, H. J. (2008). Isn’t the efficacy of antidepressants clinically relevant? A critical comment on the results of the metaanalysis by Kirsch et al. 2008. European Archives of Psychiatry and Clinical Neuroscience. 258 (8), 451–455.
NICE. (2004a). Depression: management of depression in primary and secondary care (before removal). Accessed: 2017-02-18.
NICE. (2004b). Depression: management of depression in primary and secondary care (current website). Accessed: 2017-02-18.
NICE. (2004c). Depression: management of depression in primary and secondary care (full guidelines, outdated). Accessed: 2017-02-18.
NICE. (2009a). Depression in adults: recognition and management. Accessed: 2017-02-18.
NICE. (2009b). Depression: the treatment and management of depression in adults (updated edition). Accessed: 2017-02-18.
Passer, M., Smith, R., Holt, N., Bremner, A., Sutherland, E., & Vliek, M. (2009). Psychology: The Science of Mind and Behavior. New York: McGraw-Hill Education.
Turner, E. H. and Rosenthal, R. (2008). Efficacy of antidepressants is not an absolute measure, and it depends on how clinical significance is defined. BMJ. 336:516-7.
Turner, E. H., Matthews, A. M., Linardatos, E., Tell, R. A., & Rosenthal, R. (2008). Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy. New England Journal of Medicine, 358(3), 252-260.