Debunking Denialism

Fighting pseudoscience and quackery with reason and evidence.

Category Archives: Misuse of Statistics

How Anti-Immigration Activists Misuse Lethal Violence Statistics

Lethal violence in Sweden

The Internet is a wonderful thing. It has put the combined knowledge of the human species into a format that is easily accessible to billions of people hungry for scientific and historical facts about the world. The Internet, however, has also brought with it the possibility of spreading misinformation and nonsense at a rate that was never before possible. Someone can post an incendiary fake news story that inspires fear and anger about some real or imagined event during breakfast, and before evening has set in, the story has been shared hundreds of thousands of times on social media, provoked and misled millions of people, and sometimes even made it into the mainstream media.

Sweden has recently become a target of various political outbursts designed to spread fear, anger and misinformation about refugees and immigrants. In reality, Sweden is a country that has not declared a single war since 1814 and is one of the best countries in the world to live in based on dozens of different metrics such as safety, education, health care, happiness and so on. In the dark and damp places of the Internet, however, Sweden is wrongly portrayed as a hellhole where murder and rape are out of control, criminal gangs have taken permanent control over several dozen areas and the radical feminist government and the media are actively covering it all up.

The reality, of course, is entirely different. Anti-immigration activists abuse rape statistics: the rape definition used in Swedish law has been expanded multiple times since the late 1990s, the propensity to report rape has doubled in recent years, and the police record each individual case as a separate police report. Two reports published by the Swedish national police have shown that although there are social problems in especially vulnerable areas, the idea that they are somehow no-go zones is a propaganda myth, and the police work there every single day.


More Deadly Doctor Gambits: Mortality Rates and Doctors on Strike

a deadly doctor?

Mainstream medicine and medical doctors are increasingly under attack. Homeopaths, naturopaths and other groups pretend to have real medical knowledge and subject vulnerable and sick people to fake “treatments” that have never been scientifically tested, or have been tested and shown not to work.

Concerned parents think that a few hours spent reading blogs written by medically unqualified individuals without any credible sources gives them more knowledge and diagnostic skills than medical doctors. Cranks on the Internet who are literally making stuff up about avocados, sun-staring and bleach as a cure for all diseases amass an audience of millions for their nonsense assertions that fly in the face of rational thinking and published scientific evidence.

Even more sinister, quacks attempt to portray medical doctors as a dangerous threat to the health and life of patients. This is often done by deploying the so-called deadly doctor gambit (originally discussed by C0nc0rdance), which falsely claims that doctors are more dangerous than guns. In reality, this argument falls apart when you realize that there are 33 million hospitalizations per year in the U.S., so the relative risk from guns is much higher than the risk from doctors. As an analogy, more people die from car accidents than from motorcycle accidents, but since there are many more people driving cars than motorcycles, motorcycles are more dangerous per driver.
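The motorcycle analogy can be made concrete with a toy per-driver calculation; the death and driver counts below are invented round numbers for illustration, not actual statistics:

```python
# Invented round numbers purely to illustrate the base-rate point:
# raw death counts mean little without the number of people exposed.
car_deaths, car_drivers = 35_000, 220_000_000
moto_deaths, moto_drivers = 5_000, 8_500_000

car_rate = car_deaths / car_drivers     # deaths per car driver
moto_rate = moto_deaths / moto_drivers  # deaths per motorcycle driver

print(f"cars:        {car_rate:.6f} deaths per driver")
print(f"motorcycles: {moto_rate:.6f} deaths per driver")

# Cars account for far more deaths in absolute numbers, yet the
# per-driver risk of motorcycles is several times higher.
assert moto_deaths < car_deaths and moto_rate > car_rate
```

The same normalization applies to the deadly doctor gambit: deaths occurring under medical care must be set against tens of millions of hospitalizations before any comparison with gun deaths makes sense.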

Thus, many of the hateful alternative medicine proponents that would say almost anything to make doctors seem evil really just ignorantly misunderstand and abuse basic statistical considerations.


At Least 60% of Reported Shootings in Malmö Not Actual Shootings

Malmo gun violence

International news media seem to think that the Swedish city of Malmö is being overrun by gun violence. In reality, the definition of a shooting differs between Malmö and the other two main cities. In Malmö, an actual gun does not have to have been fired, and there is no requirement for forensic evidence or eyewitness testimony. The geographical areas also differ, with the Malmö figures covering only the main city and the urban area Arlöv. Once researchers looked through the data and counted the number of actual shootings, the figures dropped by 60-75%. The Malmö police do offer some justifications for their classification scheme, but in the end, organizations that gather statistics have an intellectual responsibility to ensure that their data are not easily abused, by being clear about definitions and about what can and cannot be inferred from the data.

A recent analysis by the Swedish Council for Crime Prevention was covered in the newspaper Sydsvenskan. The sad reality is that fear propaganda gets front page news, whereas a careful statistical analysis gets only a small notice in most papers.

What is a shooting event?

Perhaps surprisingly, the definition of a shooting differs drastically between the three largest cities in Sweden (Stockholm, Göteborg and Malmö). In Stockholm and Göteborg, a shooting requires two criteria: (1) the discharge of a gunpowder-loaded projectile and (2) corroborating forensic evidence or eyewitness (or earwitness) testimony. In Malmö, many other things are also classified as shootings, such as the firing of airsoft guns or slingshots, damage to windows that looks like a bullet hole, and even damage done by stone chips. Thus, the figures cannot be naively compared, because they do not measure the same things.
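The difference can be sketched as a toy classification exercise; the events, fields and counts below are invented for illustration only:

```python
# Hypothetical reported events; the fields and figures are invented
# purely to illustrate how a broader definition inflates the count.
reports = [
    {"projectile": "bullet", "corroborated": True},
    {"projectile": "bullet", "corroborated": False},  # no witness/forensics
    {"projectile": "airsoft pellet", "corroborated": True},
    {"projectile": "stone chip", "corroborated": False},
    {"projectile": "bullet", "corroborated": True},
]

# Stockholm/Göteborg-style: gunpowder-fired projectile AND corroboration.
strict = [r for r in reports
          if r["projectile"] == "bullet" and r["corroborated"]]

# Malmö-style: essentially every suspected impact counts.
broad = reports

print(f"broad count: {len(broad)}, strict count: {len(strict)}")
# Identical underlying events, very different headline figures.
```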


Apparently, NHST Defenders Could Get Even More Ridiculous

Häggström and NHST, again

Looks like Häggström has decided to rejoin the crucial discussion of p values and NHST, despite refusing to continue after our last encounter because he claimed (without evidence) that my writings were a “self-parody”. This is reminiscent of childish and narcissistic posters on Internet forums who write a post about how they are leaving the forum because of this or that perceived insult, yet stay around to continue posting. Tiresome and pathetic, especially since he apparently considers a link to the ASA position statement on Twitter to be equivalent to “spewing Twitter bile”. Talk about being easily offended by even the smallest amount of (fair) criticism.

Häggström recently managed to get a paper defending NHST published in the journal Educational and Psychological Measurement. Perhaps “managed” is not quite the right word, as it is a journal with a very low impact factor of 1.154 that sits in the middle or bottom half of journal rankings in mathematical psychology (8 out of 13), educational psychology (30 out of 50) and interdisciplinary applications of mathematics (46 out of 99). Perhaps a low-quality psychology journal is the only place Häggström can get his rabid defense of NHST published? Well, that and a paper from a conference held in Poland. Not exactly impressive stuff.

Ironically, on the very same day he wrote his blog post about his new “paper”, the prestigious American Statistical Association published a position statement severely criticizing NHST. A previous article on this blog discusses several aspects of it in greater detail. Häggström claims that he agrees with the ASA, yet his paper in EPM attempts to refute NHST critics, both those he calls “strongly anti-NHST” and those he labels “weakly anti-NHST”.

Some of the problems with Häggström's new NHST defense

There are too many errors and problems in his paper to recount in this space, but we can look closer at a couple of them:

(1) Häggström presents the NHST situation as a debate, thereby committing the fallacy of false balance.

There is no debate about NHST. The vast majority of published papers discussing NHST are very critical, and hundreds upon hundreds of such papers have appeared in the past 20 years. Today, papers defending NHST are few and far between. This shows that Häggström does not have a sufficient command of the NHST literature, which is, as we shall see, a recurring theme. It also demonstrates that he most likely deliberately deploys a pseudoscientific debating method against his opponents: false balance. This is because he, as a self-identified scientific skeptic with substantial experience fighting climate change denialists, knows full well that it is socially effective to undermine a scientific consensus position by portraying it as if there were a debate with two equally legitimate sides. There is not.


American Statistical Association Seeks “Post p < 0.05” Era

American Statistical Association

The edifice of null hypothesis significance testing (NHST) is shaken to its core once more. On March 6th, the American Statistical Association (ASA) revealed to the world that it had had enough. For the first time since its founding in 1839, it published a position statement and issued recommendations on a statistical issue. That issue was, of course, p values and statistical significance. The position statement came in the form of a paper in one of its journals, The American Statistician, together with a press release on the ASA website. The executive director of the ASA, Ron Wasserstein, also gave an interview to Alison McCook at Retraction Watch, and the Nature website published a news item about it.

What was the central point of the position statement?

The press release (p. 1) summed it up quite nicely:

“The p-value was never intended to be a substitute for scientific reasoning,” said Ron Wasserstein, the ASA’s executive director. “Well-reasoned statistical arguments contain much more than the value of a single number and whether that number exceeds an arbitrary threshold. The ASA statement is intended to steer research into a ‘post p <0.05 era.'"

In other words, the ASA acknowledges that p values were never supposed to be the central way to evaluate research results, that basing conclusions on p values, and especially on whether or not the results are statistically significant, cannot be considered well-reasoned, and finally, that the scientific community should move in a direction that severely de-emphasizes p values and statistical significance. Coming from a world-renowned statistical association, this is a stunning indictment of the mindless NHST ritual.

The final paragraph of the preamble to the position statement (p. 6) also points out that this criticism of NHST is not new:

Let’s be clear. Nothing in the ASA statement is new. Statisticians and others have been sounding the alarm about these matters for decades, to little avail. We hoped that a statement from the world’s largest professional association of statisticians would open a fresh discussion and draw renewed and vigorous attention to changing the practice of science with regards to the use of statistical inference.

ASA seems to share the sentiment among many critics of NHST, namely that there are several valid objections to NHST and that these have been raised as very serious problems for many decades with very little progress.


How Anti-Immigration Activists Misuse Rape Statistics

Nationella Trygghetsundersökningen

The Internet has brought an enormous mass of knowledge to the fingertips of everyone with a computer, smartphone or tablet. Never before have so many individuals been so close to true scientific facts about the world, from fun facts about animals to the latest crime statistics. Large communities with blogs, forums and social media groups have grown up around a wide variety of special interests and it has become a powerful tool for communication, cooperation and the advancement of human knowledge.

However, this has also led to the creation of ideologically isolated Internet communities, where faulty claims and misunderstandings of statistics and empirical evidence get repeated in an endless echo chamber, and all refutations are either ignored, misrepresented or subjected to ideologically driven rejection, often with stale references to supposed “political correctness”, as if that were a statistically mature rebuttal.

This article will show that according to crime victim surveys, the actual rate of sex crimes was more or less unchanged in Sweden between 2005 and 2014, despite the fact that immigration increased during the same period. Instead, the increasing rates of reported rapes are influenced by the expansion of the legal rape definition, an increase in the tendency to report rapes, police efforts to register each individual rape as a separate crime, and the police tendency to classify any sex crime that could potentially be rape as rape. It will also demonstrate that reported rates in countries such as Sweden and Denmark cannot be naively compared, due to the large differences in legal rape definitions and police registration methods.


New Nature Methods Paper Argues that P Values Should be Discarded

Fickle P Values

In the wake of the recent discussions about null hypothesis statistical significance testing and p values on this website, Häggström has decided not to respond beyond calling the latest installment in the series nothing more than a “self-parody”. No substantial statistical or scientific arguments were presented. Despite his unilateral surrender, it can be informative to examine a methods paper entitled “The fickle P value generates irreproducible results”, written by Halsey, Curran-Everett, Vowler and Drummond (2015) and just published in the renowned journal Nature Methods, that slams the usage of p values. The authors even call for a wholesale rejection of p values, writing that “the P value’s preeminence is unjustified” and encouraging researchers to “discard the P value and use alternative statistical measures for data interpretation”.

As expected, it corroborates and confirms a large number of arguments I presented in this exchange and directly contradicts many of the flawed assertions made by Häggström. In fact, the paper goes even further than I have, (1) showing that p values are unstable under direct replication even at levels of statistical power that are generally considered acceptable (e.g. 0.8), (2) that p values are probably superfluous for analyses with adequate statistical power and (3) that previous research that relied on p values needs to be reexamined and replicated with proper methods for statistical analysis.
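The instability point can be illustrated with a small stdlib-only simulation. This is a sketch, not the paper's exact setup: it assumes a two-sample z-test with known variance, with the per-group sample size chosen to give roughly 80% power for a true effect of 0.5 SD at alpha = 0.05:

```python
import math
import random

random.seed(1)

def two_sample_z_p(xs, ys, sigma=1.0):
    """Two-sided p-value for a two-sample z-test with known sigma."""
    n, m = len(xs), len(ys)
    z = (sum(xs) / n - sum(ys) / m) / (sigma * math.sqrt(1 / n + 1 / m))
    return math.erfc(abs(z) / math.sqrt(2))

# True effect d = 0.5 SD; n = 64 per group gives roughly 80% power
# at alpha = 0.05 (standard back-of-the-envelope figures).
d, n, runs = 0.5, 64, 1000
pvals = sorted(
    two_sample_z_p([random.gauss(d, 1) for _ in range(n)],
                   [random.gauss(0, 1) for _ in range(n)])
    for _ in range(runs)
)

print(f"median p: {pvals[runs // 2]:.4g}")
print(f"middle 80% of p-values: {pvals[runs // 10]:.4g} .. {pvals[9 * runs // 10]:.4g}")
# Replications of the very same true effect scatter p-values across
# several orders of magnitude, even at nominally adequate power.
```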


Häggström on NHST: Once More Unto the Breach

Häggström, round four

It appears that Häggström still refuses to address the major criticisms laid out against NHST. In the addendum he wrote to his previous post, he continues to engage in personalities and develops his tendency to mischaracterize my position into a genuine art form. Contrary to Häggström, I actually do think that people can often be statistically or scientifically non-naive yet promote naive beliefs and positions. That is the very definition of selective skepticism, which we all know is widespread. By claiming that “accept” is a legitimate NHST synonym for “not reject”, Häggström inadvertently shows that NHST has to carry some of the responsibility for common misconceptions, such as confusing statistical non-significance with equivalence. I go into greater detail about how the popular R. A. Fisher quote, that statistical significance means either that the null hypothesis is false or that something unlikely has occurred, is itself false, using the counterexample of large sample sizes. Finally, I reiterate the many criticisms that Häggström has either failed to respond to, or “responded to” by making faulty straw man assertions about what my position was.

Häggström, despite correction, fails to distinguish between person and argument

His recent response comes with another spate of attempted insults and engagement in personalities. This time, he alleges that I am a “disinformant” and a “silly fool”. Not only that, he has now started complaining that my tone is “precocious” and “patronizing”. He even goes so far as to arbitrarily attribute emotions to me when he claims that I “angrily attack” NHST. Yet none of this constitutes an actual substantive argument. None of it implies that any of my arguments are mistaken, and none of it implies that Häggström is correct.

I can think of nothing else to do but quote what I wrote in the latest post: “The more my opponents dwell on my alleged personal traits or failings and make liberal use of invectives, the more they demonstrate that they (1) are unable to distinguish between an argument and the person making that argument, (2) have reduced capacity for emotional regulation and (3) tacitly admit that they do not have much in the way of substantive arguments against my position. Their behavior does not harm me in any way. In fact, I find it endlessly entertaining. All they are doing is harming their own capacity to accurately perceive reality.” Not that I think the message will get across. So by all means, I hope that Häggström continues to engage in personalities, since it is just a repeated demonstration of (1)-(3). He is doing all the work for me. Fantastic.


Häggström Disrobed on NHST

Häggström, round three

In previous posts, I criticized the doomsday arguments made by some NHST statisticians about the recent banning of null hypothesis significance testing (NHST) and debunked the objections leveled against Geoff Cumming’s dance of the p values argument. This has now drawn the attention of mathematical statistician Olle Häggström and prompted him to write a response post to yours truly. He spends most of the post engaging in personalities and raving about perceived injustices he thinks I subjected him to, but he eventually discusses two examples where he thinks I have gone astray. Unfortunately, his first example is a trivial misreading of what I wrote as well as a quotation out of context. The second example, where he provides a situation in which he thinks NHST is essential, is only slightly better. In the end, he fails to successfully rebut any of my substantial arguments.

Häggström excessively engages in personalities

Because I have argued against pseudoscience for many years, I have developed a thick skin and a laser-like mentality trained at cutting through nonsense. The more my opponents dwell on my alleged personal traits or failings and make liberal use of invectives, the more they demonstrate that they (1) are unable to distinguish between an argument and the person making that argument, (2) have reduced capacity for emotional regulation and (3) tacitly admit that they do not have much in the way of substantive arguments against my position. Their behavior does not harm me in any way. In fact, I find it endlessly entertaining. All they are doing is harming their own capacity to accurately perceive reality.


The Laughable Desperation of NHST Proponents

Häggström again

In a previous post, the many insurmountable flaws and problems of null hypothesis statistical significance testing (NHST) were discussed, such as the fact that p values are only indirectly related to the posterior probability, that almost all null hypotheses are false and irrelevant, that NHST contributes to black-and-white thinking about research results, that p values depend strongly on sample size, and that the method is unstable with regards to replication. For most realistic research designs, it is essentially a form of Russian roulette. After a mediocre effort, mathematical statistician Olle Häggström failed to defend p values and NHST from this onslaught. Now, he has decided to rejoin the fray with yet another defense of NHST, this time targeting the dance of the p values argument made by Geoff Cumming. Does his rebuttal hold water?

Arguing from rare exceptions does not invalidate a general conclusion

Häggström seems to be under the impression that if he can find rare and complicated counterexamples, he can undermine the entire case for confidence intervals [being generally superior to p values, see clarification here] (all translations are my own):

To calculate a confidence interval is akin to calculating p values for all possible parameter values simultaneously, and in more complex contexts (especially when more than one unknown parameter exists) this is often mathematically impossible and/or leads to considerably more complicated and difficult-to-interpret confidence regions than the nice intervals that are obtained in the video.

This is perhaps due to his background in mathematics, where a single counterexample really does disprove a general claim. For instance, the function f(x) = |x| is continuous but not differentiable, thus disproving the claim that continuity implies differentiability. In the case of confidence intervals, on the other hand, the fact that they work in cases with a single parameter is enough to justify their usage there. Keeping in mind that the vast majority of experiments done in e.g. medicine are probably not complicated estimations of multiple population parameters, but more akin to measuring the effects of a medication compared with a placebo, the superiority of confidence intervals over p values stands for a large portion of experiments. Yes, we obviously need more sophisticated statistical tools in more complicated experiments, but that is not a valid argument against confidence intervals in the settings where they can be calculated and where they do work.
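For the common single-parameter case described above, here is a minimal sketch of what an interval reports that a bare p value does not; the effect size, sample size and known-variance assumption are all illustrative:

```python
import math
import random

random.seed(7)

# Hypothetical trial: the treatment shifts the outcome by 0.4 SD.
n = 50
treat = [random.gauss(0.4, 1) for _ in range(n)]
placebo = [random.gauss(0.0, 1) for _ in range(n)]

diff = sum(treat) / n - sum(placebo) / n
se = math.sqrt(1 / n + 1 / n)            # known sigma = 1 for the sketch
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"estimated difference: {diff:.2f}")
print(f"95% confidence interval: [{lo:.2f}, {hi:.2f}]")
# The interval conveys both the size of the effect and the precision
# of the estimate; a p value alone conveys neither.
```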

Finally, Häggström continues to deny that confidence intervals can be dislodged from the framework of NHST.

Debunking Statistically Naive Criticisms of Banning P Values

Häggström's claims about NHST

Olle Häggström is a mathematical statistician at Chalmers University of Technology and a prominent scientific skeptic. His projects and papers relevant to skepticism include several hard-hitting defenses of good science, such as opposing pseudoscience about climate change, criticizing the encroachment of postmodernism into higher education and exposing the intelligent design creationist abuse of the No Free Lunch (NFL) theorems. However, he also promotes unsupported beliefs about NHST, mathematical platonism and artificial general intelligence, thus making him another example of an inverse stopped clock.

Recently, Häggström wrote a credulous blog post in which he exclaimed that banning NHST would constitute intellectual suicide by the journal BASP. In it, he repeats a number of errors he has made before and adds a few others.

The only things about NHSTP and confidence intervals that are “invalid” are certain naive and inflated ideas about their interpretation, held by many statistically illiterate scientists.

In this sentence, Häggström deploys the classic rhetorical technique whereby he says that the NHST procedure itself is not flawed, only that many scientists misuse it. This was refuted in a previous post on Debunking Denialism that strongly criticized NHST: “[a] method like NHST that has such a strong potential for misunderstandings and abuse even among a large proportion of the most highly intelligent and highly educated has to accept a large proportion of the blame.” But even if we ignore that, NHST is flawed for a great number of reasons.

First, the p value is only indirectly related to the posterior probability. This means that a low p value is not a good argument against the null hypothesis because the alternative hypotheses might be even more unlikely. If you test homeopathy for cancer or the alleged psychic ability of someone, it is not really that impressive to find a p value that is lower than 0.05 (or lower than 0.0001 or whatever). Even testing moderately unlikely hypotheses (with an empirical prior of anywhere between, say, 10% and 30%) means that the p value is not a good measurement of posterior probability.
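This first point can be made concrete with a back-of-the-envelope Bayes calculation; the priors, the 80% power figure and the 0.05 threshold are illustrative assumptions:

```python
def prob_true_given_significant(prior, power=0.8, alpha=0.05):
    """P(alternative hypothesis true | significant result), via Bayes' theorem."""
    true_pos = prior * power          # true effects that reach significance
    false_pos = (1 - prior) * alpha   # null effects that reach significance
    return true_pos / (true_pos + false_pos)

for prior in (0.001, 0.1, 0.3):
    post = prob_true_given_significant(prior)
    print(f"prior {prior:>5}: posterior given p < 0.05 is {post:.2f}")
# For a wildly implausible claim (prior 0.001, think homeopathy for
# cancer), a significant result still leaves the claim very unlikely.
```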

Second, null hypotheses are almost always both false and irrelevant.

Why P-Values and Statistical Significance Are Worthless in Science

P-values are scientifically irrelevant

Why should we test improbable and irrelevant null hypotheses with a chronically misunderstood and abused method with little or no scientific value that has several, large detrimental effects even if used correctly (which it rarely is)?

During the past 60+ years, scientific research results have been analyzed with a method called null hypothesis significance testing (NHST), which produces p-values by which the results are then judged. However, it turns out that this is a seriously flawed method. It does not tell us anything about how large the difference was, how precisely it was estimated, or what it all means in the scientific context. It tests false and irrelevant null hypotheses. P-values are only indirectly related to posterior probability via Bayes’ theorem, the p-value you get for a specific experiment is often determined by chance, and the alternative hypotheses might be even more unlikely. NHST increases the false positive rate in published papers, contributes to publication bias, and causes published effect sizes to be overestimated and imprecise. It is also a method that most researchers do not understand, neither the basic definitions nor what a specific p-value means.

This article surveys some of these flaws, misunderstandings and abuses and looks at what the alternatives are. It also anticipates some of the objections made by NHST supporters. Finally, it examines a case study consisting of an extremely unproductive discussion with an NHST statistician. Unsurprisingly, this NHST statistician was unable to provide a rationally convincing defense of NHST.

Why NHST is seriously flawed

There are several reasons why NHST is a flawed and irrational technique for analyzing scientific results.

Statistical significance does not tell us what we want to know: A p-value tells us the probability of obtaining at least as extreme results, given the truth of the null hypothesis. However, it tells us nothing about how large the observed difference was, how precisely we have estimated it, or what the difference means in the scientific context.

The vast majority of null hypotheses are false and scientifically irrelevant: It is extremely unlikely that two population parameters would have the exact same value. There are almost always some differences. Therefore, it is not meaningful to test hypotheses we know are almost certainly false. In addition, rejection of the null hypothesis is almost guaranteed if the sample size is large enough. In science, we are rarely interested merely in finding out whether e.g. a medication is better than placebo; we want to know how much better. Therefore, non-nil null hypotheses might be of more interest. Instead of testing whether a medication is equal to placebo, it can be more important to test whether a medication is better than placebo by a clinically meaningful amount.
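The large-sample point can be sketched analytically: fix a trivially small true difference and watch the expected p-value collapse as the per-group sample size n grows (a two-sample z-test with known variance is assumed for simplicity):

```python
import math

def expected_p(d, n):
    """Two-sided p-value at the expected z statistic for a true
    standardized difference d with n observations per group."""
    z = d / math.sqrt(2 / n)
    return math.erfc(z / math.sqrt(2))

# A 0.01 SD difference is scientifically negligible, yet...
for n in (1_000, 100_000, 10_000_000):
    print(f"n = {n:>10,}: expected p = {expected_p(0.01, n):.3g}")
# With enough data, even a meaningless difference is all but guaranteed
# to come out "statistically significant".
```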


Choking the Black Swan: GM Crops and Flawed Safety Concerns

Failure of the precautionary principle

Despite the fact that the technology behind genetically modified crops has been around as long as the Commodore 64 and has been shown to be safe in hundreds of studies, anti-GM activists continue to spread misinformation.

Recently, a paper on the precautionary principle in relation to genetically modified foods has been making rounds in the anti-GM social media circles. One of the authors is statistician Nassim Nicholas Taleb, who has previously written books such as The Black Swan on the impact of low-probability events. The other two authors are physicist Yaneer Bar-Yam, and politician-philosopher Rupert Read. They attempt to develop an improved version of the precautionary principle in an effort to undermine the usage of GM crops.

What can a thinly veiled anti-GM paper written by a physicist, a politician and a statistician teach us about the risks of genetically modified foods? Unfortunately, it is just more of the same illusory sophistry common among anti-GM activists.


Risk Factors: Misunderstandings and Abuses

Risk factors

Although risk factors occupy a central place in medical and epidemiological research, the risk factor is also one of the most misunderstood concepts in all of medicine.

The World Health Organization (2009) puts it this way: “A risk factor is any attribute, characteristic or exposure of an individual that increases the likelihood of developing a disease or injury. Some examples of the more important risk factors are underweight, unsafe sex, high blood pressure, tobacco and alcohol consumption, and unsafe water, sanitation and hygiene.” The CDC (2007) offers a similar definition: “an aspect of personal behavior or lifestyle, an environmental exposure, or a hereditary characteristic that is associated with an increase in the occurrence of a particular disease, injury, or other health condition.” However, the CDC also uses the term risk factor in the context of sexual violence. For instance, it considers alcohol and drug use, antisocial tendencies, hostility towards women, and community-level tolerance of sexual violence to be risk factors.

Based on these sources, we can develop a simplified definition of a risk factor: if A is a risk factor for B, then the presence of A increases (but not necessarily in a causal sense) the probability of B occurring.

A is a risk factor for B does not necessarily mean that A causes B. It might be the case that A causes B only indirectly via some third factor, that B causes A, or that some third factor causes both A and B. In other words, correlation does not on its own imply causation. However, it is possible to disentangle these possibilities by measuring B at the start of the study. If physical punishment of children is a risk factor for aggressiveness, we can find out what comes first by measuring baseline child aggressiveness.
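The third-factor scenario can be sketched with a toy simulation in which a confounder C drives both A and B; all numbers are invented for illustration:

```python
import random

random.seed(3)

# A third factor C drives both A and B; neither A nor B causes the other.
n = 10_000
c = [random.gauss(0, 1) for _ in range(n)]
a = [ci + random.gauss(0, 1) for ci in c]
b = [ci + random.gauss(0, 1) for ci in c]

def corr(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

print(f"corr(A, B) = {corr(a, b):.2f}")
# A shows up as a "risk factor" for B (theoretical correlation 0.5)
# despite zero causal link; measuring C is what disentangles the cases.
```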

A is a risk factor for B does not mean that A will cause B in every instance of A. Smoking causes lung cancer, but some smokers can smoke all their lives without developing lung cancer. This does not mean that smoking is not a cause of lung cancer. It just means that there are other factors that also play a role. It is common for pseudoscientific cranks to bring up exceptions of this kind to argue against a correlational or causal association in an effort to spread uncertainty and doubt.

Investigative Skepticism Versus the Mass Media

Relationship violence against men

We are constantly being bombarded with messages from newspapers, television, blogs and social media sites like Facebook and Twitter about alleged facts, recently published scientific studies and government reports. With the knowledge that the mass media often get things wrong when it comes to science, how can you separate the signal from the noise?

One popular approach is to check what many different news organizations have to say about the issue. However, this ignores the fact that many websites just rewrite stories they have seen on other websites. Some even go so far as to simply copy and paste press releases. In the fast-paced world we live in, getting the “information” out there as fast as possible has apparently come to trump scientific and statistical accuracy. This problem is aggravated when the misinterpretation fits snugly within a particular political or philosophical worldview (e.g. some conservative groups and climate change denialism). Another approach is limiting yourself to only reading news from websites that fit your own positions. However, this leaves you open to considerable bias. The classic example is anti-immigration race trolls who only read “alternative media”, which tend to twist many of the news items they publish to fit their agenda. A third approach is a combination of the two above: only believe things that news organizations with radically different stances agree on. The downside is that this almost never happens with issues that are scientifically uncontroversial but controversial in the eyes of the public (climate change being the obvious example).

This post will outline an explicit investigative method based on scientific skepticism, designed to find out the truth behind popular stories on science. To illustrate it, a case study of the mass media treatment of two new Swedish studies on relationship violence against men will be described.
