Harbingers of Doom – Part V: Botching Philosophy of Science

Here Be Dragons

Previously, we have dealt with a broad range of issues such as the intricate details about medieval maps, biological weapons of mass destruction, anti-psychiatry nonsense about psychopharmacology and changes in diagnostics of social anxiety, misunderstandings of heritability and the question of whether repeated selection of embryos can produce massive gains in IQ, the biological basis of the mind, cryogenically freezing your dying body, uploading your consciousness to a computer server, superintelligent AI risk and the futility of atomically precise manufacturing, at least as traditionally conceived.

In this latest installment, we look at everything from ancient science to statistical significance. Was there really no science in antiquity, with almost all philosophers just sitting around and thinking about stuff? Does science desperately need induction? What does it mean for evidence to independently converge on the same general conclusion? What about inferences to the best explanation? Is past experience of dawn the only reason why we might suspect that dawn will also occur tomorrow? Does scientific research fail because the observation of a yellow banana allegedly supports the hypothesis that all ravens are black? What is falsifiability? Why is the Duhem-Quine thesis not a large threat to science? How do we know that solipsism is incoherent? We also revisit our favorite bad statistical method, NHST, which Häggström continues to defend tooth and nail.

Section XLI: Science in ancient antiquity

Häggström repeats the classic myth that there was hardly any empirical scientific research in antiquity (p. 140) because almost all philosophers just sat around and thought about stuff instead of testing their ideas (he mentions Sextus Empiricus as an exception in passing in footnote 320 on the same page). This is historically false: a great deal of empirical research and development was being done in antiquity.

Carrier (2010, pp. 396-419) mentions several examples: Aristotle (animal dissection and live experimentation), Theophrastus (plant physiology, botany, mineralogy), Strato of Lampsacus (physics), Ctesibius and Philo (experimental pneumatics), Eratosthenes (cartography, estimating the circumference of the planet, effect of the moon on tides), Herophilus (dissection of dead humans), Erasistratus (early mapping of brain function and nerves), Aristarchus (distances between heavenly bodies and proposed heliocentrism), Archimedes (mechanics and hydrostatics), Hipparchus (discovered precession and supernovas, predicted solar and lunar eclipses), Seleucus of Babylon (effect of the sun on tides), Dioscorides (botany, mineralogy, pharmacology), Hero (pneumatics), Ptolemy (astronomy, optics, cartography) and Galen (anatomy, physiology and medicine).

These are just a few examples and we have not even begun to look at the ancient Middle East or China. Had Häggström consulted the literature, he would have found source material such as:

Keyser, P. T., & Irby-Massie, G. L. (Eds.). (2009). Encyclopedia of Ancient Natural Scientists: The Greek Tradition and its Many Heirs. New York: Routledge. (1072 pages)

Rihll, T. (1999). Greek Science. New York: Oxford University Press.

…or any of the books in the Routledge Sciences of Antiquity Series, such as “Ancient Botany”, “Ancient Medicine”, “Ancient Meteorology”, “Ancient Natural History”, or “Cosmology in Antiquity”.

The belief that there was no or very little empirical science in antiquity is essentially a historical fabrication. Because Häggström did not properly research this area (a recurring theme when it comes to areas outside his immediate expertise), he is vulnerable to making these kinds of blatant errors, as he was with dragons in ancient maps (as we saw in the first part of this series) and with most of his claims about biology.

Section XLII: An elusive Feynman quote

A few pages later (p. 143), Häggström presents us with a quote he attributes to Feynman: “Science is what we have learned about how to keep from fooling ourselves”. However, he admits that he has no idea where this quote is from (footnote 324, p. 143). Then how can he be so sure it was said by Feynman? This might seem like a tiny thing to complain about, but it is part of a consistent pattern where Häggström does not properly research his claims.

One of the best resources for finding and verifying quotes is Wikiquote. The closest thing Feynman actually said is “The first principle is that you must not fool yourself, and you are the easiest person to fool.” It originally comes from a talk called “What is and What Should be the Role of Scientific Culture in Modern Society”, given at the Galileo Symposium in Italy in the early 1960s. It also appears in a commencement address at Caltech (Cargo Cult Science) a decade later and in print in the book “Surely You’re Joking, Mr. Feynman!” (1985, p. 343).

I suspect that the quote was invented by grabbing the gist of the following part of the commencement address:

But this long history of learning how to not fool ourselves—of having utter scientific integrity—is, I’m sorry to say, something that we haven’t specifically included in any particular course that I know of. We just hope you’ve caught on by osmosis.

The first principle is that you must not fool yourself—and you are the easiest person to fool. So you have to be very careful about that. After you’ve not fooled yourself, it’s easy not to fool other scientists. You just have to be honest in a conventional way after that.

This kind of analysis would have been simple for Häggström to perform; it was only a few keystrokes away.

Section XLIII: Abduction

Häggström claims that scientific arguments are typically classified as either deductive or inductive (p. 143), but this is largely an outdated view. For instance, hypothetico-deductive arguments use both (deduction for deducing a prediction from a model, induction to make claims about broader reality from a disconfirming observation). It is also wrong because modern scientific arguments typically rely on what we might call “independently converging evidence from multiple sources” and “inference to the best explanation” (or abduction). No single fossil, no matter how impressive, demonstrates common descent, and no single climate measurement, no matter how reliable, demonstrates global warming. These well-supported scientific models are instead accepted because massive amounts of evidence from different sources, different research teams and different methods all converge on the same general conclusions, and no other model can better explain the observed data. Reading almost any paper in molecular biology will show this, since most papers use a multitude of methods to probe the suggested model they are working with.

Section XLIV: Will there be a dawn tomorrow?

His discussion of the problem of induction is very weak, because ideas such as “there will be a dawn tomorrow” are not merely based on the observation that dawn has previously occurred. They are based on the massive amount of independently converging evidence for all the scientific models that have some implication for the issue (space-time symmetries, physical conservation laws etc.), philosophical considerations (macroscopic causality), historical necessity (life could never have evolved in a universe that was not subject to physical, chemical and biological regularity) and direct observations. If there were to be no dawn tomorrow, that would mean that the earth would stop spinning about its axis, decelerating from roughly 1670 km/h at the equator to ~0 (or at least very close to 0 in comparison with 1670 km/h). This would likely have enormous structural impacts on the world around us, from broken roads and shattered cities to displaced masses of water. As long as you do not observe this, you can be reasonably sure there will still be a dawn tomorrow.

Because of independently converging evidence from multiple sources and inference to the best explanation, we certainly do not “desperately need induction (or something like it) in science” (p. 144). In particular, science is not in the business of producing logically true statements, but of producing accurate assessments of the evidence for any given claim. Thus, Häggström is overstating the dangers of the problem of induction for the natural sciences.

This is also clear because we can always apply “induction problem”-like attacks against deduction of the kind “you say deduction works now, but how do you know deduction will continue to work in the future”, which in Häggström’s view would leave us without any possibility of knowledge, a stance he himself rejects as absurd (p. 151). It is, of course, absurd, since it is logically impossible and self-refuting to claim to have the knowledge that no knowledge exists.

Section XLV: The raven paradox declined

Häggström continues his tedious and laborious discussion of induction by trotting out the raven paradox, originally devised by Hempel in the 1940s. The claim (a) “all ravens are black” is logically equivalent to (b) “all non-black things are non-ravens”, he says, so an observation of a yellow banana is evidence that all ravens are black. If this seems like a dishonest sleight of hand, that is because it is.

Häggström spends only half a page on this issue, but refuses to discuss the large literature devoted to this apparent paradox, including the proposed solutions. He does not even bother to discuss Hempel’s (1945) own resolution, which is that it is only an apparent paradox because of our presuppositions about the world. If the claim instead was “all sodium salts burn yellow”, the observation of a non-sodium substance that does not burn yellow (such as a piece of ice held in a flame) is indeed something we should expect based on the model that all sodium salts burn yellow.

Häggström also does not consider the fact that not all observations are equally strong evidence for a position. This boils down to the fact that there are vastly more non-black non-ravens than black ravens, so observing one more of the former tells us very little about ravens. This is sometimes known as the Bayesian approach (Good, 1960), but there are many versions of this general reply.
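The Bayesian reply can be illustrated with a small calculation. The sketch below is not Good's own model; it is a simplified toy version in which H1 says that all ravens are black and H2 says that a fraction q of ravens are non-black. The value of q and the counts of ravens and non-black non-ravens are made-up assumptions chosen only for illustration. It compares how strongly each kind of observation favors H1 over H2.

```python
def bayes_factor_black_raven(q):
    """Likelihood ratio for H1 over H2 when a randomly sampled raven is black.

    Under H1 the raven is black with probability 1; under H2 with probability 1 - q.
    """
    return 1.0 / (1.0 - q)


def bayes_factor_nonblack_nonraven(q, ravens, nonblack_nonravens):
    """Likelihood ratio for H1 over H2 when a randomly sampled non-black
    object turns out to be a non-raven (for example a yellow banana).

    Under H1 every non-black object is a non-raven (probability 1).
    Under H2 there are q * ravens non-black ravens hiding among the
    non-black objects, so the probability is slightly below 1.
    """
    p_under_h2 = nonblack_nonravens / (nonblack_nonravens + q * ravens)
    return 1.0 / p_under_h2


# Hypothetical numbers, for illustration only.
Q = 0.1                                   # fraction of non-black ravens under H2
RAVENS = 1_000_000                        # assumed number of ravens
NONBLACK_NONRAVENS = 1_000_000_000_000    # assumed number of non-black non-ravens

bf_raven = bayes_factor_black_raven(Q)
bf_banana = bayes_factor_nonblack_nonraven(Q, RAVENS, NONBLACK_NONRAVENS)

print(f"Bayes factor from one black raven:   {bf_raven:.7f}")
print(f"Bayes factor from one yellow banana: {bf_banana:.7f}")
```

Under these assumptions, both observations favor H1, but the banana does so by a factor that is barely distinguishable from 1, whereas the black raven carries far more weight. The paradox dissolves once evidence is treated as a matter of degree.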

Furthermore, the two statements might not be truly equivalent as far as evidence goes. Finding a black raven refutes the contrary hypothesis “no ravens are black”, but finding a yellow banana counts against neither “all ravens are black” nor its contrary. Thus, a black raven has some properties that a non-black non-raven lacks, and so they cannot be completely equivalent (Scheffler and Goodman, 1972).

Again, this shows the deficiencies in the background research carried out by Häggström. He will probably retort that he could not cover everything in just ~280 pages, but then he should either have left out such a shallow and misleading treatment or spent the time needed to do it properly. There are no shortcuts to good reasoning.

Section XLVI: Falsifiability, corroboration and the Duhem-Quine thesis

Häggström ponders the question of whether corroboration strengthens our belief in a model on Popperian falsification (p. 147). This is undoubtedly true because a model that passes stringent and dangerous tests is more likely to be accurate than if it had failed. A person who completes a thousand triathlons is probably more likely to be fit than a person who cannot do a single one. Issues like “how do you know he will keep being fit” or “what if it is just a hallucination” are at best red herrings. Or, to put it more concisely, abduction is not induction.

He also seems to confuse practical issues with fundamental ones (pp. 147-148) when he talks about the claims “not all ravens are black” or “half of all ravens are black” as being unfalsifiable. They are not. Falsifiability is not dependent on pragmatic considerations. Even if Häggström and I cannot test, say, the claim that a certain comet has a very long orbit and is only visible in 2120, it is still falsifiable, because there are observations that, if made, would refute or at the very least strongly count against such a hypothesis. Falsifiability is rather about identifying ideas or models that are consistent with all possible observations. Such models are so vague and expansive that their scientific value is negligible or non-existent. If a model claims to explain everything, it can explain nothing.

A similar confusion occurs in his discussion of the Duhem-Quine thesis (p. 149), which basically states that you never test a scientific model in isolation, but rather the conjunction of the model with a list of auxiliary hypotheses. While this is true, it is not at all a threat to science. This is because while a proponent of a model can choose to reject an auxiliary hypothesis to retain the core hypothesis, this auxiliary hypothesis can itself be tested under circumstances that are acceptable to both proponents and opponents.

Furthermore, not all ad hoc hypotheses are equal. Some ad hoc hypotheses increase the falsifiability of the core hypothesis (because they increase the number of testable predictions made from the conjunction), whereas some decrease it or leave it constant. Thus, we can decide to accept only those ad hoc hypotheses that increase falsifiability and ignore those that do not. This is precisely what happened in the case of the anomalous orbit of Uranus: the auxiliary hypothesis of an unseen planet made new testable predictions, which were confirmed by the discovery of Neptune.

Section XLVII: Solipsism is incoherent

Häggström claims there is no way to prove the existence of an objective reality, but references various pragmatic objections to epistemic relativism (p. 151). In reality, solipsism is incoherent. This is because the only way to attempt to justify solipsism is to reference either logical arguments or empirical evidence. But since solipsists deny (or at least do not accept) the existence of an objective reality, empirical evidence is out of the question. Logic on solipsism would be on the same level as subjective beliefs, which of course cannot prove a statement. So that means that logic is out of the question as well, thereby making solipsism fundamentally unjustifiable and incoherent. Here is Thornton’s entry Solipsism and the Problem of Other Minds in the Internet Encyclopedia of Philosophy:

One might even say, solipsism is necessarily foundationless, for to make an appeal to logical rules or empirical evidence the solipsist would implicitly have to affirm the very thing that he purportedly refuses to believe: the reality of intersubjectively valid criteria and a public, extra-mental world. There is a temptation to say that solipsism is a false philosophical theory, but this is not quite strong or accurate enough. As a theory, it is incoherent.

Section XLVIII: Is speculation about intelligence explosions scientific?

Perhaps the most crucial section in this chapter, and possibly the reason why it was included, is the issue of whether speculations about an intelligence explosion can count as scientific or whether they are pseudoscience or perhaps just non-science. Häggström lists what are presumably the three most impressive pieces of evidence for an upcoming intelligence explosion (italics in original):

First of all, contemporary thinking about the nature and possible consequences of a breakthrough in artificial intelligence is not pure speculations in the sense of being isolated from empirical data. On the contrary, it is fed with data of many different kinds. Examples (some of which were touched upon in Section 4.5) include (a) the observed exponential growth of hardware performance known as Moore’s law, (b) the observation that the laws of nature have given rise to intelligent life at least once, and (c) the growing body of knowledge concerning biases in the human cognitive machinery […]

To be honest, this is a bit underwhelming.

Häggström appeals to Moore’s law as evidence for an upcoming intelligence explosion despite having substantially downplayed it earlier in the book (p. 97), suggesting that some versions of the law are already broken (footnote 246, p. 109), calling it a straw man argument leading to “pretty stupid discourse” among skeptics (footnote 250, p. 110) and even going so far as to state that it isn’t actually a prediction made by intelligence explosion thinkers (p. 109) and that people like Yudkowsky have distanced themselves from arguments based on Moore’s law (p. 109, but also see footnote 248 on the same page). Judging by the many contradictions, Häggström seems ambivalent about Moore’s law and attempts to have his cake and eat it too.

The observation that life exists is hardly evidence for an intelligence explosion within 100 years or so, because it is just as compatible with a very slow emergence of computer intelligence over centuries or with no intelligence explosion at all. As an analogy, rats exist, but that is not evidence that rats with a mass of two tons might exist in the future (in fact, basic physical considerations rule this out). The existence of cognitive biases is also not really evidence for an intelligence explosion. Quite the contrary, because such an observation might lead us to think that intelligence explosion thinkers might be suffering from these very same cognitive biases. If this is the best case Häggström can make for why intelligence explosion speculations are scientific, his position is in deep trouble.

Häggström continues with an extremely vulgar comparison between intelligence explosion speculation and climate predictions (pp. 152-153). These are not at all comparable (climate science is eons ahead of intelligence explosion thinkers in terms of evidence) and the comparison gives no support whatsoever for the supposed scientific status of the former. Häggström even admits that this is a false comparison (p. 154). A better comparison is to religious apologists predicting the end of the world within our lifetime. In fact, we shall later see why the core ideas in the speculations about an intelligence explosion are essentially Pascal’s wager.

He also seems to believe that it is up to skeptics to show that intelligence explosion speculations are unscientific (p. 154), but in reality, of course, it is up to proponents to show that they are scientific. Häggström does not accomplish this and he likely knows it, because he complains about it being a “poorly developed area” or “difficult”. Yet there are many areas that are poorly developed and difficult where it is easy to decide whether they are scientific or not. The take-home message is that the burden of evidence is on the proponents of an idea, not the skeptics, to show that it is scientific and so far, Häggström has not managed to do this.

Section XLIX: Statistical significance

I have refuted Häggström on null hypothesis significance testing (NHST) several times before and also independently discussed the problems with NHST and p values in a lot of posts. My interactions with Häggström on NHST and p values have been largely unproductive. This is mostly because Häggström has refused to engage my arguments (like he has refused to engage the arguments I presented in this article series) and resorted to various forms of name-calling.

Häggström repeats the same old tired falsehoods. He claims that statistical significance is of crucial importance to science, but it isn’t and has never been, not to mention the fact that it has been enormously damaging to science, leading to publication bias, inflated effect size estimations and the replication crisis. Häggström thinks that statistical significance is of crucial importance to all science because it is mentioned often in a Nature News item on how to interpret scientific claims. But the truth is the exact opposite. The reason statistical significance features prominently on that list is because of the many flaws and limitations of the method in question! Just read the sections “Differences and chance cause variation.”, “Bigger is usually better for sample size”, “Beware the base-rate fallacy”, “Seek replication, not pseudoreplication”, “Significance is significant” (which gives a wrong definition of a p value), “Separate no effect from non-significance” and “Effect size matters”.

Although Häggström does pay lip service to the problems with NHST, he continues to rehash the same errors he made before. He emphasizes nil null hypotheses, even though they are false and irrelevant in the vast majority of cases. He ignores “large sample size” as a third option alongside “null is false” and “unlikely occurrence” when statistical significance obtains. He peddles the same arbitrary cutoffs and ignores prior probability. He does not seem to understand that the scientific context needs to inform effect size interpretations, because a small effect size can be important in some cases, whereas a very large effect size might be of low relevance in other cases.
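The point about large sample sizes can be made concrete. The following is a minimal sketch, assuming a one-sample z-test of a nil null hypothesis (true mean of zero) with a known standard deviation of 1, and assuming for simplicity that the observed sample mean lands exactly on a tiny true effect of 0.01 standard deviations. The function name and all numbers are hypothetical choices for illustration.

```python
import math


def two_sided_p_value(observed_effect, n, sd=1.0):
    """Two-sided p value of a one-sample z-test against a nil null (mean 0).

    observed_effect: observed sample mean (assumed here to equal the true effect)
    n: sample size
    sd: known population standard deviation
    """
    z = observed_effect * math.sqrt(n) / sd
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


TINY_EFFECT = 0.01  # hypothetical, practically negligible effect size

p_small_n = two_sided_p_value(TINY_EFFECT, n=100)
p_large_n = two_sided_p_value(TINY_EFFECT, n=1_000_000)

print(f"n = 100:       p = {p_small_n:.3f}")
print(f"n = 1,000,000: p = {p_large_n:.3g}")
```

The effect size is identical in both runs. Only the sample size differs, yet the first result is nowhere near the conventional 0.05 cutoff while the second is far below it. Statistical significance by itself therefore says nothing about whether the effect is large enough to matter.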

Perhaps the dumbest claim is that decision-makers need probabilities, which ignores that effect sizes, confidence intervals, scientific context and replication provide much more reliable evidence to decision-makers than p values. He claims that the point of confidence intervals is to do NHST, but ignores their many uses outside NHST.

His errors are not limited to NHST; he makes several false claims about Bayesian statistics as well. He claims that prior probabilities are arbitrary and subjective (p. 164), but it is possible to use empirical base rates, such as HIV prevalence in a population, as priors when applying Bayesian statistics to HIV test results. This is not “subjective” or “arbitrary” and certainly not a case of “assigning equal probabilities to all parameter values” (footnote 360, p. 164). This might seem like a special case, but quantitative priors can be constructed based on empirical evidence in more complex cases as well and this is the precise opposite of subjectivity. Not to mention that NHST is itself largely subjective and arbitrary.
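As a worked illustration of an empirically grounded prior, the sketch below applies Bayes’ theorem to a positive HIV test result. The prevalence, sensitivity and specificity values are hypothetical round numbers chosen for illustration, not figures for any real test or population.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Probability of infection given a positive test, via Bayes' theorem.

    prevalence:  empirical base rate of infection in the tested population (the prior)
    sensitivity: P(positive test | infected)
    specificity: P(negative test | not infected)
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical values, for illustration only.
PREVALENCE = 0.001    # 1 in 1000 people infected
SENSITIVITY = 0.999
SPECIFICITY = 0.995

posterior = posterior_given_positive(PREVALENCE, SENSITIVITY, SPECIFICITY)
print(f"P(infected | positive test) = {posterior:.3f}")
```

With these assumed numbers, the posterior probability is about 0.167 (one in six), even though the test is highly accurate. The prior here comes from a measured base rate rather than from anyone's personal opinion, and changing the assumed prevalence changes the answer in a transparent way.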

Häggström simply has no idea of what he is talking about here. He may be a mathematical statistician, but his expertise is in percolation theory, not NHST. I am not an expert on NHST either, which is the reason why I use references to the scientific literature (in the posts referenced above).

His nonsense just goes on and on, and I have refuted all of his major claims before. If you want to read more about this, check out the posts linked in the first paragraph of this section.

I do not think that Bayesian statistics will solve the problem with NHST. This is because of the flexibility introduced by having to specify a prior probability and the added option of trying out Bayesian methods should frequentist methods “not yield anything interesting” or whatever rationalization scientists use to justify their data dredging. Instead, we are better off focusing on effect sizes, interval estimation, scientific context, replication and meta-analysis, such as the approach taken by Geoff Cumming in a recent position paper in Psychological Science.

Section L: Statistical significance is incompatible with Popperian falsificationism

A key property of pseudoscience is that it tries to parasitize real intellectual efforts and results to prop itself up. Häggström tries to prop up NHST by claiming that “it resonates well with Popperian falsificationism” (p. 155). But this is flatly untrue, as the following table illustrates. For references, see the posts linked at the start of the previous section.

| Approach | NHST | Popperian falsificationism |
| --- | --- | --- |
| Which hypothesis to test? | An extremely unlikely and irrelevant null hypothesis. | The actual research hypothesis. |
| Is it a severe test? | No, a single null hypothesis test is not a severe test because we know null hypotheses are almost always false. | Yes, falsifiability is based on using severe tests. |
| Can results be used to support the hypothesis being tested? | No, as statistical non-significance is not the same as equivalence. | Yes, passing a severe test corroborates the research hypothesis because bad hypotheses are unlikely to pass severe tests. |
| What metrics are allowed to be used? | Only p values, as NHST is based on comparing the p value to an arbitrary cutoff to decide if statistical significance has been obtained or not. | Any relevant metric, such as effect size, confidence intervals, appreciation of scientific context, meta-analyses etc. |
| Is it logically sound? | No, as a low p value does not disprove the null hypothesis (remember, something unlikely could have happened), and the alternative hypothesis might be even more unlikely. | Yes, it is based on modus tollens. |

References and further reading:

Carrier, R. (2010). Christianity Was Not Responsible For Modern Science. In J. W. Loftus (Ed.), The Christian Delusion. New York: Prometheus Books.

Cumming, G. (2014). The New Statistics: Why and How. Psychological Science. 25(1). 7-29.

Feynman, R. (1985). Surely You’re Joking, Mr. Feynman! (Adventures of a Curious Character). W. W. Norton & Company.

French, R. (1994). Ancient Natural History. New York: Routledge.

Good, I. J. (1960). The Paradox of Confirmation. The British Journal for the Philosophy of Science. 11(42). 145-149.

Hardy, G. & Totelin, L. (2015). Ancient Botany. New York: Routledge.

Hempel, C. G. (1945). Studies in the Logic of Confirmation I. Mind. 54(213). 1-26.

Keyser, P. T., & Irby-Massie, G. L. (Eds.). (2009). Encyclopedia of Ancient Natural Scientists: The Greek Tradition and its Many Heirs. New York: Routledge.

Nutton, V. (2012). Ancient Medicine. New York: Routledge.

Rihll, T. (1999). Greek Science. New York: Oxford University Press.

Scheffler, I., & Goodman, N. J. (1972). Selective Confirmation and the Ravens: A Reply to Foster. Journal of Philosophy. 69(3). 78-83.

Taub, L. (2003). Ancient Meteorology. New York: Routledge.

Thornton, S. P. (2010). Solipsism and the Problem of Other Minds. Internet Encyclopedia of Philosophy. Accessed: 2016-07-22.

Wright, R. (1995). Cosmology in Antiquity. New York: Routledge.

