Image by Matt Buck, under Attribution-ShareAlike 2.0 Generic.
Note: Snout (Reckless Endangerment) has made some good arguments in the comments on this post. The gist is that HIV/AIDS denialists overestimate the false positive rate by assuming that the initial test is all there is, when in fact it is just the beginning of the diagnostic process. Snout also points out that it is probably wrong to say that most people who get tested have been involved in some high-risk behavior, as a lot of screening goes on among, e.g., blood donors. I have made some changes (indicated by del or ins tags) in this post because I find myself convinced by the arguments Snout made.
There have already been several intuitive introductions to Bayes’ theorem posted online, so there is little point in writing another one. Instead, let us apply elementary medical statistics and Bayes’ theorem to HIV tests and explode some of the flawed myths that HIV/AIDS denialists spread in this area.
The article is separated into three parts: (1) introductory medical statistics (e.g. specificity, sensitivity and Bayes’ theorem), (2) applying Bayes’ theorem to HIV tests to find the posterior probability of HIV infection given a positive test result in certain scenarios and (3) debunking HIV/AIDS denialist myths about HIV tests by exposing their faulty assumptions about medical statistics. Readers who already grasp the basics of medical statistics can jump to the second section.
(1) Introductory medical statistics
A medical test usually returns a positive or a negative result (or sometimes an inconclusive one). Among the positive results, there are true positives and false positives. Among the negative results, there are true negatives and false negatives.
True positive: positive test result and have the disease.
False positive: positive test result and do not have the disease.
True negative: negative test result and do not have the disease.
False negative: negative test result and have the disease.
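The four outcomes above can be sketched in code as a lookup on two yes/no facts: what the test said, and whether the person actually has the disease. The function name `classify` and the `records` list are hypothetical, invented only for this illustration.

```python
from collections import Counter


def classify(test_positive: bool, has_disease: bool) -> str:
    """Name the outcome for one tested person."""
    if test_positive:
        return "true positive" if has_disease else "false positive"
    return "false negative" if has_disease else "true negative"


# Hypothetical (test result, disease status) pairs, made up for illustration.
records = [
    (True, True),
    (True, False),
    (False, False),
    (False, True),
    (False, False),
]

# Tally how many people fall into each of the four categories.
counts = Counter(classify(test, disease) for test, disease in records)
print(counts)
```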
For the purpose of this discussion, + will indicate a positive test, - will indicate a negative test, HIV will indicate having HIV and not-HIV will indicate not having HIV.
P(A) is the probability of an event A, say, the probability that a fair die will land on three. Conditional probabilities, such as P(A|B), represent the probability of event A given that event B has occurred. If A and B are statistically independent events, then P(A|B) = P(A), provided that P(B) ≠ 0 (because the definition of P(A|B) has P(B) in the denominator).
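As a small illustration of these definitions, the sketch below enumerates the six faces of a fair die and computes probabilities as exact fractions. The helper names (`prob`, `cond_prob`) and the example events are assumptions chosen for this example, not anything from the references.

```python
from fractions import Fraction

FACES = range(1, 7)  # outcomes of one roll of a fair six-sided die


def prob(event):
    """P(event): share of the equally likely faces for which event(face) holds."""
    return Fraction(sum(1 for face in FACES if event(face)), len(FACES))


def cond_prob(a, b):
    """P(a|b) = P(a and b) / P(b); undefined when P(b) is zero."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return prob(lambda face: a(face) and b(face)) / p_b


def is_three(face):
    return face == 3


def is_odd(face):
    return face % 2 == 1


def is_even(face):
    return face % 2 == 0


def at_most_four(face):
    return face <= 4


print(prob(is_three))               # 1/6
print(cond_prob(is_three, is_odd))  # 1/3: knowing the roll is odd changes the odds
# "even" and "at most four" are independent: conditioning changes nothing.
print(cond_prob(is_even, at_most_four) == prob(is_even))  # True
```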
Let us define some conditional probabilities that are relevant for HIV tests and Bayes’ theorem:
P(+|HIV) is the probability of obtaining a positive result, given that the person has HIV. This is known as the sensitivity. It is a measure of how good the test is at identifying individuals with HIV. It is the number of true positives divided by the sum of true positives and false negatives. A test that has a high sensitivity is unlikely to miss any individuals with the disease, and therefore has a low rate of false negatives.
P(-|not-HIV) is the probability of obtaining a negative test result if you do not have HIV. It is known as the specificity. It is a measure of how good the test is at identifying people who do not have HIV. It is the number of true negatives divided by the sum of true negatives and false positives. A test with a high specificity is unlikely to wrongly identify individuals without HIV as HIV+.
A high sensitivity or a high specificity alone is not enough. A crank test that always returned a positive result would have a sensitivity of 1, and one that always returned a negative result would have a specificity of 1. A valid medical test has both a high sensitivity and a high specificity.
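To make the point about crank tests concrete, here is a minimal sketch that computes sensitivity and specificity from counts. The counts describe a hypothetical group of 100 people with the disease and 900 without it; they are invented for illustration only.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positives divided by everyone who has the disease."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """True negatives divided by everyone who does not have the disease."""
    return true_neg / (true_neg + false_pos)


# Hypothetical group: 100 people with the disease, 900 without.
# A crank test that always says "positive" catches all 100 cases,
# but also flags all 900 healthy people.
always_positive = {"true_pos": 100, "false_neg": 0, "true_neg": 0, "false_pos": 900}

# A crank test that always says "negative" clears all 900 healthy people,
# but misses all 100 cases.
always_negative = {"true_pos": 0, "false_neg": 100, "true_neg": 900, "false_pos": 0}

for name, c in (("always positive", always_positive), ("always negative", always_negative)):
    sens = sensitivity(c["true_pos"], c["false_neg"])
    spec = specificity(c["true_neg"], c["false_pos"])
    print(f"{name}: sensitivity={sens}, specificity={spec}")
```

Each crank test scores a perfect 1.0 on one measure and 0.0 on the other, which is why both numbers have to be reported together.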
P(HIV) is the prior probability of having HIV. That is, how likely is it that a random person pulled from the population has HIV? This is usually taken to be the prevalence (population base rate). There are roughly 315 million people living in the U.S. and 1.2 million of them have HIV. So the prevalence is 1.2 / 315 ≈ 0.0038, which we can round off to 0.4% (CDC, 2012b).
P(HIV|+) is the posterior probability, that is, how likely it is that a given person has HIV after we have taken the base rate into account and updated it with the available evidence (i.e. the result of the HIV test). The posterior probability is also known as the positive predictive value. To calculate the positive predictive value, one only needs to know three values: the specificity, the sensitivity and the prior probability. The formula, known as Bayes’ theorem, looks like this:

$$P(\mathrm{HIV} \mid +) = \frac{P(+ \mid \mathrm{HIV})\, P(\mathrm{HIV})}{P(+ \mid \mathrm{HIV})\, P(\mathrm{HIV}) + P(+ \mid \text{not-HIV})\, P(\text{not-HIV})}$$
The numerator represents the proportion of tested people who are true positives, and the denominator represents the sum of the true positives and false positives, i.e. everyone who tests positive.
P(not-HIV) is calculated by 1 - P(HIV), as you either have or do not have HIV. P(+|not-HIV) is the false positive rate and is determined by 1 - P(-|not-HIV), as a person without HIV tests either negative (a true negative) or positive (a false positive).
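Bayes’ theorem as described here can be sketched as a small function. The function name `positive_predictive_value` and the input checks are choices made for this example; the inputs in the usage line are arbitrary round numbers, not figures for any real test.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prior: float) -> float:
    """P(HIV|+) from P(+|HIV), P(-|not-HIV) and P(HIV), via Bayes' theorem."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prior", prior)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability between 0 and 1")

    p_not_hiv = 1.0 - prior                   # P(not-HIV) = 1 - P(HIV)
    false_positive_rate = 1.0 - specificity   # P(+|not-HIV) = 1 - P(-|not-HIV)

    true_positives = sensitivity * prior               # numerator
    false_positives = false_positive_rate * p_not_hiv
    all_positives = true_positives + false_positives   # denominator
    if all_positives == 0:
        raise ValueError("a positive result is impossible with these inputs")
    return true_positives / all_positives


# Arbitrary illustrative inputs: a 90%/90% test and a 50% prior.
print(positive_predictive_value(0.9, 0.9, 0.5))
```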
(2) Applying Bayes’ theorem to HIV tests
Let us look at the sensitivities and specificities (for HIV-1) of selected rapid HIV tests (CDC, 2008).
| Name of rapid HIV test | Sensitivity (%) [95% CI] | Specificity (%) [95% CI] |
| OraQuick ADVANCE Rapid HIV-1/2 Antibody Test (oral fluid) | | |
| Uni-Gold Recombigen HIV (whole blood) | | |
| Clearview COMPLETE HIV 1/2 (serum & plasma) | | |
95% CI refers to the 95% confidence interval. A 95% confidence interval for sensitivity means that if you repeatedly took samples and calculated a confidence interval from each one, about 95% of those intervals would include the fixed population parameter (the “real” sensitivity).
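The repeated-sampling reading of a confidence interval can be checked with a small simulation. This is only a sketch under stated assumptions: it uses the Wilson score interval (the method behind the CDC figures is not given in this post), a made-up “real” sensitivity of 0.95, and a made-up sample size of 500 infected people per study.

```python
import math
import random


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


def coverage(true_p: float, n: int, repeats: int, seed: int = 0) -> float:
    """Share of simulated studies whose interval contains the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # One simulated study: n infected people, each detected with probability true_p.
        detected = sum(rng.random() < true_p for _ in range(n))
        low, high = wilson_interval(detected, n)
        hits += low <= true_p <= high
    return hits / repeats


# Hypothetical "real" sensitivity and sample size, chosen for illustration.
print(coverage(true_p=0.95, n=500, repeats=1000))
```

The printed share should land close to 0.95, which is what the “95%” in the name refers to: a statement about the procedure across many samples, not about any single interval.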
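For the purposes of our calculations, let’s use the 0.99 figure for both sensitivity and specificity. For the prior probability, let’s use 0.004 (0.4%). The posterior probability (positive predictive value) then becomes:

$$P(\mathrm{HIV} \mid +) = \frac{0.99 \times 0.004}{0.99 \times 0.004 + 0.01 \times 0.996} = \frac{0.00396}{0.01392} \approx 0.28$$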
So, in a random screen, the posterior probability (positive predictive value) of having HIV, given a positive test, is 28%. What can we make of this?
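One way to see where the 28% comes from is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical random screen of 100,000 people and uses the same inputs as the calculation above (sensitivity 0.99, specificity 0.99, prior 0.004); the population size is an arbitrary choice for illustration.

```python
population = 100_000   # hypothetical number of people screened at random
prior = 0.004          # P(HIV), the prevalence used in the text
sens = 0.99            # P(+|HIV)
spec = 0.99            # P(-|not-HIV)

infected = population * prior                   # about 400 people
not_infected = population - infected            # about 99,600 people

true_positives = infected * sens                # about 396 people
false_positives = not_infected * (1.0 - spec)   # about 996 people

# Share of all positive results that belong to infected people.
ppv = true_positives / (true_positives + false_positives)

print(f"true positives:  {true_positives:.0f}")
print(f"false positives: {false_positives:.0f}")
print(f"P(HIV|+):        {ppv:.2f}")
```

The false positives outnumber the true positives simply because the uninfected group is about 250 times larger than the infected group, so even a 1% error rate in that group produces many positives.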
(3) Debunking HIV/AIDS denialist myths about HIV tests
The general argument by HIV/AIDS denialists goes something like this: since the posterior probability, that is, P(HIV|+), is low to moderate, HIV tests must be inaccurate or unreliable. There are four general objections that can be made from science-based medicine, and they overlap to a large extent.
1. Real HIV testing is not a random screen: the calculation made above assumes that you pull a random person from the overall population and test him or her. But this is not how HIV tests are done in practice. Presumably, most of the individuals who go and get tested for HIV have been involved in some high-risk behavior, such as unprotected sex or intravenous drug use. This means that it is inappropriate to use the population prevalence as the prior probability. Instead, the prevalence in that risk group should be used. The posterior probability will then increase, as the only change being made in the formula is an increased prior probability.
2. HIV/AIDS denialists commit the fallacy of the transposed conditional. P(HIV|+) is not the same as P(+|HIV), which is the sensitivity. It is the latter, together with P(-|not-HIV), i.e. the specificity, that tells you about the accuracy (in this case validity) of the test, not P(HIV|+).
3. Low prior probability, not an intrinsic flaw of rapid HIV tests, is the main reason why the posterior probability was low. A specificity and sensitivity of over 99% mean that the test rarely gives false positives or false negatives, so the tests have high validity.
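4. The initial test is not all there is: it is just the beginning of the diagnostic process, so judging HIV testing by the initial test alone overestimates the false positive rate. This argument was made by Snout (Reckless Endangerment).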
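The third objection can be illustrated by holding the test fixed and varying only the prior. In the sketch below, the sensitivity and specificity stay at 0.99 throughout; apart from 0.004, the priors are hypothetical values picked to show the trend, not prevalence figures for any real group.

```python
def ppv(sensitivity: float, specificity: float, prior: float) -> float:
    """P(HIV|+) via Bayes' theorem."""
    true_positives = sensitivity * prior
    false_positives = (1.0 - specificity) * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Same test every time; only the prior probability changes.
# All priors except 0.004 are hypothetical, chosen for illustration.
priors = (0.001, 0.004, 0.01, 0.05, 0.2)
results = [ppv(0.99, 0.99, prior) for prior in priors]

for prior, value in zip(priors, results):
    print(f"prior = {prior:.3f}  ->  P(HIV|+) = {value:.2f}")
```

The test's own properties never change between rows, yet the posterior climbs steadily with the prior, which is why a low posterior in a random screen says little about the quality of the test.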
If we carry out the calculation with a more appropriate prior probability, say, the probability of having HIV if you belong to a risk group, the results become quite different. Let us take the risk group of men who have had sex with men and have gotten an HIV test. The prevalence among those is 19% (CDC, 2012a). This is not the same as saying that the prevalence among men who have sex with men in the U.S. overall is 19%, only that it is 19% among those who got tested in the time period the results come from. Presumably, the prevalence among those who get tested is higher than in the overall population. Using that prevalence, the positive predictive value becomes:

$$P(\mathrm{HIV} \mid +) = \frac{0.99 \times 0.19}{0.99 \times 0.19 + 0.01 \times 0.81} = \frac{0.1881}{0.1962} \approx 0.96$$
The take-home message is that if you belong to a risk group, decide to get tested and get a positive HIV test result, you are very likely to actually have HIV. The assumption of a random screen does not conform to reality that well. It is also worth pointing out that a positive HIV test is followed up with a Western blot to make sure the positive result is a true positive. If there are any inconsistencies, doctors can perform a PCR test as well. Taken together, these three tests reach a level of certainty about as high as is attainable in medical practice.
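The effect of a follow-up test can be sketched by feeding the posterior from the first test back in as the prior for the second. This is an illustration under loud assumptions: the confirmatory test is given a hypothetical sensitivity and specificity of 0.99 (these are not actual Western blot figures), and the two tests are assumed to err independently of each other given the person's true status.

```python
def update(prior: float, sensitivity: float, specificity: float) -> float:
    """Posterior P(HIV|+) after one positive result, via Bayes' theorem."""
    true_positives = sensitivity * prior
    false_positives = (1.0 - specificity) * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Step 1: random-screen prior and the rapid-test figures used in the text.
after_screen = update(prior=0.004, sensitivity=0.99, specificity=0.99)

# Step 2: the posterior becomes the new prior for the confirmatory test.
# HYPOTHETICAL figures for the second test; assumes its errors are
# independent of the first test's errors.
after_confirmation = update(prior=after_screen, sensitivity=0.99, specificity=0.99)

print(f"after the screening test:    {after_screen:.2f}")
print(f"after the confirmatory test: {after_confirmation:.2f}")
```

Under these assumptions, even the pessimistic random-screen starting point ends up well above 0.9 after a second positive result, which is the sense in which the initial test is only the beginning of the diagnostic process.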
To be fair, I have not actually seen any HIV/AIDS denialist make the argument using Bayes’ theorem. In fact, I doubt that most HIV/AIDS denialists online are that familiar with medical statistics in the first place. Most of the time, they just pull the posterior probability from a random screen and present it triumphantly, as if it meant that rapid HIV tests are unreliable or inaccurate. It does not. Now you know why.
References and further reading
CDC. (2008). FDA-Approved Rapid HIV Antibody Screening Tests. Accessed: 2012-08-16.
CDC. (2012a). HIV among Gay and Bisexual Men. Accessed: 2012-08-16.
CDC. (2012b). HIV in the United States: At A Glance. Accessed: 2012-08-16.
Altman, D. G. (1999). Practical Statistics For Medical Researchers. New York: Chapman & Hall/CRC, pp. 409-416.