There are a bunch of videos on the internet that attempt to explain what I will call the "Diagnostic Test Fallacy" via Bayesian math. The explanations are helpful, but I find that they don't stick in my information-addled brain. The other day, I tried desperately to reassure a friend who had gotten a harrowing result on a blood test that screens for an extremely rare disease. The experience of trying to reassure a friend in dire need of reassurance helped me distill the following:

For a test that is 99% accurate, it sounds like the test only fails 1% of the time. Intuitively, you would think that a positive test result means you have a 99% chance of having the dreaded rare condition or disease. If an HIV test is 99% accurate and you test positive, well you probably have a 99% chance of having HIV, no? If your baby tests positive for spina bifida, and the test is 95% accurate, you might as well assume that your baby is doomed, right?

Wrong. Intuitive, yes; correct, no. The probability of having some disease and the probability of testing positive for it are not the same, because test results play out across whole populations, and the base rate of a rare disease in a population is, by definition, tiny. The chances of having these conditions are very slim. So how do we square the test's accuracy with the actual probability of having the disease?

False positives must be drawn from the population of true negatives, right? Think about it: a false positive is someone who should have tested negative. And the true-negative group is massive. The vast majority of people do not have any given rare disease or condition, by definition. So the 1% of tests that are wrong is one percent of a huge number.

Let us set up a quick and dirty equation, with very limited math, which illustrates the point. What is the chance of actually having the disease, given a positive test result? Why, the logic of chance dictates that the probability is the number of true positives divided by the total number of positive results: true positives in the numerator, and every way of testing positive in the denominator. Remember that the true-negative (TN) group is massive (almost the whole population), and that the true-positive (TP) number is tiny (it's 99% of a sliver):

P(disease | positive) = TP / (TP + FP)

where FP, the false positives, are 1% of the enormous true-negative group.
Without doing any math, one can readily see that the numerator will always be tiny and the denominator will always be large: 99% of a sliver of sick people is much smaller than 1% of the hundreds of thousands of healthy ones. Because the population of (true) negatives is so large, the false positives, being 1% of them, are an extremely common error type. A whole lotta people get false positives, even for a relatively accurate test. So if you get bad news, don't freak out! Just get a follow-up test!
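
To make this concrete, here is a minimal sketch of the arithmetic. All the numbers are made up for illustration (a population of one million, a 1-in-10,000 disease, a test that is right 99% of the time in both directions); plug in your own:

```python
# Hypothetical numbers, chosen only to illustrate the base-rate effect.
population = 1_000_000   # people tested
prevalence = 0.0001      # 1 in 10,000 actually have the disease
accuracy = 0.99          # test is correct 99% of the time, both directions

have_disease = population * prevalence      # 100 people are actually sick
healthy = population - have_disease         # 999,900 people are not

true_positives = have_disease * accuracy    # 99% of a sliver
false_positives = healthy * (1 - accuracy)  # 1% of a huge number

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(f"P(disease | positive) = {p_disease_given_positive:.3f}")  # roughly 1%, not 99%
```

Even with a 99%-accurate test, the false positives (1% of 999,900 people) swamp the true positives (99% of 100 people), so a positive result here means only about a 1% chance of actually being sick.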