Bayes' theorem: a strange example


Posted by Diego Assencio on 2013.10.13 under Mathematics (Statistics and probability)

Let $A$ and $B$ be two events and let $P(A|B)$ be the conditional probability of $A$ given that $B$ has occurred. Then Bayes' theorem states that: $$ P(B|A) = \displaystyle\frac{P(A|B)P(B)}{P(A|B)P(B)+ P(A|B^c)P(B^c)} $$

In other words, Bayes' theorem gives us the conditional probability of $B$ given that $A$ has occurred as long as we know $P(A|B)$, $P(A|B^c)$ and $P(B) = 1 - P(B^c)$.
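The denominator in the formula above is just $P(A)$ written out using the law of total probability. As a short sketch of why this holds, assuming $P(A) > 0$ and $0 < P(B) < 1$ so that every conditional probability involved is well defined: $$ P(B|A) = \displaystyle\frac{P(A \cap B)}{P(A)} = \displaystyle\frac{P(A|B)P(B)}{P(A)} \quad\quad\quad P(A) = P(A \cap B) + P(A \cap B^c) = P(A|B)P(B) + P(A|B^c)P(B^c) $$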

Assume now we have a laboratory test designed to detect if a subject has a given disease. We need to define three events:

$D$: the subject has the disease
$+$: the laboratory test result for a given subject is positive
$-$: the laboratory test result for a given subject is negative

Notice that a test result can be positive for a subject who does not have the disease (false positive) and also negative for a subject who has the disease (false negative). To clarify this point, consider the conditional probabilities below:

$P(+|D)$: probability that the test result for a subject who has the disease is positive
$P(-|D^c)$: probability that the test result for a subject who does not have the disease is negative

These quantities are called the sensitivity and the specificity of the test, respectively. Ideally we would have $P(+|D) = P(-|D^c) = 1.0$ ($100\%$), meaning the result of the test would always correspond to the actual state of the subject (sick or not sick). For real-world tests, however, this is often not the case.

Now assume that the following is true for our laboratory test: $$ P(+|D) = 0.997 \quad\quad\quad P(-|D^c) = 0.985 \quad\quad\quad P(D) = 0.001 \label{post_b30e25e3b6a86586342729a19cfaf299_numbers} $$

The test result for a subject who has the disease is then positive with probability $99.7\%$ while the test result for a subject who does not have the disease is negative with probability $98.5\%$. Finally, the probability that a subject chosen randomly from the population has the disease is $0.1\%$.

Now here is where things get strange. The numbers above give us the feeling that the test results are almost always correct, but from Bayes' theorem and using the fact that $P(+|D^c) = 1 - P(-|D^c)$ (for a proof of this equation, see this post), we have: $$ \begin{eqnarray} P(D|+) & = & \displaystyle\frac{P(+|D)P(D)}{P(+|D)P(D)+ P(+|D^c)P(D^c)} \nonumber \\[5pt] & = & \displaystyle\frac{P(+|D)P(D)}{P(+|D)P(D)+ [1 - P(-|D^c)][1 - P(D)]}, \end{eqnarray} $$ which yields, after plugging in the numbers given in equation \eqref{post_b30e25e3b6a86586342729a19cfaf299_numbers}: $$ P(D|+) = \displaystyle\frac{0.997\times 0.001}{0.997 \times 0.001 + 0.015 \times 0.999} \approx 0.062 $$
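The same calculation can be checked numerically. The snippet below is only a minimal sketch: the function name `posterior_positive` and its parameter names are made up for this illustration, and the inputs are the sensitivity, specificity and prevalence given for our laboratory test.

```python
def posterior_positive(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Return P(D|+) given P(+|D), P(-|D^c) and P(D), using Bayes' theorem."""
    true_pos = sensitivity * prevalence                    # P(+|D) P(D)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(+|D^c) P(D^c)
    return true_pos / (true_pos + false_pos)


# Values from the text: P(+|D) = 0.997, P(-|D^c) = 0.985, P(D) = 0.001
p = posterior_positive(0.997, 0.985, 0.001)
print(f"P(D|+) = {p:.3f}")  # prints "P(D|+) = 0.062"
```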

Notice that, in spite of the fact that the test is very accurate, if we do the test on a randomly chosen subject, a positive result will be correct in only about $6.2\%$ of the cases. How can that be?!

The problem here is that the fraction of the population which actually has the disease is very small ($0.1\%$). To clarify, assume we have a very large population and randomly choose $N = 1000000$ individuals from this population to take the test. Let $S$ be the set of chosen subjects who have the disease ("sick") and let $H$ be the set of chosen subjects who do not have it ("healthy"). We can expect that the following will be approximately true:

  • $|S| := N\times P(D) = 1000\,$ is the number of tested subjects who have the disease
  • $|H| := N \times P(D^c) = 999000\,$ is the number of tested subjects who do not have it

The number of positive test results among the subjects in $S$ is then: $$ R^+_S := |S|\times P(+|D) = 997 $$ and the number of positive test results among the subjects in $H$ is $$ R^+_H := |H|\times P(+|D^c) = 14985 $$

So $R^+_T := R^+_H + R^+_S = 15982$ is the total number of performed tests which will come back positive. However, among the subjects who test positive, the fraction who actually have the disease is: $$ \displaystyle\frac{R^+_S}{R^+_T} = \displaystyle\frac{R^+_S}{R^+_S + R^+_H} = \displaystyle\frac{997}{15982} \approx 0.062 $$
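The head count above can be reproduced with exact arithmetic. This is a sketch only: the variable names are chosen for this illustration, and `Fraction` is used simply to avoid floating-point rounding when forming the expected counts.

```python
from fractions import Fraction

N = 1_000_000                      # number of tested subjects
sensitivity = Fraction("0.997")    # P(+|D)
specificity = Fraction("0.985")    # P(-|D^c)
prevalence = Fraction("0.001")     # P(D)

sick = int(N * prevalence)                              # |S|
healthy = N - sick                                      # |H|
true_positives = int(sick * sensitivity)                # R+_S
false_positives = int(healthy * (1 - specificity))      # R+_H
total_positives = true_positives + false_positives      # R+_T

print(sick, healthy)                       # 1000 999000
print(true_positives, false_positives)     # 997 14985
print(total_positives)                     # 15982
print(f"{true_positives / total_positives:.3f}")  # 0.062
```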

To summarize, even though the test comes back positive for $99.7\%$ of the subjects who have the disease, the number $|H|$ of tested subjects who do not have the disease is so much larger than the number $|S|$ of tested subjects who have it that the false positives from the former set ($R^+_H = 14985$) greatly outnumber the true positives from the latter set ($R^+_S = 997$). This low probability that a positive result is correct is what makes the test inappropriate for testing whether a randomly chosen subject has the disease or not.
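To see how strongly the result depends on $P(D)$, we can evaluate $P(D|+)$ for a few values of $P(D)$ while keeping the sensitivity and specificity fixed. This is a sketch for illustration only: apart from $0.001$, the values of $P(D)$ below are hypothetical and are not properties of the test discussed above.

```python
SENSITIVITY = 0.997   # P(+|D), from the text
SPECIFICITY = 0.985   # P(-|D^c), from the text


def posterior_positive(prevalence: float) -> float:
    """Return P(D|+) for the given P(D), with sensitivity and specificity fixed."""
    true_pos = SENSITIVITY * prevalence
    false_pos = (1.0 - SPECIFICITY) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Only 0.001 comes from the text; the other values are hypothetical.
prevalences = [0.001, 0.01, 0.1, 0.5]
posteriors = [posterior_positive(p) for p in prevalences]

for p, post in zip(prevalences, posteriors):
    print(f"P(D) = {p:<5}  ->  P(D|+) = {post:.3f}")
```

With the other two numbers held fixed, $P(D|+)$ grows as $P(D)$ grows, which is consistent with the explanation above: it is the rarity of the disease, not a flaw in the test itself, that makes positive results unreliable.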
