Friday, 27 March 2020

Testing for Covid-19 Antibodies - a Bayesian Problem

Explaining the problem of reliability in any form of testing takes us into the world of Bayesian statistics, conditional probabilities and false positives.  As we look for a viable antibody test for the coronavirus we come up against a problem: to assess its reliability you need to know the underlying base rates.  Suppose we take a group of people we know have had the virus and, when checking a candidate new test on them, it turns out to be 100 percent accurate in confirming the presence of antibodies.  Is that the end of the story?  It is only if the test never generates a positive outcome with people who have not had the virus.  There can be a number of reasons why it might: maybe, for example, it isn't specific enough to this particular coronavirus, remembering that four other coronaviruses already give us seasonal colds which we brush off with a few doses of Night Nurse.

So, this is where we start: what proportion of the underlying population has had the bug?  Let us assume 10%.  This is our base rate: the probability, before any test, that a given person has had the bug.  So if someone asked you out of the blue what the chance is that you have had the bug, you would answer 10%, i.e., Prob(Bug) = 0.1

Next step: what is the probability that the test will show a positive if you have had the bug?  That's easy: we are assuming it picks up all cases where people have had the bug, so Prob(Pos/Bug) = 1.0, which we can read as the probability of a positive result given you have had the bug.  However, this isn't what you want to know.  What you want to know is the chance that you have had the bug if the test shows positive, i.e., Prob(Bug/Pos).

Now we have a problem because the test can give a positive result even when you haven't had the bug.  This is the problem of false positives.  Let us assume the test throws up a false positive 5% of the time when there are no specific antibodies, i.e., Prob(Pos/NoBug) = 0.05.  So now what we need is the probability of a positive result if you take the test.  Well it's not just 10%, it's going to be 1.0 x 0.1 + 0.05 x 0.9 = 0.1 + 0.045, so Prob(Pos) = 0.145.  If you take the test there is a 14.5% chance it will give a positive result, and 4.5 of those 14.5 percentage points are false positives.

So now, how do we reverse the conditional? The Prob(Pos/Bug) of 1.0 obviously isn't the same as Prob(Bug/Pos).  By Bayes' theorem the two differ by the ratio of Prob(Bug) = 0.1 to Prob(Pos) = 0.145 or, if you prefer, the ratio of the true underlying likelihood of you having had the bug without the test to the likelihood of the test saying you have had the bug: Prob(Bug/Pos) = Prob(Pos/Bug) x Prob(Bug) / Prob(Pos) = 1.0 x 0.1 / 0.145.  The answer is that Prob(Bug/Pos) = 0.6897 = 69.0%

So, to assess the reliability of an antibody test you need to know: (i) the likelihood that the test will show positive if the antibody is present, (ii) the error rate of the test when applied to people who have not had the virus (this gives the probability of false positives) and (iii) the proportion of people in the population who have had the bug and have the antibodies.  (i) is easy to discover by testing people known to have had the virus.  For (ii) you need an isolated community of hermits who you are certain are, and always have been, bug free.  For (iii) you need to know the base rate, i.e., the proportion of the overall population who have been infected.  What is sure is that you cannot estimate (iii) from the test results unless you are sure you know the answer to (ii).

So do we have a good test?  Well, not good enough: all it has done is shift your prior expectation from one in ten to about seven in ten, which still leaves roughly a three in ten chance that a positive result is wrong.  And that figure leans heavily on the assumed base rate: if only 1% of the population had had the bug, the same positive result would mean only about a one in six chance that you have had it.  Would you be prepared to go out and mingle on the basis of a test that left that much doubt, especially when you don't know what the underlying level of infection happens to be?
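
For anyone who wants to check the arithmetic, the reversal of the conditional can be sketched in a few lines of Python.  This is only an illustration: the function name prob_bug_given_pos and the base rates other than 10% are my own assumptions, while the 100% detection rate and the 5% false positive rate are the figures assumed in this post.

```python
def prob_bug_given_pos(base_rate, sensitivity=1.0, false_pos_rate=0.05):
    """Bayes' theorem: Prob(Bug/Pos) = Prob(Pos/Bug) x Prob(Bug) / Prob(Pos)."""
    # Prob(Pos) = true positives among the infected + false positives among the rest
    prob_pos = sensitivity * base_rate + false_pos_rate * (1.0 - base_rate)
    return sensitivity * base_rate / prob_pos


if __name__ == "__main__":
    # Only the 0.10 base rate comes from the post; the others are illustrative assumptions
    for base_rate in (0.01, 0.05, 0.10, 0.30):
        posterior = prob_bug_given_pos(base_rate)
        print(f"Prob(Bug) = {base_rate:.2f} -> Prob(Bug/Pos) = {posterior:.3f}")
```

With a 10% base rate this gives 0.1 / 0.145, or about 0.69.  With a 1% base rate the same test gives only about 0.17, so an identical positive result means very different things depending on a base rate you may not know.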

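The point that you cannot estimate (iii) from the test results without knowing (ii) can be sketched the same way.  If a fraction of the tests come back positive, the base rate that this implies depends entirely on the false positive rate you plug in.  The function name implied_base_rate and the candidate false positive rates other than 5% are assumptions for illustration; the formula simply solves Prob(Pos) = Prob(Pos/Bug) x Prob(Bug) + Prob(Pos/NoBug) x (1 - Prob(Bug)) for Prob(Bug).

```python
def implied_base_rate(observed_pos_rate, false_pos_rate, sensitivity=1.0):
    """Solve Prob(Pos) = sensitivity*p + false_pos_rate*(1 - p) for p = Prob(Bug)."""
    p = (observed_pos_rate - false_pos_rate) / (sensitivity - false_pos_rate)
    # Clamp to [0, 1]: a wrong error rate or sampling noise can push p outside that range
    return min(1.0, max(0.0, p))


if __name__ == "__main__":
    # 0.145 = 1.0 x 0.1 + 0.05 x 0.9, the positive rate implied by a 10% base rate
    # with 100% detection and 5% false positives
    observed = 0.145
    # Only the 0.05 false positive rate comes from the post; the others are assumptions
    for fpr in (0.0, 0.02, 0.05, 0.10):
        estimate = implied_base_rate(observed, fpr)
        print(f"false positive rate {fpr:.2f} -> implied Prob(Bug) = {estimate:.3f}")
```

The same 14.5% of positive tests implies a base rate anywhere from 14.5% (no false positives) down to 5% (a 10% false positive rate), which is why (ii) has to be pinned down first.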
