Interestingly enough, after last week’s post, there is a brilliant article in the BBC magazine about doctors and their understanding of statistics.
Gerd Gigerenzer is one of those names in statistics I trust. His discussion of risk is fascinating. Take the example mentioned in the article:
As a doctor, you know the following facts to be true:
- The probability that a woman has breast cancer is 1% (“prevalence”)
- If a woman has breast cancer, the probability that she tests positive is 90% (“sensitivity”)
- If a woman does not have breast cancer, the probability that she nevertheless tests positive is 9% (“false alarm rate”)
When a 50 year old female patient, who has no other symptoms of breast cancer, has a routine mammogram, she tests positive. Alarmed, she asks you what her risk is? Which of the following is the best answer?
- nine in 10
- eight in 10
- one in 10
- one in 100
If, like me, you read this at lunch with a box of strawberries and one eye on your MOOC numbers, you probably said ‘nine in ten’. In fact the answer is ‘one in ten’. Why is this the case?
Well, first remember that if there are a hundred random women in a room, the prevalence of the disease in the population suggests that one of them will have breast cancer. Second, remember that if we test the same hundred women, we can expect about nine of the 99 women who don’t have the disease to test positive anyway (9% of 99 is roughly nine), while the woman who does have the disease has a 90% chance of testing positive (meaning that it’s possible she won’t test positive).
So with no other symptoms to go on, and remembering that we would expect about 10 of our hundred random women to test positive (one because she does have cancer and the other nine because they get false positives), the best estimate of the chance that this patient has cancer is roughly one in ten. She might be the true positive. But about nine times out of ten she’s a false positive.
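For anyone who would rather check the arithmetic than trust the room full of imaginary women, here is a minimal Python sketch of the same calculation using Bayes’ theorem. The function name and its parameter names are my own, chosen for illustration; the three numbers are the ones from the question above.

```python
def posterior_given_positive(prevalence, sensitivity, false_alarm_rate):
    """Probability of having the disease given a positive test (Bayes' theorem)."""
    # Share of all women who have the disease AND test positive.
    true_positives = prevalence * sensitivity
    # Share of all women who do NOT have the disease but test positive anyway.
    false_positives = (1 - prevalence) * false_alarm_rate
    # Of everyone who tests positive, what fraction actually has the disease?
    return true_positives / (true_positives + false_positives)


# Figures from the question: 1% prevalence, 90% sensitivity, 9% false alarm rate.
risk = posterior_given_positive(prevalence=0.01, sensitivity=0.90, false_alarm_rate=0.09)
print(f"P(cancer | positive test) = {risk:.3f}")  # prints 0.092
```

The exact figure is a little over 9%, which is why ‘one in ten’ is the best of the four answers on offer rather than a precise value.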
It’s an excellent teaching opportunity and the maths makes sense when you think about it, but keeping the populations separate in your head is what makes it difficult.
In other news, I picked up Andy Field’s ‘Discovering Statistics Using R’ and I’m really enjoying it so far.