Friday, February 25, 2011

Is Spectrum Bias a Problem for Error Statistics?

A phenomenon called spectrum bias might help my argument that advocates of error statistics should take the positive predictive value (PPV) and negative predictive value (NPV) of their tests seriously. Spectrum bias is typically discussed in the context of medical diagnostic tests. Such tests are characterized by their sensitivity and specificity, where a test’s sensitivity is the probability that it yields a positive result if the condition in question is present, and its specificity is the probability that it yields a negative result if the condition in question is absent. PPV is the probability that the condition is present given a positive result, and NPV is the probability that it is absent given a negative result. PPV and NPV are more clinically relevant than sensitivity and specificity. However, sensitivity and specificity are more popular measures of a test’s performance because, unlike PPV and NPV, they are generally taken to be intrinsic properties of the test, independent of the prevalence of the condition in the population.
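
To make the dependence on prevalence concrete, here is a minimal sketch that computes PPV and NPV from sensitivity, specificity, and prevalence using Bayes' theorem. The function name `predictive_values` and the numbers (a test with 90% sensitivity and 90% specificity, applied at prevalences of 50% and 1%) are illustrative assumptions, not figures from any study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a dichotomous test via Bayes' theorem.

    All three arguments are probabilities in [0, 1].
    """
    # Joint probabilities of each (condition, result) combination.
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test, two hypothetical populations.
for prevalence in (0.5, 0.01):
    ppv, npv = predictive_values(0.9, 0.9, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, the PPV falls from 0.9 at 50% prevalence to about 0.08 at 1% prevalence, which is why PPV and NPV are not regarded as intrinsic properties of a test.
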
Spectrum bias is the phenomenon that sensitivity and specificity are not, in fact, intrinsic properties of medical tests. Like PPV and NPV, they vary when a test is applied to different populations. Both theoretical and empirical studies support the claim that spectrum bias exists. At least one study I have looked at purports to show that sensitivity and specificity vary with features of the population almost as much as PPV and NPV do. At least part of the explanation for this phenomenon is that medical conditions are typically not truly dichotomous; they can be present to varying extents. Misclassification is more likely for individuals who are close to the classification cutoff. As a result, sensitivity and specificity are lower for populations in which many individuals are close to the cutoff than for populations in which few are.
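
The cutoff explanation can be illustrated with a toy model. The sketch below is not taken from any of the studies mentioned; it assumes that the condition has a continuous true severity, that the test measures it with normally distributed noise, and that both the true classification and the test result use the same cutoff. The function names, the cutoff of 0, the noise standard deviation of 1, and the two small populations are all hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sensitivity_specificity(true_values, cutoff=0.0, noise_sd=1.0):
    """Expected sensitivity and specificity of a noisy test in one population.

    An individual has the condition if their true value exceeds `cutoff`;
    the test is positive if true value plus N(0, noise_sd) noise exceeds it.
    """
    diseased = [x for x in true_values if x > cutoff]
    healthy = [x for x in true_values if x <= cutoff]
    # P(positive | true value x) = Phi((x - cutoff) / noise_sd)
    sens = sum(normal_cdf((x - cutoff) / noise_sd) for x in diseased) / len(diseased)
    # P(negative | true value x) = Phi((cutoff - x) / noise_sd)
    spec = sum(normal_cdf((cutoff - x) / noise_sd) for x in healthy) / len(healthy)
    return sens, spec


# Two hypothetical populations with the same prevalence (50%).
far_from_cutoff = [-3.0, -2.5, 2.5, 3.0]
near_cutoff = [-0.5, -0.25, 0.25, 0.5]

sens_far, spec_far = sensitivity_specificity(far_from_cutoff)
sens_near, spec_near = sensitivity_specificity(near_cutoff)
print(f"far from cutoff: sensitivity={sens_far:.3f} specificity={spec_far:.3f}")
print(f"near cutoff:     sensitivity={sens_near:.3f} specificity={spec_near:.3f}")
```

The test and the prevalence are identical in both populations, yet under these assumptions sensitivity and specificity are both markedly lower in the population whose members sit close to the cutoff.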

If spectrum bias afflicts error statistical tests generally, then an advocate of error statistics cannot deny the relevance of PPV and NPV on the grounds that they are not intrinsic properties of tests without also impugning their preferred error rates α and β, which are the analogues of one minus specificity and one minus sensitivity, respectively.
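
One way to spell out the parallel, as a sketch under assumptions the argument above does not itself state: treat rejection of the null hypothesis as a positive result, so that 1 − β plays the role of sensitivity and 1 − α the role of specificity, and let π (a symbol introduced here for illustration) be the proportion of tested hypotheses for which the alternative is true. Bayes' theorem then gives:

```latex
\mathrm{PPV} = \frac{(1-\beta)\,\pi}{(1-\beta)\,\pi + \alpha\,(1-\pi)},
\qquad
\mathrm{NPV} = \frac{(1-\alpha)\,(1-\pi)}{(1-\alpha)\,(1-\pi) + \beta\,\pi}
```

On this reading, π is the only population-dependent quantity in the idealized case. Spectrum bias would mean that α and β depend on the population as well.
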
I need to find out more about spectrum bias and its prevalence and severity before I can be confident that this argument is a good one. However, it does seem promising and is not likely to have been considered before within the philosophy of science, where spectrum bias seems to be largely unknown.

1 comment:

  1. I think this is a very important project. But why stop at pointing out the importance of PPV and NPV for Frequentists? If you can calculate PPV and NPV then you can do a full Bayesian analysis, no? I'm only really familiar with the context of medical tests; it certainly seems true there that people calculate PPV and NPV using Bayesian priors (which in that case are very easy to be sure about).