Finding Outliers in Outliers?
While presenting at the Dutch Chemometrics Society annual meeting in late May this year, I heard a talk by Klaas Faber on the “Athlete Biological Passport”, with a particular focus on the Pechstein case. Now that the Swiss court has finally confirmed the ruling, the topic has come up again. Faber, who acted as an expert for Pechstein, speaks of “torturing the data until they confess”.
Regardless of Pechstein being guilty or not, there are some problems with the passport from the statistical point of view.
The two major problems with the passport which I remember from Faber’s talk are:
- The sample from which the confidence intervals are created is based on ordinary people, or at least average sportsmen. This is certainly because we need a large enough sample, but is it representative of the few top athletes, doped or not?
- Assuming the confidence intervals are created for sportsmen who did not use illegal methods to enhance their performance, we know a priori that a (100-x)% interval will mistakenly convict x% of clean sportsmen.
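The second point is plain arithmetic, and a small sketch makes it concrete. The function names and all numbers below (1,000 clean athletes, x = 1, 20 tests) are made up for illustration and do not come from Faber's talk. The second function also assumes that repeated tests on one athlete are independent, which is a simplification.

```python
def expected_false_positives(n_clean, x_percent):
    """Expected number of clean athletes falling outside a (100 - x)% interval."""
    return n_clean * x_percent / 100.0


def prob_flagged_at_least_once(x_percent, n_tests):
    """Chance that a clean athlete is flagged at least once in n_tests tests.

    Assumes the tests are independent (an illustrative simplification).
    """
    return 1.0 - (1.0 - x_percent / 100.0) ** n_tests


# Hypothetical numbers, chosen only for illustration.
n_clean = 1000
x = 1.0  # i.e. a 99% interval
print(expected_false_positives(n_clean, x))
print(prob_flagged_at_least_once(x, n_tests=20))
```

With a 99% interval, about 10 of 1,000 clean athletes are flagged on a single test. Under the independence assumption, a clean athlete tested 20 times has a chance of roughly 18% of being flagged at least once.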
My problem with these points, and not only as a statistician, is that statistics is used to convict someone merely because his or her biological measurements fall outside the limits, without any proof of a causal connection between these values and doping.
Let me finish with a simple example that illustrates the dilemma: a sample of size 100,000 drawn from a normal distribution. Plotted in a boxplot, we get the following:
What is marked as “outlier” by the boxplot is, for most cases, not any different from the adjacent values at the whiskers. Getting more and more to the fringes, we might find that some values really “look like” outliers. For this sample we would “convict” 763 cases according to the boxplot definition, although all come from a “perfect” normal distribution.
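The count can be reproduced approximately with a short simulation. This is a minimal sketch, not the code behind the plot above: it assumes the usual boxplot rule (fences at 1.5 times the interquartile range beyond the quartiles), a standard normal distribution, and an arbitrary seed. The exact count therefore varies with the seed and with the quartile definition used.

```python
import random
import statistics


def boxplot_fences(values, whisker=1.5):
    """Return the lower and upper fences of the usual boxplot outlier rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def count_boxplot_outliers(values, whisker=1.5):
    """Count the values lying outside the boxplot fences."""
    lower, upper = boxplot_fences(values, whisker)
    return sum(1 for v in values if v < lower or v > upper)


# Simulated sample; the seed is arbitrary and chosen only for repeatability.
rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
n_flagged = count_boxplot_outliers(sample)
print(f"{n_flagged} of {len(sample)} values flagged")

# Theoretical share outside the fences for an exact normal distribution.
std_normal = statistics.NormalDist()
q3 = std_normal.inv_cdf(0.75)
upper_fence = q3 + 1.5 * (2 * q3)  # IQR = 2 * q3 by symmetry
expected_share = 2 * (1 - std_normal.cdf(upper_fence))
print(f"expected share: {expected_share:.4%}")
```

For an exact normal distribution the rule flags roughly 0.7% of all values, so several hundred “outliers” in a sample of 100,000 are to be expected even though nothing unusual is going on.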
In the end, much seems to be determined by the credibility of the different sports associations. Much points to a doping case for Contador, but the UCI seems to be covering for him, whereas the ISU did the opposite for Pechstein.
Martin,
Pechstein was not convicted under the guidelines issued by the World Anti-Doping Agency (WADA), but by an even more arbitrary kind of ruling.
However, ‘doing the wrong thing right’ would have been decisive in that case.
For certain (but not all) flaws underlying WADA’s guidelines, see:
N.M. Faber and B.G.M. Vandeginste, Flawed science ‘legalized’ in the fight against doping: the example of the biological passport, Accreditation and Quality Assurance, 15 (2010) 373-374.
It helps to keep in mind that Sottas, the person who developed the biological passport for WADA, is neither a statistician nor a forensic scientist.
For people who can read Dutch, the following article may be of interest:
N.M. Faber, Het biologisch paspoort: veelbelovende opsporingstechniek of juridisch wankel?, Tijdschrift voor Sport & Recht, (2010) 2: 79-89