Last month at the French Embassy in Washington, DC, I participated in a day-long session on the regulation of female athletes, following the recent decision by the Court of Arbitration for Sport to uphold the IAAF regulations targeted at Caster Semenya and others. During the session, Stéphane Bermon of the IAAF, a chief architect of the regulations, relied heavily on a new literature review in the journal Clinical Endocrinology, Clark et al. 2019, written by a group of scientists associated with the U.S. Anti-Doping Agency, to argue that the testosterone levels of males and females do not overlap.
It turns out that Clark et al. 2019 is fatally flawed. I have been in touch with Clark et al. and the editor of the journal for the past six weeks. Read on for the details, which I am publicizing for the first time today.
In DC, Bermon showed the figure below from Clark et al. 2019 to argue that there is no overlap between “healthy” male and female testosterone levels. Bermon further argued that when you look at 46 XY DSD individuals (categorized by Clark et al. in terms of the specific conditions 5ARD2 and PAIS-CAIS, as shown in the figure), they overlap with “healthy” males but not “healthy” females. (I put “healthy” in scare quotes because 46 XX and XY DSD individuals are also perfectly healthy, but that’s a topic for another time.) Thus, Bermon argued explicitly and Clark et al. implied, we should consider 46 XY DSD individuals to be males and 46 XX DSD individuals to be females. Science has spoken!
Take a close look at the figure. It shows what Bermon says it shows, but I noticed at the time that it also shows something odd. I have annotated the figure below to highlight the one point that, unlike all of the others, has no associated distribution of values. That is very odd, I thought.
So, being a nerdy academic, I decided to replicate Clark et al.’s literature review for the 5ARD2 XY category from the original papers and data. When I did so, I found multiple sloppy errors and one major error that is fatal to the argument made by Bermon and IAAF, and to the conclusions of Bermon et al.
I immediately contacted Clark et al. and the journal and then formally submitted a correspondence to the journal pointing out the errors. The correspondence has since been rejected, for reasons unknown to me. The editor tells me that the journal will soon be posting an error correction to the paper, but I do not know what it will say. Let’s hope for the best. Errors happen. What matters is what the scientific community does once they are found.
But, cutting to the chase, here is what the figure of Clark et al. looks like when the distribution of values is added to that naked and lonely data point — the distribution that somehow did not make it into the original paper.
Adding the actual range of testosterone levels that Maimoun et al. report for 46 XY DSD individuals, as characterized in Clark et al., completely undercuts the conclusions of Clark et al., because it shows that 46 XY DSD 5ARD2 individuals overlap with both “healthy” males and “healthy” females.
This is a big deal.
You can read my (rejected) correspondence to the journal here in PDF. I’ve also pasted it in below. I’ll update as this develops.
12 June 2019
Roger Pielke, Jr.
University of Colorado Boulder
Clark et al. (2019) includes a number of mistakes, one of which undercuts a core conclusion of the paper. Specifically, in Table 2 Clark et al. report testosterone levels for 46 XY individuals with 5‐alpha reductase deficiency, type 2 (5ARD2). A re-review of the literature cited in this table has identified 3 errors (data reported elsewhere in Clark et al. has not been re-assessed):
- Shabir et al. (2015) report a mean, not a median;
- In Veiga-Junior et al. (2012), the median T level is 24.1, not 13.4;
- Most importantly and materially, with respect to Maimoun et al. (2011), a reanalysis indicates different values than are reported in Clark et al.’s Table 2:
  - Number of patients aged 14 to 23 = 9 (in contrast, Clark et al. report 17 patients; note that there are two additional patients, aged 24 and 26, outside the Clark et al. age range)
  - Average T = 18.6 (Clark et al. report 17.5)
  - Median T = 21.1 (not reported by Clark et al.)
  - Range = 0.8 to 32.2 (not reported by Clark et al.)
Clark et al.’s failure to report the range of testosterone values in Maimoun et al., reporting only a point estimate instead, is a significant oversight, and one that was not made for any of the other studies. Below is Figure 1 of Clark et al., modified to include the actual range of testosterone values reported in Maimoun et al., shown as the heavier dashed line.
Corrected Figure 1 of Clark et al. 2019.
The corrected figure clearly indicates that several claims by Clark et al. are incorrect. First, the claim that the range for all 5ARD2 individuals is “well beyond the range for normal females” is not correct. Second, the claim that such individuals “whose phenotype at birth was ambiguous or female, have testosterone levels within or near the normal male range” is also incorrect. Finally, the claim that “existing studies strongly support a clear bimodal distribution of serum testosterone levels in females compared to males” depends upon overlooking material details from Maimoun et al.
As Clark et al. is currently being used prominently in public discussions and legal forums to justify regulation of certain women in elite athletics, it is important that the scientific record be corrected promptly when errors are discovered, lest policy be built upon flawed evidence.
Disclosure: R.P. served as an expert witness in recent proceedings of the Court of Arbitration for Sport on behalf of Caster Semenya, and testified to matters of scientific integrity.
Clark, R. V., Wald, J. A., Swerdloff, R. S., Wang, C., Wu, F. C., Bowers, L. D., & Matsumoto, A. M. (2019). Large divergence in testosterone concentrations between men and women: Frame of reference for elite athletes in sex‐specific competition in sports, a narrative review. Clinical Endocrinology, 90(1), 15‐22.
Maimoun, L, Philibert, P, Cammas, B, et al. Phenotypical, biological, and molecular heterogeneity of 5alpha‐reductase deficiency: an extensive international experience of 55 patients. J Clin Endocrinol Metab. 2011; 96: 296‐307.
Shabir, I, Khurana, ML, Joseph, AA, et al. Phenotype, genotype and gender identity in a large cohort of patients from India with 5alpha‐reductase 2 deficiency. Andrology. 2015; 3: 1132‐1139.
Veiga‐Junior, NN, Medaets, PA, Petroli, RJ, et al. Clinical and laboratorial features that may differentiate 46, XY DSD due to partial androgen insensitivity and 5alpha‐reductase type 2 deficiency. Int J Endocrinol. 2012; 2012: 964876.