Monday, June 07, 2021

From reproducibility failure to methodological success

Out today in PLOS ONE, "Comparative analysis of three studies measuring fluorescence from engineered bacterial genetic constructs" solves a mystery hiding in the iGEM interlaboratory studies for 2016, 2017, and 2018. You see, publication of the 2017 interlab data was held back, even after the iGEM 2016 and iGEM 2018 studies had been published, because of a troubling mystery: the plate reader results from the 2016 and 2017 studies did not match. This was a shock, because the 2017 study was intended to be a replication of the 2016 study, plus a few extensions and enhancements. Instead, the results were systematically off by a factor of more than 10. So which of them, if either, was right?

This is a terrible and unsettling place to find oneself, but we couldn't actually answer the question until after we had run and analyzed the 2018 study. With that study, we finally had a way to put plate reader data on the same scale as flow cytometry data, so that we could assess accuracy through two independent measurements. So once we'd finally finished analyzing and publishing that data, we turned to comparing the three years to find out what had happened and to understand our failure to reproduce. And here is the story, finally, summed up in a single image:


It appears the 2017 plate reader results were right: they match both the 2018 results and the flow cytometry from 2016. There's a lot more detail in the paper, as well as additional confirmations, but the bottom line is that it looks like the calibrant that we prepared for the 2016 study did not have the concentration of fluorescein that it was intended to have.
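For the curious, here is a rough sketch of why a calibrant problem shows up as one constant factor across every construct, rather than as scattered disagreement. This is not the analysis from the paper: the function name, the device names, and all of the numbers are made up for illustration. The idea is just that if every calibrated value in one study is multiplied by the same scale error, the geometric mean of the per-construct ratios between studies recovers that error.

```python
import math

def geometric_mean_fold_difference(study_a, study_b):
    """Geometric mean of per-construct ratios a/b over constructs in both studies."""
    shared = sorted(set(study_a) & set(study_b))
    if not shared:
        raise ValueError("the two studies share no constructs")
    log_ratios = [math.log10(study_a[name] / study_b[name]) for name in shared]
    return 10 ** (sum(log_ratios) / len(log_ratios))

# Hypothetical calibrated fluorescence values, NOT data from any iGEM study.
reference = {"device_1": 1.0e4, "device_2": 5.0e4, "device_3": 2.0e5}

# Simulate a study whose calibrant was off, so every value carries the
# same multiplicative error (12x here, an arbitrary illustrative choice).
scale_error = 12.0
shifted = {name: value * scale_error for name, value in reference.items()}

fold = geometric_mean_fold_difference(shifted, reference)
print(f"systematic fold difference: {fold:.1f}")
```

A uniform shift like this points at the calibration step rather than the biology, and a second, independent measurement is what lets you tell which of the two studies carries the error.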

Embarrassing, but actually, I think, good news in the end. Because we could tell! We are no longer subject to the tyranny of uncertainty, unable to even know if our measurements have been reproduced. With multiple independent measures and a successful confirmation of values reproduced in three different studies (2016 flow, 2017 plate, 2018 both), we now have truly solid ground on which to stand, biologically. Every future study that we build can bootstrap off of these results, and know whether the numbers that come out are reasonable or not.

But why are we still preparing our own fluorescent calibrants in the first place? We need metrological traceability and easily purchased commercial preparations with adequate quality control, just like we have for units of time and length. Calling all reagent suppliers: who will first start to sell a plate reader cellular quantification kit?