Given my two roles, a researcher and a funder, it’s clear to me that reproducibility in science is increasingly seen as a concern, at least a high level. And thus, making science more reproducible is a challenge that many people want to solve. But it’s quite hard to do this, in general. In my opinion, there are a variety of factors responsible, including:
- Our scientific culture thinks reproducibility is important at a high level, but not in specific cases. This reminds me of Mark Twain’s definition of classic books: those that people praise but don’t read. We don’t have incentives or practices in place that translate the high level concept of reproducibility into actions that support actual reproducibility.
- In many cases, reproducibility is difficult in practice, due to some unique situation. For example, data can be taken with a unique instrument, such as the LHC or a telescope, or the data may be transient, such as seismometer that measured a specific signal, though on the other hand, in many cases, data taken in one period should be statistically the same as data taken in another period.
- Given limited resources, reproducibility is less important than new research. As an example, perhaps a computer run that took months has been completed. This is unlikely to be repeated, because generating a new result is seen as a better use of the computing resources than reproducing the old result.
We can’t easily change culture, but we can try to change practice, with the idea that a change in practice will eventually turn into a change in culture. And we can start by working on the easier parts of the problem, not the difficult ones. One way we can do this is by formalizing the need for reproducibility. This could be done at multiple levels, such as by publishers, funders, and faculty.
Publishers could require that reviewers actually try to reproduce submitted work as a review criterion. Funders could require the final project reports contain a reproducibility statement, a demonstration that an unrelated group had reproduced specific portions of the reported work, with funders funding these independent groups to do this. And faculty could require students to reproduce the work of other students, benefitting the reproducer with training and the reproducee with knowledge that their work has been proven to be reproducible.
What do we do about work that cannot be reproduced due to a unique situation? Perhaps try to isolate that situation and reproduce the parts of the work that can be reproduced. Or reproduce the work as a thought experiment rather than in practice. In either case, if we can’t reproduce something, then we have to accept that we can’t reproduce it and we need to decide how close we can come and if this is good enough.
In all of these cases, there’s an implied cost-benefit tradeoff. Do we think the benefit of reproducibility is worth the cost, in reviewers’ time, funders’ funds, or students’ time? This gets to the third factor I mentioned previously, the comparative value of reproducibility versus new research. We can try to reduce the cost using automation, tools, etc., but it will always be there and we will have to choose if it is sufficiently important to pursue.
Let me close by going back to Twain’s definition, and asking, will reproducibility become one of the classic books of the 21st Century, praised but not carried out? Or will we choose to make the effort to read it?
Some work by the author was supported by the National Science Foundation (NSF) while working at the Foundation; any opinion, finding, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the NSF.