Cancer Reproducibility Project

The Reproducibility Project: Cancer Biology will replicate selected experiments from 50 high-impact papers
23 December 2014

Interview with 

Tim Errington, The Centre for Open Science


In 2012, the research world was rocked when scientists at Amgen announced they had been unable to repeat 47 out of 53 landmark publications that they had picked out as promising avenues for the development of new therapeutics. Now eLife is supporting an initiative called the "Reproducibility Project: Cancer Biology", a collaboration between the Centre for Open Science and Science Exchange to independently replicate selected results from 50 leading cancer biology papers. The aim isn't to conduct a witch hunt, but instead to identify the factors that might influence reproducibility more generally, as Tim Errington explains to Chris Smith...

Tim - This initiative is to conduct an open investigation of how reproducible certain preclinical cancer research papers are. What that means is: if I look at the results of a published paper, how likely is it that I can reproduce those exact findings if I do everything the exact way that the original authors did?

Chris - Looking at it another way, Tim: how likely is it that, were I to take a cancer paper off the shelf today and attempt to replicate what's in it, I wouldn't be able to do so?

Tim - That gets into some of the barriers that this project is trying to investigate, which is: how much information are we actually sharing with our colleagues and with the public? How much methodology, how much data, how much of the actual results are being presented in any given paper? It's kind of analogous to cooking. If I want you to make a cake, but I don't tell you all the ingredients and the entire recipe step by step, and only show you a pretty picture of the cake at the end, it's really difficult to know how to recreate that cake.

Chris - The only certainty is you must be a Michelin-starred chef.

Tim - Yes!

Chris - Do you think that there's an endemic problem in science? I mean, if one looks at the volume of material that's published, is there evidence that this is unreliable or untrustworthy, or that people are publishing things for the wrong reasons?

Tim - So, there's a lot of evidence suggesting that there is an excess of positive results in the literature. And I think what we're seeing is that the whole incentive structure favours results that show that something is occurring, suggesting that researchers are holding back some of the data that they're actually producing.

Chris - But in announcing this study, one commentator has gone quite a bit further than that, and I will quote: "Some laboratories publish one irreproducible study after another in high-impact journals, collecting data to support their intuition and paying little attention to whether or not the data truly support the conclusion." So it sounds like, rather than people just cherry-picking, they're actively deceiving.

Tim - There's some evidence in the literature that does suggest that. And this project's intent is to ask: if somebody is holding back, and only doing the experiments and showing the results that support their hypothesis, then when we do a direct replication, how likely is it that we'll see the exact same distribution, the exact same result, if we look at every piece of data?

Chris - How are you going to go about doing this?

Tim - We're going to take a subset of experiments in these 50 papers that we've identified, and we're going to work with the original authors to obtain all of the original materials and methodology, to get a full description of exactly what was done, and hopefully have the original authors look it over and provide the input that is missing from the publication.

Chris - So, the broad aim of this study is to say, "We have identified 50 big papers in the field of cancer research. We are going to attempt to recreate the results using the methods and the approach taken by the authors of that paper to see whether or not our outcomes agree with theirs."

Tim - Yeah. And we take it one step further. And I think this is the real crux of the entire project: if we do that across many experiments and many studies, we get a really rich data set that we can then analyse, and ask the simple question, "Is this factor associated, or not, with the ability to reproduce those results?"

Chris - So say you do reproduce the result. How do you know the result really is a true positive, and that you haven't just fallen for the same mistake that they made?

Tim - That's a very good question. We designed these experiments in a way that at least tries to minimise, as much as possible, error that occurs in the replication. But, as you pointed out, if we reproduce a given result but we obtain the same materials, we may obtain that same imperfection, and thus obtain the same result that is flawed by that imperfection. This project can't necessarily uncover that.

Chris - And equally, the flipside of the coin: does your failing to reproduce a result mean that the original result was wrong, or does it mean that you've just failed to do what someone else did?

Tim - I think what you just said the second time is the absolute right answer. It's just another experiment. It's another experiment that's trying to understand whether you can reproduce the original result; it has no bearing on those results. If anything, I think it's exciting if you can't reproduce it, because that means that all the variables you thought were not important might actually be important. Maybe the effect is not as robust as you think it is, but there's some underlying biology that is worth following up and understanding: why did we not recreate what somebody else did?

Chris - And what is this project's ultimate goal? What are you hoping when you've been through these 50 papers and you have or haven't reproduced things? What are you hoping to achieve?

Tim - The main thing we're trying to do is to create an initial data set that allows us to hold the mirror to ourselves and say, "What are our current research and publishing practices that might influence the ability to replicate something, and how might we change those?" And I think another closely related aim of this project is to demonstrate that you can do science in an open manner, and that replication is something we should probably do more often in the greater scientific community; there's a lot of value in doing direct replications, especially for experiments that have very high therapeutic potential.
