The 'perfect' experiment
Classically, an experiment starts with a good question, but with the latest science that's not necessarily the case, as Sam Behjati explained to Graihagh Jackson...
Sam - So after I have received my grant, I guess the first thing that would happen is a big sigh of relief. Obviously when it happens it's amazing and when it doesn't happen it's an enormous disappointment. And then, of course... the hard work begins.
So my name is Sam Behjati. I graduated from medical school in 2006 but we mainly do clinical work, so my main scientific career was my PhD, which was 2011 to 2014 and then I've restarted with research in March of this year.
Graihagh - What drew you back to research?
Sam - Oh I love sitting here and looking at my data; it's a phenomenal place to be.
Graihagh - We met for a coffee at the Sanger Institute to go through what a 'good' experiment means, and Sam, although recently returned to research, has published around 50 papers in prestigious journals like Nature, Science and the British Medical Journal.
Sam - Classically, one would say... a good experiment starts with an excellent question but, I think, in our field of science where we do genomics, the first bit is to design that screening part of the experiment really well.
Graihagh - Can I think of it as screening - just finding those people who have say bowel cancer, if that's what you're looking at, and then within those select people looking for what mutations they have in common that might cause this cancer?
Sam - Perhaps we could talk through a specific experiment that we've published in the past that might just make it easier. So one of the various genotypes that I was interested in is something called "chondroblastoma," it's a very rare tumour that is mostly benign and it occurs in the end of bones. The problem with chondroblastoma is it tends to come back and it tends to destroy bones so, although it isn't a lethal cancer, it's quite a debilitating cancer.
So, I was interested in that tumour and I wanted to know what drives these tumours, so what we did is collaborate with a pathologist (someone who collects tumour samples), Adrian Flanagan; we extracted the DNA; we also obtained normal tissue from the patients. And what that then allowed us to do is, we've got these tumours, and then we sequence every single piece of information of the genome and compare the tumour to the normal tissue. And the difference between the tumour and normal tissue - that makes the cancer. Does it make sense?
That would be the screening experiment and, in this particular case, we found a single mutation that literally defines these tumours.
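The screening step Sam describes, comparing each tumour with the same patient's normal tissue and then looking for a change shared across tumours, can be sketched in a few lines of Python. This is only a toy illustration, not the method used in the published study: the patient names, variant coordinates and the helper functions `somatic_variants` and `recurrent_variants` are all hypothetical, and real pipelines work from aligned sequencing reads rather than ready-made variant lists.

```python
from collections import Counter

# A variant is written as (chromosome, position, reference base, alternate base).
# Every value below is invented purely for illustration.
SHARED = ("chr1", 100, "A", "T")

patients = {
    "patient_1": {
        "tumour": {SHARED, ("chr2", 500, "G", "C"), ("chr3", 42, "C", "T")},
        "normal": {("chr3", 42, "C", "T")},
    },
    "patient_2": {
        "tumour": {SHARED, ("chr5", 7, "T", "G")},
        "normal": {("chr5", 7, "T", "G")},
    },
    "patient_3": {
        "tumour": {SHARED, ("chr2", 900, "A", "G")},
        "normal": set(),
    },
}


def somatic_variants(tumour, normal):
    """Variants present in the tumour but absent from the patient's normal tissue."""
    return set(tumour) - set(normal)


def recurrent_variants(per_patient_somatic, min_patients):
    """Count how many patients carry each somatic variant; keep the frequent ones."""
    counts = Counter(v for somatic in per_patient_somatic for v in somatic)
    return {v: n for v, n in counts.items() if n >= min_patients}


# Step 1: tumour minus normal, patient by patient.
somatic = [
    somatic_variants(samples["tumour"], samples["normal"])
    for samples in patients.values()
]

# Step 2: which tumour-only changes turn up in every patient?
recurrent = recurrent_variants(somatic, min_patients=len(patients))
print(recurrent)
```

In this made-up data, changes that also appear in a patient's normal tissue are discarded as inherited, and only the one variant carried by every tumour survives the second step.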
Graihagh - In some ways it sounded like quite an easy "this is the gene" but it's not always like that; it's much more complicated and nuanced. So how do you sequence all this data?
Sam - So we extract DNA and that goes onto the sequencers and they read in every single base of the human genome, so all 3 billion bases. But not only once but several times over. So they generate all this data and that's a real challenge in our experiments because you're overwhelmed with data. So if you think about it, that's 150 billion pieces of information that you need to process; obviously you can't do that on your computer. You need enormous storage space and you need enormously powerful computers to be able to process it.
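As a rough check on the two numbers Sam quotes, the short calculation below works out how many times over each base would have to be read for 3 billion bases to become 150 billion pieces of information. The interview does not state this figure; it is inferred here. The storage estimate relies on an assumed 2 bytes per sequenced base, which is a placeholder chosen for illustration and not a figure from the interview.

```python
GENOME_BASES = 3_000_000_000        # "all 3 billion bases"
TOTAL_BASES_READ = 150_000_000_000  # "150 billion pieces of information"

# How many times, on average, each base is read to reach the quoted total.
implied_depth = TOTAL_BASES_READ / GENOME_BASES

# Assumption (not from the interview): roughly 2 bytes stored per sequenced
# base before compression, e.g. one for the base and one for its quality.
ASSUMED_BYTES_PER_BASE = 2
raw_gigabytes = TOTAL_BASES_READ * ASSUMED_BYTES_PER_BASE / 1e9

print(f"Implied reads per base: {implied_depth:.0f}")
print(f"Rough raw size per genome: {raw_gigabytes:.0f} GB (under the assumption above)")
```

On these numbers each base is read about 50 times, and a single genome already runs to hundreds of gigabytes before any analysis begins.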
Graihagh - Sam took me over to where these enormous computers are kept and to be honest, I didn't really know what to say about it. Fortunately, Sam had a few reflections of his own...
Sam - It looks cool.
Graihagh - It does look cool! It does look cool!
Sam - Like out of a spaceship!...
So we are standing in front of the data centre. So we are looking through a window and behind that window are a lot of wires, a lot of black boxes and quite a lot of green and red lights.
Graihagh - It looks like a series of computer hard drives stacked on top of each other with lots of wires and lots of lights... it looks very complicated and it's row after row after row of these things.
How many of your genome sequences would this hold or is that even unfathomable in terms of mathematics - off the top of your head?
Sam - I've absolutely no idea how big this computer frame is now because it constantly grows.
Graihagh - The thing is, you've got millions of bits of data and so finding something that's statistically relevant is...
Sam - Like finding a needle in a haystack, in a way.
Graihagh - And that's why you need the statistical analysis?
Sam - That's why we do need the statistical analyses indeed. Because, mainly, what you get when you do these experiments is you mainly end up with noise. So the main art is in getting rid of the noise and finding the truth. And finding the truth involves a lot of very intricate review of raw data but also doing further validation experiments to show that what you think you've found is, actually, true.
Graihagh - So you've got your results and then, I guess, the next bit is to write up those results, to have a discussion and compare it with other bits of the literature. Who does what; if there are multiple authors on a paper do you fight over the last word or do you hash things out together or do you give each other segments - how does that process come together?
Sam - I think different labs do things in very different ways and our lab has a particular style whereby one person writes a draft, and that then is a basis for further discussion, and then it does come down to battling every word and expression. The people that I've written with - some have a very sort of bloomy, fruity style of writing, others are very matter of fact. And it has to be absolutely consistent with the truth. So whatever we say, we've got this rule in our heads: we have to be able to stand up in a court of law and defend every single sentence, every single word and expression that we've used in our paper.
Graihagh - How long does that take - I imagine it takes some time if you're fighting over every last word?
Sam - It depends. The quickest paper I've written was perhaps about four weeks, and the longest is one I'm currently hacking away at; I've been hacking away at it for about five months.
Graihagh - How does that make you feel?
Sam - Ah, I just want to get rid of it! I mean you get to a point where you don't care any more whether it gets published or not because you just want to get it out of your life.
Graihagh - OK. And I suppose the next step is, what, selecting a few journals perhaps and sending it off to them?
Sam - We think about the journal first. I mean you think about who your target audience is, and what sort of journal you think you may want to target. And then you also have a realistic review of your data and you say, is it important enough, or of enough impact, to go into this or that journal?