The Coronavirus Mutation Situation

Underneath this global pandemic is a tiny string of genetic code, whose sequence reveals its nature...
14 April 2020
Presented by Phil Sansom
Production by Phil Sansom.


The genome of the coronavirus.


In this episode we’re taking apart the tiny creature behind this global pandemic. From how looking at the genes of the coronavirus can help figure out the animal it comes from; to the exact ways it’s spreading around the world; and even how a hidden mutation is threatening to lead vaccine-makers on a wild goose chase. Plus, Gins & Genes goes virtual; stay tuned to hear what’s inside our guest’s downstairs toilet...

In this episode

An artist impression of a coronavirus particle

00:36 - Will It Sequence: Coronavirus

Can you gene sequence the coronavirus? And why would you want to?

Will It Sequence: Coronavirus
Neil Ward, Illumina

One of the questions we regularly ask on this podcast is: Will It Sequence? Can you take something like a dog, or a sample of fishtank water, and extract the DNA? Right now there’s a pandemic going on, but it’s worth asking: Will It Sequence? Can you gene sequence the coronavirus, and why would you want to? Phil Sansom asked Neil Ward from Illumina…

Neil - Yes, we can sequence the whole genome of coronavirus. Someone will have a swab taken of their nose or their throat and then we can extract from that the genetic code that makes up the virus. Now normally we sequence DNA. In the case of the coronavirus it's a slightly different type of code, it's called RNA.

Phil - So can you read RNA the same as DNA?

Neil - We convert that actually into DNA. Once we've converted it into DNA we can put it onto our sequencing machines, and then we can read the As, Cs, Ts, and Gs, and understand that sequence.

Phil - How hard is that? How big is this amount of RNA that you've got to read?

Neil - Well fortunately for the world in this case, the RNA genomes are much smaller than the human genome. So human genomes are three billion letters long. That's a big book of instructions. Whereas the virus is actually really pretty tiny, it's thirty thousand letters.
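For a sense of scale, the two figures Neil quotes can be compared directly. This is only arithmetic on the rounded numbers from the interview; the variable names are illustrative.

```python
# Rounded genome sizes as quoted in the interview.
HUMAN_GENOME_LETTERS = 3_000_000_000  # "three billion letters"
VIRUS_GENOME_LETTERS = 30_000         # "thirty thousand letters"

# How many viral genomes would fit end to end in one human genome?
ratio = HUMAN_GENOME_LETTERS // VIRUS_GENOME_LETTERS
print(f"The human genome is about {ratio:,} times longer than the viral genome.")
```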

Phil - If human DNA is like the complete works of Tolkien, what is the virus RNA in comparison?

Neil - I don't know, it's like 'Spot the Dog'. Not sure if that's one that's familiar with the audience there? But it really is very simple, a small number of words that we can quickly read through and get an understanding of the full story of the genome of that virus.

Phil - What story are you talking about here? What can you understand?

Neil - Well there's a lot we can do from looking at that viral genome, and the diagnostics that are being used today around the world testing hundreds of thousands of individuals to see whether they have the disease are a really simple genetic test, they're called an RT-PCR test.

Phil - RT-PCR?

Neil - Yeah. Reverse transcription, so that's the first step of converting the RNA into DNA; and then PCR, a polymerase chain reaction, which is a process that we do to make copies of the DNA, to amplify the amount of material such that it becomes measurable on simple machinery.
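The two steps Neil describes can be sketched in a few lines of Python. This is a toy model, not lab software: the RNA fragment is made up, the function names are hypothetical, and the assumption that every PCR cycle exactly doubles the material is an idealisation.

```python
# Toy model of RT-PCR. Real reverse transcription and PCR are
# enzyme-driven lab processes; this only mimics their logic.

# Base pairing used when copying RNA into complementary DNA (cDNA).
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def pcr_copies(starting_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealised PCR: each cycle multiplies the copy count by (1 + efficiency).

    efficiency=1.0 means perfect doubling, which is an assumption;
    real reactions fall short of it.
    """
    return starting_copies * (1 + efficiency) ** cycles


rna_fragment = "AUGGCU"  # hypothetical fragment, not real viral sequence
cdna = reverse_transcribe(rna_fragment)
print("cDNA:", cdna)
print("Copies after 30 ideal cycles from one molecule:", pcr_copies(1, 30))
```

The point of the second function is the exponential growth: a handful of starting molecules becomes billions of copies, which is why the result is measurable on simple machinery.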

Phil - Apart from just looking for the virus itself, there's other uses for genetics here aren't there.

Neil - Yeah, what many people are trying to do with the coronavirus genome sequencing at the moment is build a family tree. And that ability to do what's called genomic epidemiology has been game-changing in the last five to ten years. Historically, if you went back prior to the use of whole genome sequencing, the public health bodies when they found multiple individuals that seemed to have the same disease would go through a series of questionnaires asking them where they've been the day before, or where they've been that week; and I don't know about you, I struggle to remember what I was doing this morning, let alone last week and who I've met. So that type of information was really difficult to get to the underlying causes and who had infected who. The sequence information that we have here is nowadays layered onto that type of questionnaire, and increasingly other electronic information as well. And collectively that allows people to build a much better understanding of how the virus is spreading.

Phil - Now obviously things aren't too good for the world right now. I'm recording this from lockdown. I think I'm right that you're in lockdown over there yourself?

Neil - Indeed. It's a challenging time. My kids have not yet interrupted, but many of us are working from home with family members. It's changing life significantly. And you're right, for many people they're concerned, they're worried, and hopefully the genomic data will go on to allow pharmaceutical, biotech, and other researchers to better understand the viral genome, with the long term aim of being able to design some vaccines or being able to design new antiviral therapies so that we can get out of this situation of having the world locked down.

Vampire bat

04:54 - What animal did the coronavirus come from?

Once the virus is isolated, you can compare it to animal versions - and the evidence so far is uncertain...

What animal did the coronavirus come from?
Arinjay Banerjee, McMaster University

When the coronavirus was first identified at the end of 2019 as something new and dangerous, there were a lot of questions. What actually is this disease? And where did it come from? These are especially difficult when you know almost nothing and your target is essentially invisible. Arinjay Banerjee at McMaster University in Canada had been researching coronaviruses in bats for six years when this new one came along and his skills became extremely relevant. I asked him how he and other scientists have been answering these questions…

Arinjay - When Covid-19 cases showed up in Canada one of the first cases in the province of Ontario showed up at Sunnybrook hospital. I instantly offered my skillset and I think I knew I could contribute in understanding this new outbreak. One of the priorities was to get the virus out of these patients so we can use it to develop vaccines, drugs. So I acquired swabs from the patient's nose and I put them on cells in a high containment lab and we were able to get lots of virus onto these cells.

Phil - Like a lot of people I would expect that if the virus exists in your country or in this person across the room from you, it's not actually that hard to get it, but it sounds like it's actually a whole process.

Arinjay - Yes. If you think about it, humans have lots of different viruses. We have the common cold. You may have the flu and there are other viruses that you may not develop symptoms for. The trick is to separate out the one virus that you're interested in, which for us was the new coronavirus and we used a treatment that enhances coronavirus infection in the cells.

Phil - How do you keep yourself safe and stop yourself getting infected?

Arinjay - So these are not regular labs. These are some of the best high containment labs that exist on the planet. A containment level three lab is, it's essentially you're working inside of a cabinet that's negatively pressurised. So all the air comes into the lab and gets sucked out and that's just the facility. We also have PPE, very expensive, very fancy PPE that we wear working with the virus.

Phil - Okay, so you have your special equipment to keep you safe. You have your special treatment to take out only the coronavirus. Did it work first time?

Arinjay - Yes and no. We started with three samples and it worked for two samples and it didn't work for the one sample. Now we don't know if the sample that didn't work had very low amounts of virus in it and the samples that did work may have had lots of virus in it. We really did get lucky.

Phil - For the two that it did work, you've now got the virus’ total genetic code?

Arinjay - Yes.

Phil - Now we've seen a lot of stuff in the news that's trying to explain what animal the virus originally came from. Can you tell that from the genetics?

Arinjay - Yes and no. So for you to be able to find a good match with an animal source, somebody would have had to sample that animal, sequence the virus in that animal and submit that sequence to the database. Now for bats - bats have been looked into extensively since SARS-1, and the closest match to the SARS-2 coronavirus, with 96% identity, is a bat coronavirus. But are bats responsible for direct transmission to humans? We don't know. I think the data is very anecdotal on this. We're not sure if the virus went bats to humans or somewhere mixed up with the pangolin coronavirus.

Phil - Why pangolin? Where's this bat versus pangolin confusion?

Arinjay - If you look at the virus, there's a protein that's critical for infection - it's called the spike protein. Now that sits on top of the virus, it interacts with cells in the human body and infects human cells. Now within the spike protein, there's a small portion that's called receptor binding domain, but this small portion is critical for that entry into human cells. Now the small portion within the spike protein is almost a hundred percent identical with a pangolin coronavirus. The rest of the virus is not, but this tiny little small fragment is almost a hundred percent identical. So all of these questions and observations raise a bunch of questions like, where did the virus come from? Was it bats? Was it pangolins? Or is there a third species of animal missing?
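The bat-versus-pangolin puzzle comes down to measuring identity over the whole genome versus over one small region. The sketch below shows that idea on invented 20-letter sequences; the sequences, the region boundaries and the function name are all hypothetical, and the sequences are assumed to be already aligned and of equal length.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Invented toy sequences, NOT real viral data.
human_virus    = "ACGTACGTACGTACGTACGT"
bat_virus      = "ACGTACGTTTGAACGTACGT"  # close overall, differs in the key region
pangolin_virus = "TCGAACCTACGTAGGTTCGA"  # further overall, matches in the key region

# Hypothetical stand-in for the receptor binding domain: positions 8 to 11.
region = slice(8, 12)

for name, other in [("bat", bat_virus), ("pangolin", pangolin_virus)]:
    whole = percent_identity(human_virus, other)
    local = percent_identity(human_virus[region], other[region])
    print(f"{name}: whole sequence {whole:.0f}%, key region {local:.0f}%")

# At 96% identity over roughly 30,000 letters, about this many letters differ:
differing_letters = 30_000 * (100 - 96) // 100
print("Letters differing at 96% identity:", differing_letters)
```

In the toy data the bat sequence wins on the whole-sequence comparison while the pangolin sequence wins inside the key region, which is the same shape of conflict Arinjay describes.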

Phil - So if the way that people even figure this out is just by matching the genetics to see what looks most similar, how close are we? Is it like putting the two of diamonds against the three of diamonds or is it like putting the two against the King and going, well, I dunno, they both have a diamond on them?

Arinjay - I think at this point in time we are really trying to find the third piece of diamond. You know, it's a great question and you can only answer this question if we could have extensively sampled all the animals that existed in the seafood market. Now this is assuming that the outbreak started from an animal in the seafood market. It's possible that an infected individual walked into that market and spread it to other individuals. I think we can say with some certainty that the virus did evolve in bats. What we cannot say with any amount of certainty is that the virus jumped into humans directly from bats. How did pangolins get involved? What we don't understand is the ecological interactions that really went on.

Phil - So ultimately it's probably from bats, but a bat could have weed on something and then that animal went to something else and then it could have been this whole huge wacky chain of animals that we don't know about?

Arinjay - Yes, it is possible that we might never know which animal directly transmitted this virus to humans.

A map of the world showing hotspots of an infection.

10:21 - Meet the SARS-CoV-2 family tree

As the coronavirus has left Wuhan to spread around the world, scientists have built its family tree...

Meet the SARS-CoV-2 family tree
Richard Neher, University of Basel

As the coronavirus has left Wuhan to spread around the world, its genes have mutated in tiny ways. By gene sequencing the virus as it goes, and matching up copies wherever the genes are most similar, scientists can build a record of where it has travelled and when. Richard Neher from the University of Basel in Switzerland is one of the people behind Nextstrain, an open source project that’s been responsible for creating and analysing coronavirus ‘family trees’. Phil asked him how - and why - he's done it…

Richard - It really has been a bit of a whirlwind development. So recall that this outbreak was first announced at the very end of December last year. By January 9th, we already had the first complete genome sequence of this virus available. And then over the weeks that came, we got more and more of these genome sequences that came in. By comparing these genome sequences to each other, we immediately got a very good sense that this is a single outbreak. These genomes were essentially identical. There was maybe like, one or two differences here and there, and for an RNA virus with a fairly high mutation rate, that implies that these genomes had a very recent common ancestor only a few weeks in the past.

Phil - And when you say the genomes were coming in, are people sending them to you or what are they doing?

Richard - No, and that's an important point. GISAID, which is a database that's used for influenza data sharing.

Phil - GISAID?

Richard - GISAID. The Global Initiative on Sharing All Influenza Data. They have jumped in and provided their infrastructure and terms and conditions, their sort of sharing mechanisms, for the coronavirus sequencing community. So that has enabled labs all over the world to share their data for analysis.

Phil - How many genomes and sets of data are they getting?

Richard - We now have more than 2000 full genomes available, and we can't look at all of them at the same time anymore. There's simply so many. It's been the first time that this sort of, real time sequencing, sharing and analysis is playing out.

Phil - Now you said those first few were basically identical. What's happened as the virus has gone all around the world?

Richard - As RNA viruses do, they mutate, not every other day, but about twice a month. That is sort of our current estimate.
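That estimate of about two mutations a month can be used as a rough clock. The sketch below is a back-of-envelope calculation only: it assumes the rate is constant, applies to each lineage separately, and that both genomes were sampled at the same time. The function name is hypothetical and this is not how Nextstrain itself dates its trees.

```python
def months_since_common_ancestor(differences: int, rate_per_month: float = 2.0) -> float:
    """Crude molecular-clock estimate.

    Two lineages that split t months ago each pick up about
    rate_per_month * t mutations, so the expected number of
    differences between them is 2 * rate_per_month * t.
    """
    return differences / (2 * rate_per_month)


for observed_differences in (1, 2, 4):
    months = months_since_common_ancestor(observed_differences)
    print(f"{observed_differences} differences -> roughly {months:.2f} months apart")
```

Under these assumptions, genomes that differ by only one or two letters share an ancestor a matter of weeks in the past, in line with what the early sequences suggested.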

Phil - And these aren't mistakes that are going to kill the virus.

Richard - Well some of them will kill the virus, but those are just dead ends, right? So the only ones that we see are those that don't kill the virus. They don't necessarily make the virus more aggressive or anything like that. Most of these mutations likely just don't really have too much of a significance. But what they do allow us to do is group viruses together. You know, a particular virus that got sampled in the US is similar to a virus that got sampled in Europe somewhere. That sort of, gives us an idea how the virus is dispersing and how different outbreaks in different places might be connected.
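Grouping viruses by their mutations can be illustrated by counting the letters at which two genomes differ and looking for each sample's nearest relative. The genomes and labels below are invented ten-letter toys, the function names are hypothetical, and real pipelines build full family trees from aligned genomes rather than using this nearest-neighbour shortcut.

```python
def count_differences(genome_a: str, genome_b: str) -> int:
    """Number of positions at which two aligned, equal-length genomes differ."""
    return sum(a != b for a, b in zip(genome_a, genome_b))


def closest_relative(name: str, samples: dict[str, str]) -> str:
    """Return the label of the other sample with the fewest differences."""
    others = [label for label in samples if label != name]
    return min(others, key=lambda label: count_differences(samples[name], samples[label]))


# Invented toy genomes, NOT real data.
samples = {
    "Wuhan":    "AAAAAAAAAA",
    "Europe-1": "AAAAGAAAAA",  # one mutation relative to the Wuhan toy
    "US-1":     "AAAAGAAATA",  # shares that mutation, plus one more
    "Europe-2": "ACAAAAAAAA",  # a different, unrelated mutation
}

print("Closest relative of US-1:", closest_relative("US-1", samples))
```

Because the US sample carries the same mutation as the first European sample, the two are grouped together, hinting that those outbreaks are connected.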

Phil - So how has this been useful as the virus has gone to continent after continent?

Richard - Early on when there's an outbreak in some country, politicians are very happy to say, well, we closed the borders and problem is solved, right? That has never worked. Surprise. And the sequences, they can tell you that this decision is wrong, right? If you see many sequences that are very similar in your country, they probably were transmitted locally. So this is not a problem that you solved by closing borders, but this is a problem that you have to solve by clamping down on transmission in your community.

Phil - Have you found weird cases where viruses have spread in ways that you haven't expected? And you can tell that from seeing the full genomes of the virus.

Richard - So we've certainly seen, especially in the last week or two within Europe, this viral population is very well mixed. There is not a single place this virus is coming from anymore. And this has to some extent been surprising, how rapid and thorough the spread has been. Anyone can go online and see this, you'll see both a family tree of the virus and a map that shows you where on the planet these samples come from. And one has to be very mindful of the gaps that are in those data, because mutations happen randomly. Sometimes there is no mutation for like four weeks. Sometimes there's three mutations in a week, and this means that seeing two things close together in the tree doesn't necessarily mean they were in the same place in time.
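The randomness Richard mentions can be made concrete by treating mutations as a Poisson process. That model is an assumption made here for illustration, as is the four-week month; the rate of about two mutations a month is the estimate given earlier in the interview.

```python
import math


def poisson_probability(count: int, expected: float) -> float:
    """Probability of seeing exactly `count` events when `expected` are expected."""
    return math.exp(-expected) * expected ** count / math.factorial(count)


MUTATIONS_PER_WEEK = 2 / 4  # about two a month, assuming a four-week month

p_none_in_four_weeks = poisson_probability(0, MUTATIONS_PER_WEEK * 4)
p_three_in_one_week = poisson_probability(3, MUTATIONS_PER_WEEK * 1)

print(f"No mutation for four weeks: {p_none_in_four_weeks:.1%}")
print(f"Three mutations in one week: {p_three_in_one_week:.1%}")
```

Under this model neither outcome is rare enough to ignore across thousands of genomes, which is why two samples sitting close together in the tree are not guaranteed to have been close in place or time.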

Phil - Now obviously we're in this pandemic for sort of, the long haul. What's the point of all this virus family tree mapping?

Richard - Well, the most obvious data that we have about this outbreak is the number of cases in different places, but what do these sequences give us? They add structure to these numbers. It's not just sort of, 10,000 cases in New York or something like that. Suddenly you can break this down into multiple variants. So you know you have not one outbreak but you might have three outbreaks that sort of, originate in different places.

Phil - And how is that practically useful?

Richard - It helps you focus the infection control measures that you put in place. As you just said, we're in this pandemic for a couple more months for sure, right? So we'll have to understand where this virus is transmitted. Having the ability to use genome sequences to identify these transmission chains and transmission clusters gives you the means to target infection control measures.

Spike glycoprotein from SARS-CoV-2.

16:04 - Coronavirus mutation disrupts vaccine trials

Researching the fundamental workings of the coronavirus has revealed a repeat mutation in monkey cells...

Coronavirus mutation disrupts vaccine trials
David Matthews, University of Bristol

Lots of institutions around the world are currently hard at work making tests and treatments - you can hear about a few of them on the Naked Scientists main show episode How COVID-19 Works. But very often, these tests and treatments are developed without anyone necessarily understanding how they work. Dave Matthews is a coronavirologist from the University of Bristol, and his mission has been to fully understand how the coronavirus functions, on a fundamental level. He told Phil Sansom how this “walk before you can run” approach has already picked out a hidden danger…

David - Together with my colleague Dr. Andrew Davidson here at Bristol and another colleague, Professor Julian Hiscox at the University of Liverpool, we are basically the only three human coronavirus people in the UK. We started working on this virus immediately and one of the things that we were interested in is what does the virus make and what does the virus do when it gets inside a living cell. What proteins does it produce? That is what we were interested in in the first instance.

Phil - How physically do you do that?

David - We take the virus into our containment laboratory where we've got a specialized cabinet with a series of airlocks in it and we take living cells. In this case we're using a monkey cell line and we add the virus and leave it for a few days until we start to see the cells are damaged and starting to float off and die. And then we harvest all the material inside the cells. The first thing we want to extract is the genetic material from the virus and the genetic material that the virus makes known as messenger RNA. And it's this messenger RNA that is interpreted by the cells and turned into proteins. And the other half of the sample that we want to take out is to extract the protein separately. And so these two samples, if you like, the genetic instructions that the virus is making and the proteins that virus is making are separated from each other and then analyzed separately.

Phil - So this is like a catalog of everything the virus is and everything it does.

David - Yes, that's right. You can ask the very direct question, what do you make and the instructions you are putting out there and are they actually turning into proteins and that's what we wanted to catalog.

Phil - So what did you find?

David - The things that you predict are being made for the most part that is actually what's happening and that may seem really rather dull and uninteresting but in terms of getting your basics right it's important to establish that the virus is doing what you think it's doing most of the time. It's making proteins that enable the virus to replicate its genetic material - that's very important. It's also making structural proteins, other instructions make the spike protein, which attaches to living cells and helps the virus gain entry to new cells. There's also a variety of proteins whose function we're not quite sure what they do, but we think some of them are involved in trying to slow down your immune response. And then there will be proteins whose job it is to simply make the cell a more amenable place for the virus to replicate in.

Phil - That's the stuff you expect. What stuff was there that you didn't?

David - So when we analyze the data in a little bit more detail, what we realized was that there were in fact two viruses now inside these monkey cells.

Phil - What, two coronaviruses?

David - Yes. One is the virus that is genetically identical to what we expect it to be, but the other one had missing just a very short section of instructions that makes a slightly different version of the spike protein, which we believe means that the virus that's growing in the monkey cells that has this slightly deleted version, can infect monkey cells more efficiently.
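Spotting a version of the virus with a short section missing is, at its simplest, a comparison against the expected sequence. The sketch below finds a single missing stretch by trimming the matching start and end of two sequences. The sequences are invented, the function name is hypothetical, and this is not the analysis the Bristol team ran; real sequencing data would need proper alignment of many reads.

```python
def find_single_deletion(reference: str, variant: str) -> tuple[int, str]:
    """Locate one contiguous deletion in `variant` relative to `reference`.

    Returns (start position in the reference, deleted letters).
    Raises ValueError if the variant is not the reference with exactly
    one contiguous stretch removed.
    """
    if len(variant) >= len(reference):
        raise ValueError("variant must be shorter than the reference")

    # Length of the shared start.
    prefix = 0
    while prefix < len(variant) and reference[prefix] == variant[prefix]:
        prefix += 1

    # Length of the shared end, without re-using letters counted in the prefix.
    suffix = 0
    while suffix < len(variant) - prefix and reference[-1 - suffix] == variant[-1 - suffix]:
        suffix += 1

    if prefix + suffix != len(variant):
        raise ValueError("difference is not a single contiguous deletion")

    return prefix, reference[prefix:len(reference) - suffix]


# Invented toy sequences, NOT real spike gene data.
expected_virus = "ATGGCTCGTAGTCC"
monkey_adapted = "ATGGCGTCC"

start, missing = find_single_deletion(expected_virus, monkey_adapted)
print(f"Deletion starts at position {start}; missing letters: {missing}")
```

When the same stretch can be removed at more than one position and give the same result, the reported start is just one of the valid placements.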

Phil - Oh, so you're saying that the bit of the virus that lets it get into cells - as you were investigating it, that bit changed so that it could get into the monkey cells better?

David - Yeah, that's what we believe.

Phil - What does that mean?

David - The issue is that when people are doing their vaccine studies, for example, typically people grow the virus in the laboratory in monkey cells. And what that means is that teams doing this could inadvertently have ended up with a mixture of two viruses, one that is the human virus and another one which is adapted slightly so that it infects monkeys a little bit better and it could completely wreck the study or give you a false idea of what's happening. That's the first problem.

Phil - I've heard plenty of stories of people already trying to get vaccines done and does that mean that these people could run into a problem that they wouldn't have otherwise realized was even there?

David - Yes, that's true. Although I think the news is out now, so I think people are definitely screening virus stocks carefully. The other problem is even more subtle. Let's say for example, you do grow large amounts of this virus and you check it and there's nothing wrong with it and it is all still human virus. Many of these vaccine studies will be done in monkeys eventually. It could be that in an individual monkey, the virus that you've given it adapts and changes and becomes a monkey version of the virus you gave it in the first place. So it means that not only do you need to check that the virus going into the monkeys is still a human virus, but also throughout the course of the vaccine study you're going to need to check that the virus hasn't mutated during your trial in each individual monkey. And it could again basically make a mess of your vaccine trial by creating anomalous results that you misinterpret as either the vaccine is working or it's not working.

A moth hanging from a twig.

21:36 - Science From Home: Moth Lab

How are scientists adapting to life without their labs? Meet one who's recreated hers in her bathroom...

Science From Home: Moth Lab
Zenobia Lewis, University of Liverpool

Many people around the world are slowly adjusting to working from home. But how are scientists themselves adapting to life without their labs? In this virtual edition of Gins & Genes, AKA Science from Home, Phil Sansom has been speaking to Zenobia Lewis from our Fly Infest-agation episode about the contents of her bathroom…

Zenobia - I'm a behavioral ecologist, I'm interested in animal behavior and I'm particularly interested in mating behavior. My main study species is a moth called the Indian meal moth. And these are actually pest species of grain. And so in a way a jar of muesli is their natural environment. And as a result I was literally able to just bring my entire lab back home.

Phil - Really? your entire lab?

Zenobia - Okay so a pared down version of it, but enough of the important stuff to be able to maintain my lines while we're on lockdown.

Phil - How many moths have you got?

Zenobia - I literally couldn't tell you, I mean I tried to maintain each generation at about a hundred adults and I brought home nine populations with me, so a lot.

Phil - So that's around 900 moths in jars of muesli in your house?

Zenobia - Yeah.

Phil - Where have you set them up?

Zenobia - I have them in my downstairs toilet. They are usually kept in an incubator so that I can maintain them at a constant temperature. Obviously that's gone out the window now they're in my downstairs bathroom. I have a microscope which I use at times for counting their sperm. And then in all honesty a lot of the kit is jars, masking tape for labeling, permanent markers for writing labels, really bog standard stuff.

Phil - How are they adapting to life in your downstairs toilet?

Zenobia - They're still alive so I think they're okay! These are actually quite special populations. I've been maintaining them generation after generation with a very particular treatment regime. So because I'm interested in mating behavior, one of the things I'm examining is the effects of what's called sexual selection where individuals, usually males, compete amongst themselves for access to females. And I've manipulated the level of that competition generation after generation for over 150 generations. That's 15 years worth of treatment. I wasn't just gonna let that go because we were shut down!

