The DNA sequencing revolution is providing ever more data about genomes from all kinds of species, from humans to bacteria. But how do we make sense of it all? Who gets their hands on it? And how do we use it to benefit patients? We meet the scientists developing new computer tools to analyse and democratise global genomics. Plus, how your partner’s genes affect you - assuming you’re a mouse - and a shrunken gene of the month.
In this episode
01:19 - New tools for clinical genetics
with Nick Lench, Congenica
Since the dawn of DNA sequencing technology, around 40 years ago, scientists have been figuring out how changes in DNA bring about changes in people. And as capacity has grown and costs have fallen, with the hundred dollar genome looming on the horizon, clinical geneticists are trying to use genomic data to diagnose patients with genetic diseases. To make this complex task simpler, Cambridge-based company Congenica has developed an online tool called Sapientia, which guides clinicians through the process of linking a genetic variation in a patient to the condition that’s affecting them. To find out what it can do for patients and their families, Kat Arney spoke to former geneticist and now chief operating officer at Congenica, Nick Lench.
Nick - So the problem that labs face now, with the advent of whole exome sequencing and whole genome sequencing, is the massive amount of data that is produced. It’s actually quite difficult to handle that in a single laboratory. You want to be able to scale up the analysis and use sophisticated bioinformatics. So, we’ve developed software to address that issue. It enables clinical scientists to make a rapid diagnosis for the patient.
Kat - So you're saying, “I've got this patient who’s come in to me. I've got their DNA sequence. What is it that’s wrong with them? What is it in their DNA that’s making them ill?”
Nick - That’s right. So you have to look at all the different DNA sequence variants that occur in the patient compared with the reference sequence, the reference genome, look at all those different variants. So for example if you compare two whole genomes, there might be as many as 4 million sequence variants between two individuals. How on Earth do you filter those 4 million down to a handful of potentially causative variants?
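As a rough illustration of the filtering problem Nick describes, here is a minimal Python sketch of one widely used general approach: keep only variants that are rare in the population, predicted to be damaging, and located in genes already linked to the patient's symptoms. This is not a description of how Sapientia works. The `Variant` fields, the impact labels, the gene names and the 0.001 frequency threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    population_frequency: float  # fraction of a reference population carrying it
    predicted_impact: str        # "high", "moderate" or "low" (hypothetical labels)


def shortlist(variants, candidate_genes, max_frequency=0.001):
    """Keep rare, potentially damaging variants in genes linked to the phenotype."""
    return [
        v for v in variants
        if v.population_frequency <= max_frequency
        and v.predicted_impact in {"high", "moderate"}
        and v.gene in candidate_genes
    ]


# Toy data: four made-up variants standing in for the millions in a real genome.
variants = [
    Variant("GENE_A", 0.25, "low"),     # common, so unlikely to cause a rare disease
    Variant("GENE_B", 0.0001, "high"),  # rare, damaging, in a candidate gene
    Variant("GENE_C", 0.0002, "high"),  # rare and damaging, but not a candidate gene
    Variant("GENE_B", 0.0005, "low"),   # rare, but predicted to be harmless
]
candidates = shortlist(variants, candidate_genes={"GENE_B"})
print(candidates)
```

In this toy case only the second variant survives all three filters. Any shortlist like this would still need the evidence-building step that a clinical scientist carries out by hand.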
Kat - What sort of diseases and disorders are we talking about here?
Nick - So these are what we call rare inherited disorders. Most of these will manifest in children - up to about 80 per cent of all rare diseases will occur in children. For example, they may have features such as developmental delay, when a child fails to meet their developmental milestones. There may be inherited cardiac disorders, so maybe structural problems with the heart. There may be issues with inherited kidney disorders as well. So really, it’s the whole spectrum: for any organ system you can think of, there will be a genetic defect and a range of conditions that affect those organs caused by inherited disorders.
Kat - So with the software that you have, clinicians can go, “Okay, I think that this patient has this particular genetic change that is causing them this problem.” What do they then do with that information? Can it bring them a cure?
Nick - In most cases the best outcome is a diagnosis. We are beginning to see more and more new therapies for patients with inherited rare diseases. The more information we accumulate about diagnosing these rare disease patients, the more we understand about the biology. And there are a lot of patient advocacy groups out there now who really want to look for new therapies. So for example, you might look at ways of repurposing existing drugs. There are quite a few nice examples. In the US, the Cystic Fibrosis Foundation has worked very closely with a pharmaceutical company to develop a novel therapy for CF. So that’s probably a really good success story of developing new therapies for a rare disease.
Kat - What about the cases where there are no treatments? What good does the diagnosis do for those families?
Nick - So I think it’s really important for all families to reach a diagnosis. If you talk to families, they want to understand why their child has a particular disorder. So if that’s due to an inherited condition, they can then use that information to help plan their lives for their children - so whether that would be receiving medical support, social support, educational support. And it also offers reproductive choices so again, if the mother wants to have a subsequent pregnancy then there are potentially options to ensure that that is a successful pregnancy, an effective pregnancy.
Kat - So this kind of software where you can get the information, you can get a diagnosis, is that just useful for children, for adults? What sort of patient groups can this be used for?
Nick - It can be used for any patient group. What's really exciting is a new application in the field of neonatal intensive care. So this is where we see very, very poorly babies within a few hours of birth. If you're then able to do what we call a rapid whole genome sequence, you have the opportunity to make a diagnosis in a very short period of time, maybe within three to four days. And then in some cases, if you're able to have an early intervention, that can absolutely have a fantastic effect on the outcomes for some of those babies. So it might be something as simple as a vitamin supplement and that can absolutely ensure that their brains develop in a normal way and they have all of their cognitive functions, whereas if you miss that, then maybe 6 months, 9 months into life, there may be irreversible damage to the child. So it’s really, really important. Other applications might be where you can make a decision whether you have to give a child a heart transplant or a bone marrow transplant as well, so early diagnosis is critical. We’ll see more and more applications of whole genome sequencing and I think it will really benefit patients.
Kat - Say your programme finds a genetic change and says, “Okay, it’s this that’s causing the disease.” How do you know that for sure? What does a doctor do with that information? Do they just go, “Yep!” The computer says, “Ding! Off we go!”?
Nick - It’s really about building the body of evidence. So I think it’s a bit like innocent until proven guilty. So you take your DNA sequence variants and you build the case for whether it really is the causal variant that causes that particular phenotype in that patient. So most of the time, you're really looking to other reported examples. Has a patient like this been reported before? Have they got a mutation in the same gene? Have they got exactly the same mutation in the same gene? And so, you're sort of looking for other examples to back that up. Ultimately, what you want to be able to do is what we call functional validation, so you can prove biologically that the alteration of that particular gene has a causal effect. But in reality, within a diagnostic clinical setting, that’s not possible. It’s beyond the capability of the system and it’s expensive. So really, you're trying to build the evidence base and then you're effectively making a subjective decision and saying, “I think this particular DNA sequence variant causes the disorder in this patient.”
Kat - Nick Lench, chief operating officer at Congenica.
08:27 - Mapping microbes
with David Aanensen, Wellcome Trust Sanger Institute
Another researcher developing tools to help make sense of the new wealth of genomic data is David Aanensen, director of the Centre for Genomic Pathogen Surveillance at the Wellcome Trust Sanger Institute and also a faculty member at Imperial College London. As he explained to Kat Arney, he’s developed a clever online platform called Microreact, which allows anyone to put in DNA information about pathogens such as bacteria or viruses, tracking how they’re spreading around the world and even working out what treatments they might be resistant to.
David - Well, I think one of the key challenges that has made its way into the media very strongly is antimicrobial resistance – so, the emerging and increasing resistance to antibiotics for particular species of bacteria. Clearly, what we need to do is to understand which strains are resistant, how they are acquiring resistance, and how and where they spread. If we can try and identify, and understand how they spread and where they're spreading – is it from humans to animals, is it from animals to humans, is it from different hosts? Do these hosts pass them on between individuals, between countries, etc.? If we can try and understand the global spread of these antimicrobial resistant bugs, we can try and track them, and stop them spreading.
Kat - How are you trying to do that? What are some of the tools that you’ve got?
David - Well, we try and look at the use of whole genome sequencing. So if you can sample a bunch of bacteria, and then you sequence their genomes, you can compare how similar the genomes are to each other. You can use this to depict the relationships between them as a family tree. If we look at how similar genomes are to each other, and whether those genomes are also resistant to particular antibiotics, we can then relate where those bacteria are or who those bacteria have been infecting, and whether we can use that information to understand who’s been spreading to who.
Kat - So, if you find related bacteria say, in Paris and in Cairo, that might tell you that either the bacteria spread from Cairo to Paris or Paris to Cairo, or something like that.
David - So these are some of the inferences that people might make. What we try and do is use what's known as bioinformatics, which is computational methods of looking at the sequence to compare how similar things are to each other. Once we’ve got these family trees, we can look at whether the more closely related isolates potentially come from either the same place or different locations. We can use that information to make inferences about whether it could be a transmission from one country to another or one locale to another. We could also look for the presence of genes or genomic signatures in the genomes. If those signatures are present, it gives us an indication that that strain might be resistant to a particular antibiotic or not. Being able to do this on a global scale means that we can try and look towards monitoring the emergence of antimicrobial resistance and its spread locally, nationally, and internationally. If we can do that then we can try and identify the emergence and then stop that spread.
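The two ideas David mentions, comparing how similar genomes are and checking for resistance signatures, can be sketched in a few lines of Python. This is a toy illustration only, not how Microreact or any real pipeline works: real analyses build full family trees from whole genomes using dedicated phylogenetic methods. The isolate names, the ten-letter sequences and the `GGAT` marker are all invented for the example.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_pair(isolates):
    """Return the two isolate names whose sequences differ at the fewest positions."""
    return min(
        combinations(sorted(isolates), 2),
        key=lambda pair: snp_distance(isolates[pair[0]], isolates[pair[1]]),
    )


def carries_marker(sequence, marker):
    """Report whether a (hypothetical) resistance signature appears in the sequence."""
    return marker in sequence


# Made-up isolates; real genomes are millions of letters long.
isolates = {
    "paris_1": "ACGTACGTAC",
    "cairo_1": "ACGTACGTAT",
    "lima_1": "TCGAACGGAT",
}

print(closest_pair(isolates))
print({name: carries_marker(seq, "GGAT") for name, seq in isolates.items()})
```

Here the Paris and Cairo isolates differ at a single position, so they come out as the closest pair, and only the third isolate carries the invented marker. As David says, a close match like this supports an inference about transmission but does not prove its direction.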
Kat - How can we gather this kind of data? How can we gather and collect bacteria around the world?
David - Hospitals. People coming in to hospitals, swabbing individuals. Most often, if you go to a hospital and you have a bacterial infection then there will be some method to identify what species it is. But you can also then do standard antimicrobial resistance tests. This involves giving the antibiotic to the bacteria and seeing how many of them are killed and that gives you an indication of whether you can use the antibiotic to treat a patient or not. If we use whole genome sequencing, then we get a readout from the sequencer of a string of letters – As, Cs, Ts, and Gs. This is digital information so it could be stored very easily in databases. Those databases can be made easily available via the internet which means that anybody in the world essentially can get access to that information almost in real time. And we can then build on top of that intuitive interpretation and visualisation methods. So you can build trees online. You can add genomes from different countries to the same database and then we can enable anybody in the world to view the data as soon as it is produced. So we try and build methods on top of genomic sequence data that enable the information to be democratised, to make it universally accessible and available to anybody to identify whether they have a genome similar to ones that have been seen before.
Kat - You could imagine someone in the World Health Organisation or here in our NHS or the CDC in America going, “I want to know how this particular bacteria is spreading or how the flu virus is spreading this year or Zika virus” and you’ve got the visualisation tools that they could do that?
David - That's right. That’s what we’re trying to produce. We’re trying to produce open access systems that enable the collation of sequence data from anywhere in the world and the availability of that sequence data to anybody with any expertise to try and understand what's going on with the data. So this is clearly applicable for example in the UK, Public Health England, to understand the spread and they're using genome sequencing currently to understand the spread of gastrointestinal infections and other bacteria. The CDC, to understand the spread within the US, and of course, up to the level of the WHO. Genomic epidemiology has been used for understanding the spread of Ebola and Zika viruses. There are moves and efforts to actually produce these kinds of systems for exactly those kind of bugs.
Kat - What about if people, citizens, the general public want to get involved, because we must be covered with all sorts of bacteria and going around places where there are bugs. Is there any way that people could gather bacteria and help with this effort of tracking?
David - So I think that would be lovely. What would be ideal, what would be fantastic, is if we could monitor what's going on in a healthy population. We all carry bacteria ourselves, in the gut or on our skin. A classic example is Staph aureus, which lots of us have on our skin and which doesn’t cause disease until it gets inside, into the blood for example. If we could monitor what's going on with the healthy population, we could potentially use that information to spot the emergence of things that are potentially of greater risk to the public health. So actually, being able to swab yourself, send that off somewhere, have the genome sequenced, and for those background data to be available to anybody in the world to contextualise new information, would be fantastic. What we need to do is to be giving genome sequencers away for free.
Kat - I love the idea of being a bacterial tourist – just go everywhere and swab while I'm away.
David - That would be pretty cool. Maybe not the best way to pitch a holiday to people, but that would be a wonderful thing to be able to do.
Kat - David Aanensen from the Wellcome Trust Sanger Institute. And if you have any germy genomes you’d like to analyse, you can have a play with Microreact at microreact.org.
15:13 - Genomics for India
with Sumit Jamuar, Global Gene Corp
Last month saw the launch of the first ever so-called beacon for genomics focusing on India. Joining many other beacons around the globe, mostly focused in more developed countries, the Indian beacon is an online portal allowing researchers all over the world to search for genetic variations specific to populations from the Indian subcontinent. Put simply, a beacon is a website through which scientists can ask institutions and organisations holding human genomic data whether they have any data with a particular DNA variation in a specific place. All the data is anonymised, but it helps researchers to identify whether that institution is holding genetic data that might be useful to share, to help discover more effective drugs or other healthcare interventions. The Indian beacon is being lit by genetic technology company Global Gene Corp and the Global Alliance for Genomics and Health (GA4GH). Kat Arney spoke to Global Gene Corp’s CEO, Sumit Jamuar, to discover why it’s so important to shift focus towards Indian genetics.
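To make the beacon idea concrete, here is a minimal Python sketch of the principle described above: a researcher asks whether a particular DNA variation has been seen at a specific place, and gets back only a yes or a no, never any individual's data. This is an in-memory toy, not the real GA4GH Beacon interface or the Indian beacon itself. The class names, fields and genomic coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeaconQuery:
    chromosome: str
    position: int
    alternate_base: str  # the DNA letter the researcher is asking about


class ToyBeacon:
    """Answers only yes/no, never returning individual-level records."""

    def __init__(self, observed_variants):
        # Hypothetical store of (chromosome, position, base) tuples held by an institution.
        self._observed = set(observed_variants)

    def has_variant(self, query):
        key = (query.chromosome, query.position, query.alternate_base)
        return key in self._observed


beacon = ToyBeacon({("1", 12345, "T"), ("7", 67890, "G")})
print(beacon.has_variant(BeaconQuery("1", 12345, "T")))  # True
print(beacon.has_variant(BeaconQuery("1", 12345, "C")))  # False
```

A "yes" tells the researcher that it may be worth approaching that institution about sharing data, which is the discovery role the beacon plays.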
Sumit - The world spends about $1 trillion on drugs every year. Of that, when you look at various studies, just over 40 per cent is deemed to be overall not effective. So what it means is we are spending just over $400 billion on drugs which do not have an effect. That creates a phenomenal amount of wastage out there. With technology, you can have genomic data, you can have data about every individual, and we can tailor the treatment to them or, in some cases, ward off certain risks in advance. It’s a phenomenal promise, and that's what is needed, because we will have to find the saving from some place and we will have to make things better. We realised that 60 per cent of the world’s population was contributing less than 5 per cent of genomic data. When we started out, that number was less than 1 per cent and that’s a staggering number. When we looked at a place like India, which is 1.3 billion people and has got 20 per cent of the world’s population, it contributed only about 0.2 per cent of genomic data. What we realised was, if you look at the power and possibility of genomics, particularly around precision medicine where you can change the health outcome for every individual and allow them to have not only a longer but a better quality life, that promise is incredible. What was lacking was genomic data to realise that promise, and that’s what we have set out to achieve.
Kat - Sumit Jamuar from Global Gene Corp.
17:48 - Genomes going global
with Ewan Birney, European Bioinformatics Institute and Global Alliance for Genomics and Health
With more than a billion people living in India, not to mention the diaspora around the world, this kind of joined-up global genetic effort is only going to get more important in the future. And as Kat Arney discovered, the potential benefits are huge, according to Ewan Birney, director of the European Bioinformatics Institute in Cambridge and chair of the Global Alliance for Genomics and Health.
Ewan - Well, it’s really exciting. One of the reasons why is because it’s become remarkably cheaper to sequence DNA and we can now use that in healthcare. Previously, we only did it for a few people in a research context. Now, people can be sequenced because they're suspected of a particular genetic disease. But what that also means is we’ve got to realise all the research benefits. We have to be able to share that data between our healthcare systems. Not just between researchers but between people in different countries with different healthcare, rules and systems. That’s where the Global Alliance for Genomics and Health really comes in. What the Global Alliance aims to do is set up web protocols like the internet for genomic data sharing, but responsible genomic data sharing where the data will usually reside inside of the country of residence under the legal protections in each country.
Kat - Are there issues with things like formatting? I mean, those of us will remember the problems with, “Oh, that’s on a Mac, that’s on a PC…oh, I can't transfer these files.” Are there those same problems there with DNA?
Ewan - There are. Those are problems which really are just about engineering, and removing them. They do get in the way but that’s not the big problem. The bigger problem is having systems that work at the scale – this is petabyte, exabyte scale – so this is a physical scale of data problem. But there's also the interactions with the legal models present in every different country. Every country respects – in my experience – the value of research and the importance of sharing to create better research on human health and they're positive about that. But when you're talking about whole populations, all of Denmark, all of the UK, all of France, you have to take a responsible attitude towards data sharing between these, and researchers will have to adapt to a system which still allows research but in a responsible way, moving their code to the data sites rather than sharing data.
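Ewan's point about moving code to the data, rather than moving the data, can be illustrated with a small Python sketch. Each site keeps its records private and runs the researcher's analysis locally, and only summary counts leave the site. This is a toy model under assumed names (`DataSite`, `count_carriers`, `pooled_frequency`) and made-up records, not a description of any system built by the Global Alliance.

```python
def count_carriers(records, variant):
    """Analysis code that runs inside a site, next to the data.

    Returns only two summary numbers: carriers of the variant and total people.
    """
    return sum(1 for genotypes in records if variant in genotypes), len(records)


class DataSite:
    """A national data store whose individual records never leave the site."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # stays private to the site

    def run(self, analysis, *args):
        # The researcher's code travels to the data; only its result travels back.
        return analysis(self._records, *args)


def pooled_frequency(sites, variant):
    """Combine per-site summaries into one overall carrier frequency."""
    carriers = total = 0
    for site in sites:
        site_carriers, site_total = site.run(count_carriers, variant)
        carriers += site_carriers
        total += site_total
    return carriers / total


# Made-up records: each person is represented by the set of variants they carry.
sites = [
    DataSite("denmark", [{"v1"}, {"v2"}, set(), {"v1", "v2"}]),
    DataSite("uk", [{"v1"}, set(), set(), set()]),
]
print(pooled_frequency(sites, "v1"))  # 0.375
```

The researcher learns that three of the eight people across both sites carry the variant, without ever seeing a single individual's record.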
Kat - Ewan Birney from the European Bioinformatics Institute.
20:25 - Social genes
with Amelie Baud, European Bioinformatics Institute
Regular Naked Genetics listeners should know by now that our genes help to determine our characteristics, and influence our health and even happiness - although of course they’re not the whole story and the environment plays a role too. But did you know that your partner or housemate’s genes might be having an impact as well? In January, Amelie Baud and her colleagues at the European Bioinformatics Institute published a study in the journal PLoS Genetics, showing that this is in fact the case - or at least it is for mice, as she told Kat Arney.
Amelie - We all know that people influence each other. There's nothing new here, but we’d like to know how this works. What we have found recently is that the genes of our partners influence us indirectly.
Kat - So, my partner’s genes, my boyfriend’s genes having an impact on me. What's going on here? How does this work? What did you do and what did you find?
Amelie - So, I’ll give you an example with my partner. Say for example that his genes tend to make him a very nice person who also cooks well and for example, smells good. This is going to help me for example on bad days or maybe generally going to make me happy. So indirectly, the genes of my partner affect my welfare and health. This example is a bit simplistic of course, but it really shows that it’s not only the behaviour of our partners that matters. There's many other ways in which our partners can influence us. Skills, for example, or physicochemical traits like smell or good looks.
Kat - Being big and cuddly?
Amelie - Exactly. There's really so many ways in which our partners can influence us. What's really amazing is that we can capture this, we can measure this, and hopefully, understand how this works just by measuring the genes of our partners.
Kat - So, this kind of boils down to the assumption that it’s our genes that affect who we are and how we come out, and what we’re like, and how we behave. But from what I understand, it’s not quite as simple as like, one gene, you come out like this.
Amelie - Of course not. It’s not that simple. So first of all, our genes only explain some part of how we behave, how we are, and so on. Our environment, and life experiences, and life habits really also play a major role here. So the contribution of our genetics is limited to begin with. And also, it’s not only one gene for the traits that we are looking at here. It’s more likely many genes – tens, maybe hundreds of genes, so it is complicated.
Kat - So how do you go about starting to unpick this? What are you looking at and how do you start measuring the influence of someone’s genes on their partner?
Amelie - So, I have to say that we haven't looked at people yet. So far, our research has been in laboratory mice. So in laboratory mice, you can group mice in a cage. So you choose which mouse interacts with which other mouse.
Kat - So like a kind of mouse flat share - a mouse share?
Amelie - Exactly. So first of all, you define their social partners, which are cage mates in our case, and then you measure a number of traits of interest. So we were interested for example in a number of behaviours including anxiety, mood, but also metabolic, immune traits, and also wound healing for example. So you measure those traits in the mice and then you also measure the genotypes or the genes of the mice. And then you simply look for an association between the trait of one mouse and the genes of its cage mates.
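The final step Amelie describes, looking for an association between one mouse's trait and its cage mates' genes, can be sketched as a simple correlation in Python. This is a deliberate simplification: the published study used statistical models to quantify the overall effect of cage mates' genetic makeup, not a single-marker correlation like this one. The genotype codes (0, 1 or 2 copies of a hypothetical variant) and the trait scores are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cage_mate_association(cages):
    """Correlate each mouse's trait with the mean genotype of its cage mates.

    cages: list of cages, each a list of (genotype, trait) tuples, one per mouse.
    """
    mate_genotypes, traits = [], []
    for cage in cages:
        for i, (_, trait) in enumerate(cage):
            # Use only the OTHER mice in the cage, never the mouse's own genotype.
            mates = [g for j, (g, _) in enumerate(cage) if j != i]
            mate_genotypes.append(sum(mates) / len(mates))
            traits.append(trait)
    return pearson(mate_genotypes, traits)


# Made-up data: three cages of two mice each, as (genotype, trait score).
cages = [
    [(0, 5.0), (2, 1.0)],
    [(1, 3.0), (1, 3.0)],
    [(2, 1.2), (0, 4.8)],
]
print(cage_mate_association(cages))
```

In this invented data set, mice whose cage mate carries more copies of the variant have higher trait scores, so the correlation comes out strongly positive. Real data would be far noisier and would need many more animals.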
Kat - Are you looking at specific genes or are you just saying, “Okay, this mouse has a genetic variation here and the mice that it’s caged with are more anxious.” Is it at that kind of level?
Amelie - We have quantified the overall effect of the genetic makeup of cage mates. We have not so far identified specific genes. Overall the genes of cage mates affect significantly and substantially a number of traits. And what we want to do next, as you hinted, is to identify specific genes. Because with specific genes, we can get clues on how things work, how partners influence each other.
Kat - Does this work with mice that are related to each other, sort of brothers, sisters, siblings? Or does it work with mice that are completely unrelated say, like flat sharers?
Amelie - We have evidence it works with both. In our published study, we had both unrelated and related mice. In both cases, we found that the genes of one mouse influenced the trait of another mouse.
Kat - Where does this go next? So you say you’ve done this in mice, do you want to look in humans? That’s going to be tricky.
Amelie - Obviously, you want to look at humans. It’s really important to know whether this phenomenon extends to humans. We think it will because first of all, it’s been observed in other species. Not only in mice. This is the first time [in mice] but also in cattle by animal breeders who have been working on this for a while. So, there's evidence that in different species these effects exist, so we think they might exist as well in humans. But obviously, we want to find out with experimental evidence. So, we are going to look at humans. Of course, we do not choose who is interacting with who so we first of all need to find who interacts with who. One simple way to do that for example is to look at people who live together. And then we can do a very similar analysis – the statistical models we use can be used for human populations as well and we definitely want to do that. Interpretation of the results is going to be more tricky in humans than it was in mice.
Kat - So we’re not going to have a genetic dating agency or a genetic flat share agency anytime soon?
Amelie - Not anytime soon, no. It’s really difficult to identify what people expect from their partners, what they really want from their partners and this is complex, unconscious, it changes over time. So it would be really dangerous I think to try to use genetics to tell people to hang out together. It’s really not so much about the genetics, the issue. It’s really about knowing exactly what we want, what we expect from our partners.
Kat - Amelie Baud, from the European Bioinformatics Institute.
27:10 - Gene of the Month - Shriveled
It’s time for our Gene of the Month, and this time it’s Shriveled. First described in a paper published in May 2016 by Karen Chang and her team at the University of Southern California in Los Angeles, it’s yet another of those fruit fly genes named after the appearance of unfortunate insects carrying a faulty version of the gene. In this case, male flies with faulty Shriveled are infertile, and also have testes that shrink and shrivel with age. The reason lies in the stem cells responsible for keeping them fuelled with sperm. Although a lot is known about the genes and molecules responsible for setting up these stem cells in the first place, much less is known about the way in which they’re maintained over time. The healthy version of Shriveled seems to play a vital part in that process, keeping male flies firing on all cylinders as they age.