Genetics reveals when populations mixed
When we’re trying to piece together events from thousands of years ago that led up to giant leaps like the inception of farming or the colonisation of a continent, we rely heavily on archaeology for clues. But thanks to a technique developed by Manjusha Chintalapati and her colleagues, genetics can also be brought to bear, enabling us to see very accurately when in history so-called “admixture” - the merging of one population into another - occurred. It can give us new insights into when postulated migrations, crucial to the evolution of the human race and of practices like agriculture, did or, as it turns out, did not happen…
Manjusha - When you have two populations that admix, the offspring has one chromosome from each parent, and these chromosomes get shuffled when the offspring gives rise to the next generation. So basically the chromosome fragments get shuffled every generation. You can actually use this shuffling as a kind of molecular clock: based on the length of the DNA fragments inherited from each of the parent populations, you can ask when the admixture occurred. If the fragments are longer, the admixture could have occurred just a few generations ago; if the fragments are much shorter, the admixture was many generations ago.
Chris - Very elegant idea. I presume, though, that this depends on knowing some sequences which are exclusive to each of the populations, so you can tell what's been added? Because if they're the same, you wouldn't have any kind of footsteps to follow, would you? So how do you know what is unique to each of the populations that have merged?
Manjusha - You do need information on both parent populations: basically, the genome you want to date admixture in, and also the populations you think that admixed genome has ancestry from. Take the example of African Americans. You need information on the source populations, which are the European populations and the African populations. And then you can ask, when did these source populations admix?
Chris - And are you considering the entire genetic code when you do this? Or do you just look for individual little bits and follow those almost like markers?
Manjusha - We actually follow the entire genome. So basically we process the entire genome of an admixed individual and ask, where are these blocks of DNA coming from? Let's say, European or African. And then we ask, how long are these fragments? Based on an estimate of how long these fragments are, we can go back and estimate the timing: because the segment lengths theoretically follow an exponential distribution, we can fit an exponential distribution to the segment lengths and then find out when the admixture actually occurred.
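The idea of fitting an exponential distribution to segment lengths can be illustrated with a toy sketch. It is not the authors' implementation. It assumes a single pulse of admixture and segment lengths measured in Morgans, so that lengths are roughly exponential with a rate equal to the number of generations since admixture. The function name and the simulated data are hypothetical.

```python
import random


def estimate_generations(segment_lengths_morgans):
    """Maximum-likelihood fit of an exponential distribution.

    Assumption (simplified model): ancestry segment lengths in Morgans are
    exponential with rate t, the number of generations since admixture.
    The maximum-likelihood estimate of the rate is 1 / (mean length).
    """
    mean_length = sum(segment_lengths_morgans) / len(segment_lengths_morgans)
    return 1.0 / mean_length


# Simulated toy data: 5,000 segments drawn for a true age of 100 generations.
rng = random.Random(0)
true_t = 100
segments = [rng.expovariate(true_t) for _ in range(5000)]

estimate = estimate_generations(segments)
print(f"estimated generations since admixture: {estimate:.1f}")
```

Shorter segments give a smaller mean length and so a larger estimated age, which is the molecular-clock behaviour described above.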
Chris - Does it not matter where these bits are on the chromosome in terms of how likely they are to get diluted and recombined and that kind of thing? Does your method take that into account?
Manjusha - Yes. That's a very good question. It matters because there are recombination hotspots and cold spots in every genome. So some of the fragments can get much shorter just because there is more recombination there, meaning there's more DNA breakage at that spot every generation. So we do use a recombination map. We have information on where this recombination occurs, and we account for that.
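One common way to account for hotspots and cold spots is to measure segments in genetic distance (centimorgans) rather than in base pairs, by interpolating positions on a recombination map. The sketch below shows that conversion. The map values are made up for illustration, the function names are hypothetical, and whether the authors' tool does exactly this is an assumption.

```python
from bisect import bisect_right


def genetic_position(bp, map_bp, map_cm):
    """Linearly interpolate a base-pair position onto a genetic map (cM)."""
    if bp <= map_bp[0]:
        return map_cm[0]
    if bp >= map_bp[-1]:
        return map_cm[-1]
    i = bisect_right(map_bp, bp)
    x0, x1 = map_bp[i - 1], map_bp[i]
    y0, y1 = map_cm[i - 1], map_cm[i]
    return y0 + (y1 - y0) * (bp - x0) / (x1 - x0)


def segment_length_cm(start_bp, end_bp, map_bp, map_cm):
    """Length of a segment in centimorgans rather than base pairs."""
    return (genetic_position(end_bp, map_bp, map_cm)
            - genetic_position(start_bp, map_bp, map_cm))


# Hypothetical toy map: a cold spot (0-1 Mb) followed by a hotspot (1-2 Mb).
MAP_BP = [0, 1_000_000, 2_000_000, 3_000_000]
MAP_CM = [0.0, 0.1, 3.1, 3.6]

cold = segment_length_cm(0, 1_000_000, MAP_BP, MAP_CM)
hot = segment_length_cm(1_000_000, 2_000_000, MAP_BP, MAP_CM)
print(f"cold-spot segment: {cold:.2f} cM, hotspot segment: {hot:.2f} cM")
```

Both toy segments span one million base pairs, but the one in the hotspot is far longer in genetic distance, so the same physical length carries different timing information.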
Chris - This is brilliant! But I'm frightened almost to ask, does it really work? And how do you know it works? Have we got a gold standard against which we can run this, so we can say, well, we know when this genuinely happened; now we're going to ask what your genome tool returns, and do the two agree?
Manjusha - Yeah, there are examples where some of the history is documented, right? So you can use them as positive controls, but a much better way of evaluating your method is always simulations. We actually simulated admixture between European and African ancestries at, let's say, a hundred generations ago. When we apply our method, we can date it back to 99 generations, with a standard error of, let's say, three. And this holds when we change the proportion of admixture, when we change the time of admixture, and when we change the sample size of the population. So we tested the method in several scenarios, and it works robustly in many of them.
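The simulation check described here can be sketched in miniature: pick a known admixture time, simulate data, and see whether the estimate lands within a few standard errors of the truth. This toy version reuses a simplified exponential segment-length model rather than the authors' simulations, and every name and parameter in it is an assumption made for illustration.

```python
import math
import random


def estimate_with_se(lengths_morgans):
    """Exponential rate estimate and its approximate standard error.

    For n exponential observations, the maximum-likelihood rate is
    n / sum(lengths), with an approximate standard error of rate / sqrt(n).
    """
    n = len(lengths_morgans)
    rate = n / sum(lengths_morgans)
    return rate, rate / math.sqrt(n)


def run_check(true_generations, n_segments, seed):
    """Simulate segments for a known admixture time and re-estimate it."""
    rng = random.Random(seed)
    lengths = [rng.expovariate(true_generations) for _ in range(n_segments)]
    return estimate_with_se(lengths)


# Vary the true admixture time, as in a robustness check.
results = {t: run_check(t, n_segments=2000, seed=t) for t in (10, 100, 200)}
for true_t, (est, se) in results.items():
    print(f"true = {true_t:3d} generations, estimate = {est:6.1f} +/- {se:.1f}")
```

The same loop could be extended to vary the number of segments, standing in for sample size, to see how the standard error grows as data become scarcer.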
Chris - It's almost like the genetic equivalent of carbon dating, isn't it, in the sense that you have a half-life and the radioactivity is falling by a predictable amount each time. But it gets less accurate the farther back you go, because the graph flattens out. Now, does your method fall prey to the same thing, where you've got a sweet spot, but once you start to get to a long, long, long time, the noise becomes more pronounced than the actual signal? What are those ranges?
Manjusha - Yeah, that's a very good question. You read my mind! When the fragments get very, very short, it's very hard to date back the time, just because there isn't any information, right? The resolution of our method is actually up to 300 generations. So if you approximate the generation time to be around 28 years, on average in humans, that is around 5,000 years. But the novelty in our method is that, because it can work with very degraded data, we can apply it to ancient samples - let's say one which lived 5,000 years ago - and the admixture in that sample could have occurred 5,000 years before it lived, so now we have the ability to look at admixture events which occurred, let's say, 10,000 years ago. That's the exciting part of our method.
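The ancient-sample arithmetic can be written out as a short sketch: an admixture date estimated in generations is converted to years and added to the age of the sample itself. The 28-year generation time comes from the interview. The function names are hypothetical, and the 5,000-year figures are the illustrative ones used above.

```python
def years_to_generations(years, years_per_generation=28.0):
    """Convert a time span in years to generations."""
    return years / years_per_generation


def admixture_age_years(sample_age_years, generations_before_sample,
                        years_per_generation=28.0):
    """Years before present at which the admixture happened.

    The estimate in generations is counted back from when the sampled
    individual lived, so the age of the sample is added on top.
    """
    return sample_age_years + generations_before_sample * years_per_generation


# A sample that lived 5,000 years ago, with admixture 5,000 years before that.
gens = years_to_generations(5000)        # a little under 179 generations
total = admixture_age_years(5000, gens)  # about 10,000 years before present
print(f"{gens:.0f} generations before the sample, "
      f"{total:.0f} years before present")
```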
Chris - Mm. I mean, it hasn't escaped my notice that this is mapping on beautifully to the times when ancient peoples were doing really interesting things and moving around a lot, embracing new ways of living, farming, agriculture and so on. Are you already probing some of those important questions around when people did things historically?
Manjusha - Yes, exactly. Just to give you some background, all present-day Europeans can be modelled as a mixture of three populations: hunter gatherers from the Mesolithic period, Anatolian farmers from the Neolithic period, and Steppe pastoralists from the Bronze Age. So we could actually date back the time of formation of the farmers and the Steppe pastoralists using our method, DATES.
Chris - We have vague ideas based on a range of measures for when those different things and those different populations and different groups arose. Do our current predictions - based on those proxy measures - do they agree with your method? Have we got it right? Or have you found, with this, that in fact there are some gaps?
Manjusha - We know that agriculture in Anatolia dates back to around 8,000 BCE, right? There's been a debate in the field, with some saying that agriculture probably reached Anatolia through the movement of Iranian farmers, Iran being where agriculture originally came from. But using our method, we could date the gene flow from Iranian farmers into Anatolia to around 10,000 BCE. That is way before what is documented, suggesting that it was probably not just the movement of the people themselves; rather, the techniques diffused over time. So it could have been that the local hunter gatherers transitioned to agricultural subsistence. Using our method, we could give a precise timing for the admixture between the two groups, which could shed some light on how farming actually reached Anatolia.