Big Data, Big Deal?

Big Data is everywhere but what is it, how is it used and how does it impact our privacy?
17 November 2015
Presented by Chris Smith, Connie Orbach




More pieces of data have been produced in the last five years than in all of human history put together before then. But what's driving this big data revolution? We'll discover what opportunities it opens up, and we'll uncover the pitfalls we might be facing. Plus, news that scientists have uncovered the first water on Earth, and we talk to the team who raced a solar-powered car 3,000 kilometres across Australia...

In this episode

00:54 - Radio-powered optogenetic brain implant

A tiny wireless brain implant powered by radio waves that uses light to control nerve cells has been developed by scientists in the US...

Radio-powered optogenetic brain implant
with Dr Robert Gereau, Washington University

A fingernail-sized wireless brain implant that is powered only by radio waves and can control nerve cells using pulses of light has been developed by researchers in the US. The device makes use of a technique called "optogenetics", where scientists first make nerve cells light-responsive by turning on a gene that produces a light-sensitive chemical. The implant communicates with these cells by producing light from an embedded LED. It is controlled and picks up energy using a tiny grid of wires that works like an antenna. It's the brainchild of Robert Gereau from Washington University in St Louis, and he was kind enough to talk Chris Smith through it...

Robert - Essentially, optogenetics is a technology that allows scientists to control the activity of neurons, the cells that mediate transmission in the brain, using light. Typically the way this is done is that something like a fibre-optic cable carries light from a laser to allow you to shine light into the brain, and for that to happen you have to insert the fibre-optic cable into the particular part of the brain. This has a couple of problems. One is that the fibre-optic insert itself is rigid and can cause damage when there is relative motion of the implant. And, of course, the tethering to the laser, the light source, restricts your ability to implement this in, say, the complex behavioural experiments that have been the standard of behavioural neuroscience for decades. So it's hard to apply this technology to the standard approaches in the field that would really enable great insight into complex behaviours.

Chris - So what was your approach instead?

Robert - So the approach that we've been taking is to take small electrical circuits that are flexible and bio-compatible and couple them with tiny light sources, something we call micro-LEDs, that can be integrated into these small flexible circuits. That technology has been advancing: there have been a couple of papers in the last couple of years on implanting micro-LEDs into the brain, injecting them into the brain, and integrating those LEDs with the circuit. But the remaining problem was powering them. There have been wired cables that power the LEDs, which gets you back to the same tethering problem we had before, or you can envision tiny batteries, or, in the case of what we've been working on here, wireless energy harvesting from an RF antenna.

Chris - They are basically just a grid of wires that can pick up fields, aren't they? So if you beam in radio waves, they'll pick those radio waves up and turn them into some electricity that the device can use.

Robert - Yes, exactly.  You design a small circuit so that it kind of matches the frequency of that energy, and harvest that energy to power this tiny little LED, and we have been able to incorporate them into materials that are stretchy and flexible and tiny, and that means that they can be completely implanted under the skin in the animal.  You can put these micro-LED devices with their antennas anywhere in the body and because they are stretchy and soft, they move with the body, so they kind of match the properties of the biological tissue.

Chris - If I had one in front of me, a) how big is it and b) what would it look like?

Robert - If you imagine holding out your pinkie finger and looking at the nail on your pinkie finger, the basic prototype device would fit on the nail with a bit of room to spare. They are sort of clear, looking like a gelatinous substance. They are made of a substance called PDMS. Medical devices are made of this; it's stretchy and soft and bio-compatible. Embedded in that are little gold traces, which are the metallic components of the energy-harvesting antenna, and then some tiny electronic components that basically amplify that energy and power the LED, which is small to the point of not being visible to the naked eye.

Chris - And you could lay one of these devices alongside, say, a nerve, along the spinal cord or onto the brain?

Robert - Yes, precisely.

Chris - The idea being you'd send the energy in with the radio waves, it would make the LED light up and put light of the right colour into the brain, and that would stimulate those nerve cells.

Robert - Yes, that's exactly right.

Chris - How do you know this is going to work?  What tests have you done to show, a) it's safe and b) that you actually can control nerve tissue with this?

Robert - Well, we have done long-term bio-compatibility studies in our animal models thus far, implanting them and leaving the animals for weeks and months, and then looking at the tissue for any signs of damage or inflammation. We have no indication that there are any adverse effects of implanting these devices over a peripheral nerve, or in the space above the spinal cord, for example, to manipulate those circuits. And in terms of knowing whether it will work, we've done some proof-of-concept studies to demonstrate that by implanting these over a peripheral nerve or above the spinal cord, we can very clearly manipulate the circuits that are involved in pain and pain relief, and are able to very robustly affect those behaviours in these animal models.

Chris - Could you use this therapeutically?  If you had a patient with a certain condition, could you apply this and use this to control their brain?

Robert - Well, that's certainly been my goal as we have been developing these, not only as research tools to enable us to understand the cells and circuits that mediate pain, and how to relieve that, but ultimately, down the road, we hope to be able to develop these in the direction of medical devices where we can use them to very precisely control cells that are involved in aberrant communication in the nervous system that mediate pathological conditions, and in my case, chronic pain is at the forefront.

06:29 - Earth's first water found

Water on the Earth is easy to spot, but there is also a lot beneath the surface, and some of this is very, very old.

Earth's first water found
with Dr Lydia Hallis, University of Glasgow

Compared with our planetary neighbours, the Earth is a very wet place. But what scientists are unclear about is how much of its water the Earth was born with when it formed, and how much water arrived afterwards aboard incoming comets and asteroids.

Now, Lydia Hallis, from the University of Glasgow, has found some of the first water that formed with the Earth, four and a half billion years ago. She knows that's what it is because the chemical composition of the water is different from what's in the oceans today; it's made up of a lighter form of hydrogen, and it shows that when the Earth formed, it came with a lot of water built in. Lydia Hallis explained to Chris Smith...

Lydia -  Up until, I would say, about 15 years ago, maybe less than that, we were looking at Earth as water rich, but we thought that most of Earth's water was present in the oceans and in the rivers and in the atmosphere, and what we're finding more and more is that, actually, when you start to look at the interior of the Earth, in the mantle, there are really a lot of places that you can store water, that you would never think you can retain water, but it turns out that if you pressurise really common minerals and silicates in the mantle, you can actually retain a lot of water in what is essentially rock. It may be that there's actually more water in the interior of Earth than there is on the surface.

Chris - So there could therefore be two things going on here? There could be water born with the Earth, which is inside, and then there could be this veneer of water on the outside, which is a mixture of what it was born with and what was delivered here afterwards?

Lydia - Yes, so what we were trying to figure out was whether the two are completely separate. So whether we could actually find a sample of rock on the Earth that really did represent this deep mantle, and whether there was a reservoir of water down in the deep mantle that was completely separate from surface water that we could analyse and that would help us to understand where the Earth's original water came from.

Chris - How can you get at rocks that date from the time when the Earth was formed though?  Do they exist?

Lydia - We don't find rocks on the surface that are from the beginning of the Earth's formation, no, because we have plate tectonics on Earth, so all of our crustal rocks eventually get recycled. So we are looking for a reservoir deep in the Earth that does represent the Earth's primordial water, its original water. The way that we get at that reservoir is to sample lavas that come from deep mantle plumes. I suppose the most famous area would be Hawaii, and the reason there are volcanoes there is because there's this huge plume of material that's thought to come from the deep mantle and erupts onto the surface. Therefore, we get rocks on the surface that are sourced in the deep mantle.

Chris - When you do this, and you look at the flavour of the water that is coming out of the rock and compare it with the water on the surface of the Earth, are they the same, or are they different?

Lydia -   They're actually quite different and it shows that, in the deep mantle, there are isolated sources that have water that has been unaffected by any processing on the surface of Earth.  So, it's been there since the formation of the Earth.  It's the original water that Earth was formed with.

Chris -   Therefore, when the Earth formed, there must have been a lot of water in the material that gave rise to the Earth for that water to be inside the planet like that?  Is that the implication of this finding?

Lydia -   Yes, especially because if you imagine when Earth was accreting or any planetary body is accreting from small dust particles into a planet sized body, you get a lot of heat in there from radioactive elements and, also, just the heat from friction in accretion, it causes the whole body to melt. And originally, the Earth would have been essentially a big ball of lava in space, really, really hot at the surface and so any water that would have been accreting in there, that was able to escape as a gas onto the surface of Earth would have escaped.  We would have lost a lot of water into space, so it shows that whatever was accreting must have been really, really rich in water for us to be able to retain the amount of water we have today having, we assume, lost a lot, a big percentage of that water.

Chris - Is this important because it begins to give us an insight into what was in that disc of material from which the rocky worlds, of which we are one, formed?

Lydia - Yes. I think it's really important, not only in terms of the Earth's formation, but also the other rocky bodies in the solar system, especially places like Mars where we are potentially interested in going to explore as humans. If we assume that Mars and Earth formed by a similar type of mechanism, you would expect that not only is Earth quite rich in water, but Mars has also got quite a lot of water in its interior, and it starts to pose the question of what happened to Mars' water. Is it still there? Is it retained as subsurface ice? These are things that we really don't know much about for any of the other planets at all. It just goes to show that we don't know that much about our own planet.

12:00 - Mythconceptions: Sharks don't get cancer

Kat Arney dispels the myth that sharks don't get cancer

Mythconceptions: Sharks don't get cancer
with Kat Arney

Kicking off the first in a new series of 'Mythconceptions', separating science fact from science fiction, Kat Arney takes a look at a persistent but very fishy tale from the underwater world - the claim that sharks don't get cancer...

Kat - The story starts back in the 1970s, based in legitimate scientific research into ways to stop cancers growing their own blood supply, a process known as angiogenesis, which provides growing tumours with a vital source of oxygen and nutrients.

US scientists searching for drugs that could do this noticed that implanting a small piece of cartilage - that's the rubbery stuff in between joints and on the end of some of your bones - could stop angiogenesis and halt tumours in their tracks. This isn't actually quite as weird as it sounds. Unlike bone, which is shot through with blood vessels, cartilage doesn't have any, so it stands to reason that it must be making some kind of molecule that stops blood vessel growth.

Given that sharks are cartilaginous fish, meaning their skeletons are made entirely of cartilage rather than bone, it was a fairly straightforward mental leap to assume that not only would sharks not get cancer, but their cartilage might also be able to cure it. Indeed, when other researchers tested shark cartilage in the lab, it strongly stopped tumours from growing new blood vessels and, as far as anyone could tell, wild sharks hardly ever seemed to get cancer, and experiments exposing sharks to a chemical that causes cancer in humans had no apparent ill effects. So it all fitted nicely together.

The next leap was the publication of a book in 1992 called "Sharks Don't Get Cancer", which gathered a huge amount of media attention. The man who wrote the book, a Dr William Lane, was convinced that taking shark cartilage pills could cure cancer, despite the lack of actual scientific evidence from patients to prove it, and off the back of that misguided idea, a multi-million dollar industry was born.

Sharks were caught, farmed and slaughtered in their thousands, and eventually in their millions to make ineffective shark cartilage pills that were bought, and taken, by desperate cancer patients.  To prove the point, there have been at least three clinical trials showing that shark cartilage tablets are completely ineffective against cancer, providing pretty conclusive evidence that it doesn't work.  And yet, shark pills are still on sale in health food shops and alternative medicine stores. 

Even worse, the whole idea that sharks don't get cancer is untrue. It's very hard to get accurate data on diseases in wild ocean animals like sharks, because those that die from cancer, or anything else, tend to sink to the bottom of the sea rather than handily presenting themselves for counting by scientists. Even so, marine biologists have found many examples of cancer in plenty of different species of sharks and, in 2013, scientists spotted a large tumour in possibly the most famous type of shark of all, the great white, living off the south Australian coast.

Although the idea that sharks don't get cancer, and that their cartilage can cure it, started off with its roots in proper research, it's ultimately led to the futile death of millions of these important and beautiful ocean dwellers. And even if it was the case that a molecule from cartilage could be effective against cancer, all the same stuff is found in the cartilage of other animal parts, such as pigs' ears, and pigs certainly aren't endangered in the same way that many shark species are. It's time to sink this myth to the bottom of the briny and leave it there.

15:49 - Solar Powered Car Race

In the summer we caught up with the Cambridge Eco Car racing team. This week we got back in touch to see how they did in Australia.

Solar Powered Car Race
with Graeham Douglas, Simon Schofield, University of Cambridge

Cambridge University's Eco Car team have just landed back in the UK after taking part in the World Solar Challenge. This is a gruelling 3,000 kilometre race across the Australian outback. Graihagh Jackson found out that interviewing a solar car racing team doesn't guarantee you a suntan...

Graeham - The race is the World Solar Challenge. Every two years, teams from all over the world go to Australia and race 3,000 kilometres from Darwin in the north to Adelaide in the south. There are about 40 teams that start at the beginning, usually between a quarter and a third finish at the end, and that takes about four days.

Graihagh - Entirely on solar power, so surely that gives you a number of obstacles to overcome from the get-go?

Graeham - Yes, absolutely. Our solar array gives you less power than you get in a hairdryer and we use it to propel a car with a driver in it, at highway speeds, so there are a lot of challenges on the engineering side.

Graihagh -   Powering an electric car on the same electrical output as a hairdryer.  How on Earth does this work?  Back on the racecourse I caught up with Technical Manager, Simon Schofield.

Simon - One of the things that Cambridge University is very good at is being very innovative in the way that we design our cars.  Most of the other teams have a very different design.  It looks more like what we call a table top design, which means it's a large, flat set of panels which face the sky and then there's a pod that the driver sits in in the middle of the car.  So our decision to go for something that's much more aerodynamic but has a smaller solar array is something that no other team has tried up to this point.  

Graihagh - The solar panels, to me, aren't very big. I'm thinking about the solar panels you get next to a swimming pool. There are loads of them and, even then... You've got maybe two metres by half a metre of solar panel there. How are they so efficient?

Graeham - So the solar cells are gallium arsenide cells. Gallium arsenide is a very good semiconductor to use for solar cells because it's a direct bandgap semiconductor. That means that rather than needing phonons as well as photons to absorb light, you only need the photons, and that means that more of the photons that hit the solar cells will actually be converted into electrons. That turns into energy, which allows you to power a battery.

Graihagh -   And I notice you've got a covering. Does that not block some of the photons getting through?

Graeham - So the canopy is about 95% transmissive, which does mean we lose about 5% but the benefit of having it there is that it keeps the aerodynamics of the car extremely good.

Graihagh - I was amazed to see that, despite the rain, Amy managed to take off around the track for a test, although only at about 20 kilometres per hour. But come race day, the team hoped Evolution would be racing at speeds of 110 kilometres per hour. Three months after their little demo around the running track, the team set off for Australia ready for race day but it was a bit of a bumpy ride, as Graeham Douglas told me on his return to the U.K. this week.

Graeham - While we were testing about four days before the race, we had a really big problem with our motor. The inside of it melted while the car was going about 95 kilometres an hour on the testing track. The car swerved a bit and the driver managed to hold it together, but it was quite a scary moment. It ended up that one of the teams had a motor, from a few years ago, that had the exact spare part that we needed to fix our motor. We put it together overnight in a really epic effort, with so many of the team coming together and making it work.

Graihagh -   It sounds like nail biting stuff.

Graihagh -   How was the race because it took six days, didn't it?  How was that?

Graeham - There were ups and downs, definitely. The biggest problem we had was our battery coming close to overheating. We use lithium-based batteries and, as many people know, there have been some famous fires with lithium batteries, so we keep ours well within its operating limit of 55 degrees Celsius. We got up to about fifty three and a half, and when we got that high, our electrical guys notified the rest of the team and decided we had to stop and cool the battery down.

Graihagh -   I mean the battery is one thing but what about the poor person driving it.  That fifty three and a half degrees is boiling.

Graeham - Yes, we have some really tough drivers on our team, as all solar car teams do. Our cars are built for speed, and comfort comes into play but it's a bit down the line.

Graihagh - Wow. Scintillating stuff. Those were the lowlights but, I know, on day six you had a triumph.

Graeham - Yes, day six was the last day of the race and we were coming into the last 250 kilometres into Adelaide, to the finish line. We ended up travelling quite a bit faster than we were hoping to. We had projected driving in at about 50 kilometres an hour and, with the power we had in our battery from that extra charge in the morning, we were getting up to about 65.

Graihagh - And where did you come?

Graeham - We came about twentieth out of the challenger class. We were only allowed to drive between about 9am and 5pm, but our end time was just over 53 hours.

Graihagh - And how did that compare with the winning team?

Graeham - The winning team was a fair bit quicker.  They finished in 37 hours and 56 minutes.  So, we've got our work cut out for us for next time.
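
As a rough check on the gap between those two finishing times, the figures quoted can be turned into average speeds. This is only a sketch: it assumes the nominal 3,000 kilometre route length, treats Cambridge's "just over 53 hours" as exactly 53, and the function name is made up for illustration.

```python
# Nominal route length quoted in the programme (an assumption for this sketch).
RACE_DISTANCE_KM = 3000


def average_speed_kmh(distance_km: float, hours: float, minutes: float = 0) -> float:
    """Average speed over the whole timed distance, ignoring stops."""
    total_hours = hours + minutes / 60
    return distance_km / total_hours


# "just over 53 hours" is rounded down to 53 here, so this slightly overstates the speed.
cambridge_kmh = average_speed_kmh(RACE_DISTANCE_KM, 53)
# The winning time quoted was 37 hours and 56 minutes.
winner_kmh = average_speed_kmh(RACE_DISTANCE_KM, 37, 56)

print(f"Cambridge average: {cambridge_kmh:.1f} km/h")
print(f"Winner average:    {winner_kmh:.1f} km/h")
```

On these assumptions the averages come out at roughly 57 kilometres per hour for Cambridge and roughly 79 for the winners.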

Graihagh - Something Simon talked about previously was that the table top design is what most other cars go for, and you went for something a bit different. So was there something fundamental about the other cars' designs?

Graeham - As Simon said, the Cambridge philosophy is building a smaller car that's based on aerodynamics. Nuon, the team that won, takes the classic design; theirs is a table top design, so the flat top with four wheels coming out the sides. They've just done a very good execution. They've got very good aerodynamics, and they clearly spend a lot of time in the wind tunnel and doing computer simulations of their design.

Graihagh - Either way, it's a spectacular achievement and you must be so proud of your own and all your team's efforts.

Graeham - Yes, it was really great. I mean, there was a moment just walking across the finish line with the team and the car. They have a moment for all the teams. The car stops just before the finish line, all the team comes up, and they all walk across the line together, which is really a powerful moment and very emotional for everyone because you know how much work you've put into it, and how much work everyone on the team in the past has put into the project, and that builds every year and gets us where we are now.

22:34 - What is big data?

What is big data and why does Lord David Willetts think it is an area in which Britain excels?

What is big data?
with Lord David Willetts

This week, we're talking about big data. But what actually is big data? In 2013, when he was the UK's science minister, Lord David Willetts hailed it as one of his 8 great technologies - industries in which the UK excels. He explains why.

David -   Big data is using the power of high performance computing, plus very smart software to detect patterns in large data sets that aren't always apparent using conventional statistical techniques.  More data has been generated in the past five years than in all previous human history. I identified it as one of the eight great technologies because Britain has got some distinctive strengths.  The fact is we've got some pretty powerful computers, but not absolutely top five computers. 

The real way to get value from these computers is to write smart software that gets more results with fewer calculations, and we do write very good software. 

We're good at extracting significance from computers even if they're not always the most powerful. 

And then what really matters for us in Britain, is as well as all the contemporary flow of data, we've also in Britain got some really long historic data sets. 

Reliable records have been kept for much longer periods than in most other countries, for example measuring climate change by tracking weather records kept by vicars in their parsonages. We have got some amazing resources like that, which do reveal patterns that otherwise would not have been spotted.

24:06 - Big data: what is it good for?

What can big data do and how has the cloud enabled the big data revolution?

Big data: what is it good for?
with Sue Daley, TechUK

With us now is Sue Daley - head of big data, cloud and mobile at TechUK, which is an organisation representing over eight hundred and fifty technology companies. Sue explains to Chris Smith the worldwide nature of big data.

Sue - I think the UK's in a great world-leading position in terms of what we're doing with big data so far but, you are right, it is global. Big data, or data as we know it, is the digital currency that powers our digital economy all the way round the world, and we can see examples of big data in action wherever you may be. For example, if it's finding your name on that fizzy coke or drinks bottle, or turning on your digital TV and finding that movie or that box set you've been talking about or hearing about all day long. The CERN hadron collider, for example, is using big data technologies and tools to find the answers to our universe so, yes, it is a global phenomenon.

Chris - David Willetts said famously there that five years of data adds up to more than all of mankind's endeavours previous to that time. What's driving that?

Sue - Well, it's true. It's estimated that 90% of the world's data today was created in the last two years alone. But I would say it's not just the volume of the data that's really important, or what's different perhaps now. It's the coming together of different types of data, all in real time, and that's really the significant difference.

Chris - And technology?

Sue - And also having the advanced technological tools and solutions that the technology industry is developing and making available, so that organisations can find those insights, that knowledge, that needle in the data haystack that enables them to use data more creatively. But that wouldn't be possible either without cloud computing, which enables all that data to be stored and processed and managed.

Chris - You'd better explain what the cloud computing concept is and what that means.

Sue - So cloud computing is a term for the ability of organisations to gain access to really complex, high powered computing resources, on demand, 24 by 7, delivered direct to your mobile device or you at home or at work.

Chris - I suppose one of the ways of looking at this is to say, well, I could have a very powerful computer at home, and I could install loads of software on it to use that software once a week to work out my tax return, or something. On the other hand, I could rely on a very good computer to run that same piece of software somewhere in the middle of nowhere, let it grind through that data and then return to me the results that I need to put into my tax return, without me actually having to do any of that processing.

Sue - Yes. I think it's really important to remember that big data technologies and tools are not just for the big, huge companies or multinational organisations, and the cloud is really enabling us all to take advantage of the big data revolution.

Chris - Can you give us some examples of how industry and academia, perhaps, are using this sort of technology and using this resource which has now come of age for all of us?

Sue - There are some great examples of organisations that are using cloud computing. For example, you might be listening to this in a room that you booked through Airbnb. That's using cloud computing. You might be listening to this in a car that you ordered over Uber. These, again, are examples of organisations that are using the agility and flexibility of cloud computing: the ability to scale up their IT resources and scale down their IT resources, as and when they need to.

Chris - I was talking the other day to someone who owns a major supermarket chain in another country and, interestingly, he said to me that, having introduced a customer loyalty card, they now know who shops where, they know what they buy, and they know what volumes of things they buy and when. They're even talking about knowing when someone moves to another city temporarily or goes on holiday, and they've got techniques they're developing to ping them a message and lure them into the shop locally. So that, even though they're on holiday, they still nonetheless do their weekly shop with that company.

Sue - Yes.  So I think the way the companies or organisations are using data, so using big data tools and techniques to gain that insight, that knowledge from structured and unstructured data coming together, that can then be turned into a value. So that might be reducing costs, it might be driving efficiency but, more importantly, what's the value to us in terms of systems and, as you say, when you are away on holiday you can still get goods and products and services that are tailored to your likes and dislikes and meet your needs.  It's helping to make all our digital lives a little bit easier.

Chris - Where do you think this is going next? Where's the next big thing, or what is going to be the real outcome from big data from an industrial and corporate point of view?

Sue - When I look to the future, it's a really exciting time. We've got the coming of age of the big data revolution but, combined with that, as we've just talked about, the power of cloud computing now. Bring that together with the emergence and the rise of the Internet of Things, so the internet moving away from computers and into everyday devices, whether that's wearables, your smart oven or your smart heaters at home. I think it's a really exciting time and I think big data's at the heart of the U.K.'s digital future.

29:32 - Big data just got bigger

The Square Kilometre Array shows just how big data can get and the challenges that come with it.

Big data just got bigger
with Peter Quinn, University of Western Australia

So far we've heard what big data is and how we can use it, but what happens when it gets REALLY big? Peter Quinn is the director of the International Centre for Radio Astronomy Research in Perth, Western Australia. He's part of a project that is developing a telescope so powerful that it will generate more data in a day than the entire world will pump out during 2020, when it switches on, as he explained to Connie Orbach...

Peter - The Square Kilometre Array is a radio telescope, so it looks at the sky in frequencies that correspond to radio waves, the sort of waves that radio stations or TV stations use. The thing about the SKA is that it's a transformational change in our ability to collect data from the sky. If you look over the history of astronomy for, say, the last 400 years, we started with the naked eye, then we went to Galileo's telescope, then to slightly larger telescopes, space telescopes and so on. Every time we have built one of those telescopes, about every 20 years or so, the telescope has been roughly twice as good as the one we built before. With the SKA, we are building in one generation a telescope which is 10,000 times better. This is something that hasn't happened before, and it's got all the challenges that go along with things that haven't happened before.
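
To put that jump in perspective, the two figures Peter quotes can be combined into a back-of-the-envelope estimate of how long a 10,000-fold improvement would take at the usual pace. This is a sketch only: it assumes a clean doubling every 20 years, which is a simplification of what is said in the interview, and the variable names are illustrative.

```python
import math

IMPROVEMENT_FACTOR = 10_000  # "10,000 times better", as quoted
YEARS_PER_DOUBLING = 20      # "about every 20 years ... twice as good" (assumed exact here)

# Number of successive doublings needed to reach the quoted improvement.
doublings_needed = math.log2(IMPROVEMENT_FACTOR)
# Time that many doublings would take at one doubling per 20 years.
years_at_usual_pace = doublings_needed * YEARS_PER_DOUBLING

print(f"Doublings needed: {doublings_needed:.1f}")
print(f"Years at the usual pace: {years_at_usual_pace:.0f}")
```

Under that assumption, the SKA squeezes a little over 13 doublings, or well over 250 years of telescope progress, into a single generation.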

Connie - And what is it you're looking for?

Peter - Essentially where the edge of the observable universe is.  The universe began, we think as a big bang, full of glowing gas.  All hot glowing gases when they expand cool and eventually they cool to the point where things start to condense out, just like water drops condensing out of steam.  Those first things that condensed out in the universe, they were the first sources of light, the first things to shine.  This happened probably less than a billion years after the beginning of the universe, but we don't quite know where.  Once we find that point in the cosmic history, we know what the seeds, if you like, were of all the other things that have happened in the universe.  So it's an incredibly significant quest.

Connie - So how much is this going to cost?

Peter - We expect the first phase of construction to cost 650 million Euros, and that's contributed by the eleven member countries in the SKA consortium.  The second phase of construction will probably cost several times that much again.

Connie - Wow.  This is a huge telescope.  I am assuming it's going to be collecting a lot of data.  How much are we talking about?

Peter - On a typical day of operation, it'll produce a stream of science data, that's data that's ready to do science with, of around about one exabyte.  Now an exabyte is a one with 18 zeros after it, so that's a billion gigabytes.  It's basically the kind of data volume that we would expect  the whole planet to produce in a year.
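To put that figure in perspective, here is a minimal sketch of the unit arithmetic. It assumes decimal (SI) units, as in the interview, and an even spread of data across a 24-hour day; the function names are illustrative only.

```python
# Decimal (SI) units, as in the interview: 1 exabyte = 10**18 bytes.
EXABYTE = 10**18
GIGABYTE = 10**9
SECONDS_PER_DAY = 24 * 60 * 60


def exabytes_to_gigabytes(exabytes):
    """Convert a whole number of exabytes to gigabytes."""
    return exabytes * EXABYTE // GIGABYTE


def sustained_rate(bytes_per_day):
    """Average bytes per second if the data arrives evenly over one day."""
    return bytes_per_day / SECONDS_PER_DAY


gigabytes = exabytes_to_gigabytes(1)
rate = sustained_rate(EXABYTE)

print(f"1 EB = {gigabytes:,} GB")
print(f"1 EB per day is roughly {rate / 1e12:.1f} TB per second")
```

On these assumptions, one exabyte a day works out at a little under twelve terabytes every second, around the clock.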

Connie - What are the main challenges with this level of data?

Peter - The main challenges are really the technology, how the computer is put together, whether it has the right ratio of input/output to storage to processing.  We need a particular ratio for the kinds of data that we're using.  Whether it has the right algorithm inside it, so an algorithm that can scale from where we are today up to the SKA scale.  Some algorithms will just break, some systems, some pieces of mathematics that we have used for many, many years just won't work anymore when we go up to the SKA scale.  So we more or less have to start again and figure out the maths, and figure out the algorithms to go and analyse this data and that's already started.  There's cost, and this is a real serious one, when we build these special computing environments.  I mean they are very big.  I mean we are talking about petaflops of processing power and petabytes of storage, they are very expensive things.  Can they be afforded by research projects?  So we need cost effective computing, and maybe the way we do that is not perhaps by owning all the computing we would like to have but maybe using some of the new technologies like cloud computing.  Basically we just grab a piece of computing and use it for a little while, then give it back if you like because it's computing on demand, we don't have to own big computer centres and own big supercomputers for a long period of time. And then people, I really worry a lot about the training of the people to actually do the data analysis for things like SKA.  It's a bit troubling because we are not training enough people to do this stuff, so it's a long term future issue.

Connie - Once you've got all this data, where are you going to put it?  Will you be able to store it all?

Peter - The answer is no.  When we do radio astronomy we take different kinds of data at different kinds of points in the observing process so when we first look at the sky, we collect what we call raw data.  This is the stuff that's not yet an image.  That raw data is probably between ten and a hundred times bigger than the exabyte of science data, so there's no way we could afford to do that storage and so what we do is we try to keep the data on the move.  We take it straight from the telescope and pipe it right into the back of the supercomputer to the processing, and that processing reduces the data volume, called data reduction.  So, yes, we can't store that raw data but we certainly want to store the final scientific images that the astronomers are going to use.  Astronomers tend to keep data for a very long time.  If you look in observatories around the world you can probably find photographic images of the sky that were taken more than a hundred years ago.  The reason why we keep those things is that sometimes we'll discover a new asteroid in the sky, or a new planet, we can go back and look at where it was a hundred years ago and figure out its orbit.  So this historical data is very important and keeping it there and keeping it available for people to do other things with it, other than the original purpose of the data, multiplies up its usefulness.  That data curation problem for data sets of the size of the SKA is a very serious one and we're going to have to pool our resources globally to figure out how to do that.
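The idea of keeping data "on the move" can be sketched in a few lines. This is only a toy illustration, not the SKA pipeline: the simulated telescope, the chunk sizes and the choice of plain averaging as the reduction step are all assumptions made for the example. The point is that each raw chunk is processed as it arrives and then discarded, so only the much smaller reduced output is ever kept.

```python
import random


def raw_chunks(n_chunks, chunk_size, seed=0):
    """Hypothetical stand-in for the telescope: yields raw samples one chunk at a time."""
    rng = random.Random(seed)
    for _ in range(n_chunks):
        yield [rng.gauss(0.0, 1.0) for _ in range(chunk_size)]


def reduce_stream(chunks):
    """Keep one averaged value per chunk; the raw chunk is dropped once it has been used."""
    for chunk in chunks:
        yield sum(chunk) / len(chunk)


n_chunks, chunk_size = 100, 1000
reduced = list(reduce_stream(raw_chunks(n_chunks, chunk_size)))

raw_count = n_chunks * chunk_size
reduction_factor = raw_count // len(reduced)
print(f"raw samples seen: {raw_count}, values kept: {len(reduced)}")
print(f"reduction factor: {reduction_factor}x")
```

Because both functions are generators, at most one raw chunk is held in memory at any moment, however long the stream runs.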

35:14 - Using DNA to store big data

When we save something on the cloud where is it actually kept and how much energy is needed to keep it there?

Using DNA to store big data
with Nick Goldman, European Bioinformatics Institute

The genetic information kept at the European Bioinformatics Institute - or EBI - runs to six quadrillion bytes of data; for the geeks among you, that's six thousand million megabytes. Trying to solve this problem led researcher Nick Goldman to think of a novel solution that stores thirty thousand times more data per gram than conventional methods like hard discs - he's using DNA! Rosalind Davis went to find out how he does this, starting with a look at what's inside the Institute's current data centre...

Nick - It looks largely like an enormous number of quite compact computers all lined up in big racks, one above the other.  Other data centres will use different systems depending on what the demand for their data is.  So, for example, the CERN data system is very interesting.  They use a combination of hard disks and magnetic tapes.  The new information that's exciting for the scientists is kept on hard disks, after a while they move it to tape.  Ours is all disk.

Rosalind - Can we go in?

Nick - Yes, absolutely.  Follow me.

Rosalind - Oh Wow, it's getting loud now.  Okay, so we're inside the data storage centre.  What have we got in the different racks?

Nick - There's a variety of machines of different ages, different size disks.

Rosalind -   But all the wiring going up to the ceiling where there are a lot of fans.  It's a bit too noisy in here Nick, so I think we'll go back outside to continue the conversation.

Nick - Okay.  So most of the noise you hear is air conditioning fans.  There's a cool system where they blow cold air in down every other aisle and they suck the warmed air up in the intermediate aisle.  So if you walk up and down the aisles, there's a cold aisle, a warm aisle, cold aisle, a warm aisle, and they put the computers facing back to back so that the air goes in at the front and always out at the back of each machine.

Rosalind -   With these disks, how long do they last?

Nick - A typical data centre policy would be a three year maximum lifetime for a disk.  After that amount of time you don't trust it any more so, even if it hasn't gone wrong yet, you'll be expecting to replace it.

Rosalind - Oh wow.  That's quite often.  So have you got backups of all this data?

Nick - Yes.  Modern disk systems are sort of automatically self-backing up, so each disk is being partly used for data and partly used for the backup of another disk, and all the information is shared across many disks.  So in everyday use, if one disk goes wrong, there's no real impact on the system.  A little light comes on somewhere and they swap that disk out and put a new one in.  So to some extent this renewal is always going on but that doesn't reflect change in technologies so well, and so on a three or four year cycle they'll be completely replacing everything.
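One common way of letting a disk system survive the loss of a single disk is XOR parity. The sketch below illustrates that general technique; it is not a description of the EBI's actual storage system, and the three "disks" and their contents are invented for the example. A parity block is computed from the data blocks, and any one lost block can be rebuilt from the survivors plus the parity.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


# Three hypothetical data disks holding equal-sized blocks.
data_disks = [b"GENOME-A", b"GENOME-B", b"GENOME-C"]

# The parity block is stored on a further disk.
parity = xor_blocks(data_disks)

# Simulate disk 1 failing, then rebuild its contents from the rest.
failed = 1
survivors = [block for i, block in enumerate(data_disks) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data_disks[failed])
```

The rebuild works because XOR-ing a value in twice cancels it out, so combining the surviving blocks with the parity leaves only the missing block.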

Rosalind - What's the kind of financial and, I guess, the environmental carbon cost of running a centre like this?

Nick - Financially, one of the biggest budget items for the EBI each year, is the cost of the computing equipment and the disks.  That runs into millions of pounds a year. And the cost of doing the air conditioning on a data centre is about the same as the cost of hardware.  So, it's a very large amount of money and you can imagine yourself what the environmental impact of using that much energy would be.

Rosalind - You've looked into a novel way of storing data to avoid this problem?

Nick -   Yes, so inspired by some of the issues we had with scaling up our genome data storage facility, we were joking one day about any other way there would be for storing information that wouldn't be so costly, and realised that the DNA itself is a fantastic medium for storing digital information.

Rosalind - So you're actually storing digital data from computers and things back on to DNA?

Nick - That's right.  We devised an experiment to show that this was possible on a reasonably large scale.

Rosalind - I can imagine this is quite a complicated process. Can we go to the lab and have a look at how it works?

Nick - Yes, let's do that.

Rosalind - After entering the lab and putting on a disposable lab coat, I sat down with Nick next to a fridge full of test tubes to find out how he stores digital data on DNA.

Nick - We invented some algorithms and some codes which would start with a file on a computer, which essentially is zeros and ones and would convert that to a format that looks like fragments of DNA, letters A, C, G and T. And when we've made the designs for different fragments of DNA, we give those to a company, they're called Agilent, and they have the technology to make those fragments of DNA in large numbers, and large quantities of each fragment in their laboratories there, and they send them to us in test tubes ready for us to handle in the lab.
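The step from zeros and ones to the letters A, C, G and T can be illustrated with a deliberately simple sketch. This is not the EBI team's code: it assumes a naive mapping of two bits per letter (00 to A, 01 to C, 10 to G, 11 to T), chosen only for clarity, and it leaves out practical concerns such as error correction and splitting the data into short fragments.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four DNA letters, two bits per letter."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def dna_to_bytes(sequence: str) -> bytes:
    """Decode a sequence produced by bytes_to_dna back into bytes."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for letter in sequence[i:i + 4]:
            byte = (byte << 2) | BASES.index(letter)
        out.append(byte)
    return bytes(out)


encoded = bytes_to_dna(b"Hi")
decoded = dna_to_bytes(encoded)
print(encoded)  # CAGACGGC
print(decoded)  # b'Hi'
```

Each byte becomes four letters, so the two-byte message "Hi" becomes an eight-letter sequence that could, in principle, be handed to a synthesis company as a design.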

Rosalind - They almost look empty, but Nick you're telling me there's something in these vials?

Nick - There's a tiny drop of liquid somewhere in there, which is DNA in solution.

Rosalind - How much data can you put on DNA at the moment?

Nick - DNA is really, really tiny.  It's sort of unthinkably small.  In our experiments using a few megabytes of computer information, the actual quantity of DNA is essentially invisible.  We've calculated if you were to use the same system to record all the information currently held on computers in the whole world, it would be about one or two metres cubed.

Rosalind - Wow that's tiny.  Do you get somebody else to make the DNA for you? Is it a really difficult process?

Nick - At the moment, the system they use is a bit like an inkjet printer, but it's more complicated and requires very high precision, and it's currently done in clean rooms in a dedicated laboratory.  It's a process that's getting increasingly important in biomedical research, to have DNA made to designs the scientists want.  So we are optimistic that that will get quicker and easier and cheaper, but at the moment it's still quite a specialised process.

Rosalind -   Once you've got the data and it's in the test tube.  How do you read it?

Nick - So we designed the whole system so it would fit right in with the standard technologies that are currently used for genome sequencing in biology and health care experiments.

Rosalind - What would you see the applications for this kind of storage being?

Nick - Well the first applications would be ones where people are prepared to spend a large amount of money.  So that will be high value information, things that are culturally important or politically important.  DNA will last hundreds or thousands of years without any intervention, so long as you keep it cool and dark.  Genome scientists working in evolution have extracted DNA successfully from horses that died 700,000 years ago, and there's been some damage but they've been able to recover essentially the whole genome sequence, so we know DNA will last that long.  That wasn't even a controlled experiment, that was just a dead horse. So we are thinking about applications that would be the long term archiving of high value information.

42:19 - How private is personal data?

How much of our personal data are we sharing and should we be doing something to keep it more private?

How private is personal data?
with Timandra Harkness

When it comes to personal data, storage raises a different kind of challenge. Increasingly, and because of the way the Internet works, our personal information is being collected, processed and held by companies based in many countries. But the laws governing how our data are handled in different countries aren't the same, and knowing who can and can't access that information becomes very confusing. Timandra Harkness is a journalist specialising in big data. She starts by explaining the extent of the problem with personal data to Chris Smith.

Timandra - Well I think I'm going to refer back to what Sue Daley said: "it's not just about the volume of data, it's actually about the fact you can link different things together", and I think this is a large part of the problem.  Companies, or any kind of organisation that wants to know stuff about us as individuals (it could be charities, it could be a government agency), have access to a lot of different sources of data about us.  So, they can put together something that we've voluntarily given them.  We might have registered on their site or ordered something online.  They can put that together with maybe our social media posts or information they've got from our credit card records, perhaps.  It's very hard to say how much any individual company or organisation will know about you.

Chris - What about the issue that worries some people though, which is, they give this data to a company.  If you give a company in the U.K. something, then you know that they are bound by U.K. legislation but people are finding that they give data to a company, an online presence, but because they are based in another country they find they are actually subject to the laws of that country, not our own country.

Timandra - Well that is true and obviously, a lot of companies that we routinely use, maybe as our web providers, or to do web searches, are actually based in America or elsewhere and so we don't even know what laws they are bound by, but I would say the problem starts a bit further back. I mean, how many of us actually read the conditions before we hand over our data.  We tend to be rather blindly trusting.

Chris - Stay there, we'll come back to you in just a second.  But let's put what you were saying to the test and find out how easy it is for information held by companies and other organisations to fall into the wrong hands. Rosalind Davis went to an interactive art exhibition showcasing this called "Data Shadow".

Rosalind - I'm outside a blue shipping container in the middle of Cambridge as part of the Festival of Ideas.  It's an art installation by Mark Farid who joins me now to explain what it's all about.

Mark -   So it's essentially about data privacy, but for me, the lack of ownership you have over your own data.

Rosalind - Can you explain the process?  What are you actually doing here?

Mark -   You join our Wifi, and then a captive portal pops up which asks you to enter your iCloud details to verify that you are the owner of the phone and at that point we take all your information from your mobile phone.

Rosalind - What's different to this than using a normal wifi hotspot?

Mark - Essentially nothing.  This is a normal wifi hotspot.  We've just rigged it so that we can do this but the piece of equipment we've rigged it with cost £100 off the internet that anyone can buy.

Rosalind - What you're doing here today, accessing the data, could be done by anybody if they wanted to?

Mark - Completely so.  So if you've ever joined free public wifi before, the equipment that we're using tricks your phone into believing we are one of those wifis, and it automatically connects to it.  At that point we can be tracking what you're doing in real time on your phone.

Rosalind - That sounds quite scary.  Can we go inside?  Can I give it a go?

Mark -   Of course you can.

Rosalind - I'm entering a completely black room.  Mine's not working so we'll go through the same process but with Mark's data.

Mark - So I will log in using my information.

Rosalind -   We've connected Mark's phone up to the data shadow and I'm walking through another door into a white room this time.  Okay, so I'm in the middle of the room and on one wall in front of me is a picture of somebody in boxer shorts.  Behind me is text.  Mark, where have these letters come from?

Mark - Those letters are a thousand characters of my most recent text message or messages, and then on the right hand side you'll see that we've taken 64 images off my mobile phone that were first sent by text message, then WhatsApp and then generic pictures on the phone.

Rosalind - When you first did this, were you surprised by what you saw?

Mark - Yes.  There were lots of photos of ex-girlfriends that I thought I'd deleted.  People saw a lot of explicit photos of myself.

Rosalind -   What happens next?  Do I get to leave?

Mark - So you exit through this door.

Rosalind - And that was the door shutting behind us and deleting all the data that they've collected.

Chris - Was it the girlfriends he deleted or was it the data?

Connie - Maybe it was both. Or he thought he had at least.  So it seems anyone with the right technology could access a lot of personal information.  Timandra, is there anything we can do to keep our data private?

Timandra - Well, there's quite a lot we can do technologically.  I'm not a tech expert but I do keep an eye on this.  Edward Snowden oddly has just given an interview to The Intercept making some specific suggestions and they are things like, you can use Tor as a browser which makes it much harder to track where you are going and what you are looking at. You can encrypt everything on your phone and your laptop, you can encrypt your messages end to end.  There are apps you can use for that but he also has some more general advice, which I would echo, which is be a bit selective about what you share with whom.  "You can segment your life" is his phrase.  You don't have to tell everybody everything. Why would you enter your iCloud details to some pop up box for a wifi?  Why would anybody do that?

Connie - I guess you're right, but you know when you just really want some free internet, you know it's not sensible but it's hard not to do it, isn't it.  Are there any options being developed, maybe commercially or politically that could help us manage our personal data differently in the future, other than the things we can do personally?

Timandra - Well I think that's the key.  There are some people saying "look we need a whole different model for this."  We need to think that our data is ours to control and if we want to trust somebody with it that should be an act of trust.  You know, I will trust you with this but not necessarily with that, and there are things like blockchain, the technology that underlies Bitcoin.  That is quite good for this because you can encrypt things and then select who you are going to share your keys with.  There's a company called Midex who are developing a whole new model of data storage, which would be very secure and you can give, again, very specific consent to people to use certain parts of your data for certain purposes.  But the other side is that we do need to think differently about it.  Politically we need to actually value privacy and say "never mind nothing to hide, nothing to fear".  I would say "if you haven't got anything to hide, you haven't really lived". You don't have to be doing anything illegal or harmful to anybody else to think privacy is really important.  You can't have a free society without privacy.

Connie - So after doing all this research, have you done anything differently yourself?

Timandra - I am actually doing certain things differently.  A lot of stuff I'm leaving for now because it's quite interesting to just be conscious of what's happening as I write the book.  So, for example, I went and looked at a website of a company that all our political parties use to keep track of their contacts and, you know, keep an eye on what they should be using to try and campaign to you.  So, I looked at this company's website and, for the next three weeks, their tweet appeared at the top of my Twitter feed, and I haven't registered with them, I haven't given them any information consciously, but there they are. I'm definitely doing things like, I'm not downloading apps that want to know my location details on my phone.  Why do they need to know where I've been?  I'm definitely more selective about registering and letting people share my data.

Chris - I've got a tweet here from Ed Wilson, Timandra, and he says @ Naked Scientists "surely the triumph of big data is how many times you get adverts for the very thing you have just bought online."  But it reminds me of something which is that I have now begun to foil so many birthday surprises because I share an internet connection with my wife and other members of my family.  So, I know what I've got for Christmas or for my birthday because up come all these adverts for these things and I think, well I haven't been looking at that.  Why is this suddenly appearing on my computer?

Timandra - Well you definitely need to be using some anonymous browsing technique then, like Tor and probably turning off adverts.  In fact, that's another thing Edward Snowden recommends, using ad blockers. But this is exactly it, I think big data tries to aggregate everything, put everything together.  Actually in real life, we don't want to put everything together.  That's exactly it, you don't want to know everything your family's thinking.

Chris - I certainly don't, but for more reasons than just a birthday.

51:35 - Why do we lose hair on our heads and not the rest of our bodies?

Why do we lose hair on our heads but still have super hairy legs and armpits? It just seems unfair.

Why do we lose hair on our heads and not the rest of our bodies?

To stop Rosalind Davis pulling her hair out trying to get to the bottom of this, she asked Professor Robert Foley, from the department of Archaeology and Anthropology at the University of Cambridge to help her out.

Robert - We do lose hair from our heads and from our bodies throughout our lives. You only have to clean a shower out or a bath out to realise that that's true and, of course, we don't really notice or see the loss of hair on our bodies because the hair is miniaturised, very small and it's not very dense, so the loss is virtually invisible. In addition, of course, it's replaced so we don't see a long term effect either on our bodies or our heads. If we turn to the bigger, and obviously more important, question, that is the permanent loss of hair, in other words going bald, and there, there is a particular pattern to it. It is men, or rather some men, who go bald. Why some men become bald is partly a matter of genetics. There seems to be quite strong evidence that there are genes, and those genes are on the X chromosome, and that produces a sex-linked pattern of inheritance. So that men inherit their baldness from their mother's father.

Rosalind - Surely though, as a species, we would have all evolved to keep our glossy locks.

Robert - It might well be that baldness has actually got some evolutionary advantage. Not many men feel that, but there have been studies showing that bald men can be seen as more attractive, often because it's associated with longevity, with success, wisdom, knowledge, maturity.

Rosalind - Okay, so if you are wise and mature but also, unfortunately, bald. Is there anything you can do?

Robert - In terms of doing anything about it, all one can really say is "buy a hat and be happy." That's almost certainly as much a signal of long-term survivorship as anything else.

