What is Augmented Reality?

31 January 2010

Interview with

Dr Tom Drummond, Cambridge University

Helen -   Dr. Tom Drummond is a Senior Lecturer at the Machine Intelligence Laboratory at Cambridge University where they're working on some of these technologies, and Tom has very kindly come into the studio today to talk to us about augmented reality.  Hi, Tom.  Thanks for coming.

Tom -   Hello.

Helen -   And I think we need to start off with  - what is augmented reality?  It sounds like something out of a sci-fi movie, but what is it?

Tom -   It does sound very science fiction, doesn't it?  It's about taking computer graphics off the computer screen and making them available over the natural world, over the real world.  Now obviously, the real world doesn't have a computer display capability, so you need to put those graphics there somehow.  The first way we thought of doing this was to use a head mounted display - you look through the head mounted display at the world and then a computer can display computer graphics on a part of the world too.

Helen -   So you're looking at the world and you're pushing a layer of information of some sort that refers to that world.

Tom -   That tells you something about what you want to do with the world.

(Image: iPhone using the Wikitude application, demonstrating an example of Augmented Reality)

Helen -   What you're looking at...

Tom -   So you might want to, in a medical application for example, use it in laparoscopic surgery to be able to see what your instrument is doing inside the patient: where blood vessels are, or maybe there's a tumour that you're trying to target or something like that.  So that's one kind of application.  There are obviously entertainment applications.  There are games available now that use this technology and, indeed, there are educational benefits, and so on.

Helen -   It seems to me as I browse around the internet that quite recently, the entertainment and advertising side is really developing quite quickly.  You can have magazines with augmented reality covers - you wave the magazine in front of a computer, and something pops out on it in three dimensions through your webcam.  And it's in sporting events as well...

Tom -   Sure, American football for example.

Helen -   ...and races and things.

Tom -   The first down line is done by augmented reality in this.

Helen -   And that counts as a way of putting information, and advertising as well, into sporting events.  But as you said, there are more worthy and useful applications of this technology as well.  You say you started off thinking about a head mounted way of doing this.  What are the alternatives?

Tom -   Well the thing that we're starting to see now is handheld augmented reality which runs on, for example, a smart phone.  In that version, what you see on the screen of the smart phone is what the camera sees of the world.  It's a bit like having a video camera or a digital camera where you're seeing the preview of the picture.  But what augmented reality does is it traps the graphics in flight between the camera and the screen, and you work out what you're looking at and where it is, and you add the virtual elements to the image at the same time, so that you can blend the piece of information that you want to add to the world over the top of it graphically.
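A minimal sketch of the "trap the image in flight and blend graphics over it" step Tom describes, assuming the camera frame and the overlay are same-sized grids of RGB tuples and that an alpha mask says how opaque the overlay is at each pixel. The names `blend_pixel`, `composite` and `alpha_mask` are illustrative and do not come from the interview.

```python
def blend_pixel(camera_rgb, overlay_rgb, alpha):
    """Mix one overlay pixel over one camera pixel.

    alpha runs from 0.0 (overlay invisible) to 1.0 (overlay fully opaque).
    """
    return tuple(
        round(alpha * o + (1.0 - alpha) * c)
        for c, o in zip(camera_rgb, overlay_rgb)
    )


def composite(frame, overlay, alpha_mask):
    """Blend an overlay over a camera frame before it is shown on screen."""
    return [
        [blend_pixel(c, o, a) for c, o, a in zip(frame_row, overlay_row, alpha_row)]
        for frame_row, overlay_row, alpha_row in zip(frame, overlay, alpha_mask)
    ]


# A tiny 1x3 "camera frame" of grey pixels and a red overlay (assumed values).
frame = [[(100, 100, 100), (100, 100, 100), (100, 100, 100)]]
overlay = [[(200, 0, 0), (200, 0, 0), (200, 0, 0)]]
alpha_mask = [[0.0, 0.5, 1.0]]  # untouched, half blended, fully replaced

blended = composite(frame, overlay, alpha_mask)
```

Where the mask is 0.0 the camera pixel passes through unchanged, so the virtual element only appears where the system has decided it belongs.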

(Image: Concept for an Augmented Reality phone)

Helen -   So it feels like you're holding up a magic spy glass and you're looking through that, and you're learning something else about what you're looking at.  I could hold it up to you and it might tell me something about you, perhaps...

Tom -   Yes.  You could see my name floating above my head or something like that.

Helen -   I believe you've been looking at the pros and cons of these different approaches of a head mounted system versus something we can put in our pockets.  What are the differences between those two approaches?

Tom -   A head mounted display gives you a very immersive feel.  When you're looking at the world, the computer graphics are right there in front of your eye.  So, there's a very strong connection between the virtual elements and the real elements.  But then there are some negative consequences as well.  It's very difficult to build these systems without latency in them.  So when you move your head, the computer graphics might follow a tenth of a second later.  Unfortunately, one of the consequences of this is that it can make people feel motion sickness and it can be very unpleasant to use a system like this.  Head mounted systems are also very expensive, and that could be a barrier to their use, and they're also very cumbersome.  You have to put something that gets between you and the world on top of your head, whereas, by contrast, a phone is a small thing.  We all carry it and it has all of the computer hardware inside that you need to run some of these applications.  If there's some latency and the picture takes a tenth of a second to catch up as you move it, nobody really minds because it's not directly affecting what you're seeing, and conflicting with what your inner ear is telling you for example.

Helen -   How are we actually seeing this being used in the real world, outside the laboratory?  One of the possibilities that I thought was rather exciting was the use of this kind of thing for tourists - for going to a site, perhaps of a ruin that's fallen down now, holding up your smart phone or perhaps even wearing your tour guide helmet and goggles, and it would recreate what the Acropolis looked like when it was full of people or you know, when it was still there.  That seems to me to be quite exciting.  Are we seeing this kind of thing actually being used?

Tom -   Absolutely.  There are applications available now on the iPhone store and on other phones like the Google Android phones that use GPS to locate the smart phone and a compass to work out which direction it's pointing in, and then you can display computer graphics like "this mountain is..." whatever it is or "this building is King's College Chapel...".  These systems are appearing now and I think that they're going to become very popular this year.  In some sense, the limiting factor of those is that GPS and a compass aren't that accurate, and one of the problems is that if you want to draw your labels very precisely over what you're seeing, they tend to jitter around and often, if you look at videos of these systems in action, you can see that the labels are jittering around a bit, relative to the image.
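A rough sketch of how a GPS fix and a compass heading could place a label, assuming a simple linear mapping from angle to screen column (a real camera model would use a tangent, and real apps differ). The function names, the 60 degree field of view, the 480 pixel screen width and the coordinates are all assumptions for illustration.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def label_x(bearing, heading, fov_deg, screen_width):
    """Screen column for a label, or None if the landmark is out of view."""
    # Signed angle between where the phone points and where the landmark is.
    offset = (bearing - heading + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2.0:
        return None
    return round((offset / fov_deg + 0.5) * screen_width)


# Hypothetical phone position and a landmark due east of it.
landmark_bearing = bearing_deg(0.0, 0.0, 0.0, 1.0)

# The same landmark drawn with a true heading and with a 5 degree compass error.
x_true = label_x(landmark_bearing, 90.0, 60.0, 480)
x_with_error = label_x(landmark_bearing, 85.0, 60.0, 480)
```

Under these assumed numbers a 5 degree compass error moves the label by 40 pixels, which is the kind of jitter Tom describes when the heading reading wobbles from frame to frame.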

Helen -   So it's not quite pointing it to King's College Chapel.  It's sort of hovering about in the air a bit...

Tom -   Hovering around somewhere nearby.

Helen -   Yes.

Tom -   Now, one of the things that's driven our research into this is using the image that's coming into the smart phone to locate what we're looking at.  If you can work out what every pixel in the image from your camera is looking at, then when you draw the graphics on the screen, you're going to be drawing them roughly to pixel accuracy over the top.  That tends to lead to a much more stable viewing experience and the graphical elements look very stable on the world, and really look like they belong there, which is actually quite important in terms of how the users respond to these extra elements being displayed.

Helen -   I can only imagine, being a humble marine biologist myself, the technology involved in taking a moving image of the real world and incorporating your position on that image must be extremely challenging.  We won't go into the details now, but I'm just wondering: what are the main problems that you have to overcome to be able to put these images together and use, say, a smart phone to shine at something, and tell you what it is?

Tom -   Sure.  Yes, there are a lot of issues.  In particular, smart phones are not the most powerful computers available and so there has to be a lot of effort going into shrinking the algorithms down, so that they can run in the computer capacity of a smart phone.  When you're talking about the data from a camera, there's actually a huge flow of data coming out of the camera of the smart phone.  So it's actually a serious issue to be able to process that in time, to be able to work out where you are and what you're looking at.
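To give a feel for the "huge flow of data" Tom mentions, here is a back-of-envelope calculation. The resolution, colour depth and frame rate are assumed round numbers, not figures from the interview.

```python
def camera_data_rate(width, height, bytes_per_pixel, frames_per_second):
    """Uncompressed bytes per second produced by a camera stream."""
    return width * height * bytes_per_pixel * frames_per_second


def frame_budget_ms(frames_per_second):
    """Time available to process one frame before the next one arrives."""
    return 1000.0 / frames_per_second


# Assumed stream: 640x480 pixels, 3 bytes per pixel (RGB), 30 frames per second.
rate = camera_data_rate(640, 480, 3, 30)
budget = frame_budget_ms(30)
```

With these assumptions the stream is 27,648,000 bytes per second, and the phone has roughly 33 milliseconds per frame to work out where it is and what it is looking at.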

Helen -   And finally, I think one thing that seems to me to be a very clever use of this is to communicate expertise, to be able to transfer yourself into another place, and almost get someone else's brain on the case.  Can you tell us about that quickly?

Tom -   Sure, yes.  That's one of the systems we developed and really, that came from an occasion where I was phoned up and asked, when a car is out of water, where do I put the water in for the windscreen wipers?  I'm standing there with my eyes closed, trying to picture the engine bay of the car, thinking -  well, at the back on the right, there's a translucent white bottle there somewhere...  And I was thinking, if a person could just take a photo with their phone and send it to me, I could draw an arrow and say, "it's here."  And then even better, when that photo goes back to them, when they move their phone, that arrow stays pointing at the image of the water bottle, that would be brilliant!  And in fact, some very clever people in my lab built a system that did exactly that.  So, what it does is it extracts information about what it can see, for example, the engine bay of your car and then in real time, it builds a 3D model of the things that it can see, and it calculates at the same time where the camera is moving, and then all of this information together is used to help a remote expert place information into the scene that will help the local user in solving the problem that they have.
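A simplified sketch of why an arrow can stay attached to the water bottle once the system knows the bottle's 3D position and the camera's movement. It assumes a pinhole camera that only translates and always looks along the +z axis; the lab's actual system, which also builds the 3D model and tracks the camera in real time, is not shown here. The function name and all numbers are hypothetical.

```python
def project(point, camera_position, focal_px, cx, cy):
    """Pinhole projection of a 3D point into pixel coordinates.

    Assumes the camera looks along +z with no rotation. Returns None when
    the point is behind the camera.
    """
    x = point[0] - camera_position[0]
    y = point[1] - camera_position[1]
    z = point[2] - camera_position[2]
    if z <= 0:
        return None
    return (cx + focal_px * x / z, cy + focal_px * y / z)


# The 3D spot the remote expert marked, in metres (assumed values).
anchor = (0.5, 0.0, 2.0)

# The same anchor seen before and after the phone moves 25 cm to the right.
before = project(anchor, (0.0, 0.0, 0.0), 400.0, 320.0, 240.0)
after = project(anchor, (0.25, 0.0, 0.0), 400.0, 320.0, 240.0)
```

The anchor itself never moves; only its pixel position is recomputed from the latest camera position, so the arrow is redrawn over the same physical object in every frame.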

Helen -   Fantastic!  I know, next time I need to refill the water in my car, I would love to have a gadget like that on hand.  Thanks ever so much Tom for giving us a great introduction to the world of augmented reality, explaining how machines can recognize and track reality.  He comes from the Machine Intelligence Laboratory at Cambridge University.
