Making a chatty robot
Currently, personal assistant robots are not the chattiest, but one scientist at Cambridge is hoping to change that... Graihagh Jackson went to meet Milica Gašić to find out how she's making a system that means we can have a conversation with robots...
Milica - You've probably all heard of Siri on the iPhone or other personal assistants, but these systems can be used more widely in situations like banking, or for providing healthcare information for elderly people, for instance. These systems normally have three components. The first component, which is called speech understanding, tries to extract the meaning from the speech. The second component, which is called dialogue management, tries to decide what is the best response, or what we call action, to give to the user. The final component then converts this response into speech.
Graihagh - None of this is trivial. Putting speech into text, understanding that and then deciding what the best action is and turning text back into speech - it's quite complicated, especially that middle step of understanding and actioning.
Currently, systems like Siri and Google all operate on a series of rules. Someone has literally sat down and thought about all the possible things you could ever want to ask your smartphone, written them into code and - voila! Sounds painstakingly protracted...
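As a rough illustration of the three components and of hand-written rules, here is a minimal Python sketch. It assumes the speech has already been turned into text, and every function name, rule and template in it is hypothetical rather than taken from Siri, Google or the Cambridge system.

```python
# Illustrative sketch only: all names, rules and templates are hypothetical.

def understand(utterance: str) -> dict:
    """Stand-in for speech understanding: pull the meaning out of the text."""
    text = utterance.lower()
    meaning = {}
    for food in ("chinese", "pizza"):
        if food in text:
            meaning["food"] = food
    if "phone" in text:
        meaning["request"] = "phone"
    return meaning


def decide(meaning: dict) -> str:
    """Stand-in for dialogue management: hand-written rules choose the action."""
    if "request" in meaning:
        return "inform_" + meaning["request"]
    if "food" in meaning:
        return "offer_restaurant"
    return "ask_food"  # no rule matched, so fall back to a question


def generate(action: str, meaning: dict) -> str:
    """Stand-in for response generation: turn the chosen action into words."""
    templates = {
        "ask_food": "What kind of food would you like?",
        "offer_restaurant": "Here is a place that serves {food} food.",
        "inform_phone": "Here is their phone number.",
    }
    return templates[action].format(**meaning)


def respond(utterance: str) -> str:
    """Chain the three components together for one turn."""
    meaning = understand(utterance)
    return generate(decide(meaning), meaning)
```

Every request the rules do not anticipate falls through to the same fallback question, which is the weakness of writing the rules by hand.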
Milica - Now this is obviously suboptimal because a human can't think of all possible situations and it's very expensive to develop such systems so, what we are doing is trying to use machine learning to tackle this problem and to make the systems better.
Graihagh - When you say machine learning, what do you mean? Are you literally sitting down a computer and saying this is X and this is Y?
Milica - Not really. The idea of machine learning is that the machine analyses data, builds a model and then, based on that model, makes predictions. So the prediction could be what the user wants, or the prediction could be what the system should say back to the user. A particular machine learning method which is very useful for building dialogue systems is reinforcement learning, and reinforcement learning is all about trial and error. In machine learning we normally have two ways of learning. One is supervised learning, and you can think of that as having a teacher who is teaching you. The teacher shows you how to do something, and you try to imitate your teacher and do it as well as the teacher does. Reinforcement learning is very different. In reinforcement learning, you explore the different possibilities. You don't have a teacher but you have, say, a parent who will give you occasional rewards when you did something well. In the same way that children try to get as many presents as possible from their parents, the system tries to maximise its reward and, indeed, many reinforcement learning algorithms were inspired by nature and how biological systems learn.
Graihagh - You just made me think of kids wanting treats. I mean is there a chocolate equivalent for a machine?
Milica - Oh absolutely! The reward is numerical and it is an equivalent of chocolate.
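To make trial and error with a numerical reward concrete, here is a minimal Python sketch of an epsilon-greedy learner. This is a deliberately simplified setting (a "bandit" with three fixed actions), not the method used in the Cambridge system, and the action names, reward probabilities and parameters are all assumptions made for illustration.

```python
import random


def simulated_reward(action, rng):
    """Hypothetical stand-in for user feedback: 1.0 is the 'chocolate', 0.0 is nothing."""
    chance_of_reward = {"ask_food": 0.2, "offer_restaurant": 0.8, "say_goodbye": 0.1}
    return 1.0 if rng.random() < chance_of_reward[action] else 0.0


def learn_by_trial_and_error(trials=2000, explore_rate=0.1, seed=0):
    """Estimate the average reward of each action purely from experience."""
    rng = random.Random(seed)
    actions = ["ask_food", "offer_restaurant", "say_goodbye"]
    value = {a: 0.0 for a in actions}  # running average reward per action
    count = {a: 0 for a in actions}    # how often each action was tried
    for _ in range(trials):
        if rng.random() < explore_rate:
            action = rng.choice(actions)          # explore: try something at random
        else:
            action = max(actions, key=value.get)  # exploit: best estimate so far
        reward = simulated_reward(action, rng)
        count[action] += 1
        # Incremental mean: nudge the estimate towards the reward just received.
        value[action] += (reward - value[action]) / count[action]
    return value


values = learn_by_trial_and_error()
best_action = max(values, key=values.get)
```

Nobody tells the learner which action is correct. It only sees the rewards, and over many trials its estimates drift towards the action that is rewarded most often.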
And speaking of food, we decided to dial up the dialogue system in search of some lunch...
CRIS - What kind of food would you like?
Milica - Hi, I'm looking for a Chinese restaurant in the centre please.
CRIS - HK Fusion is a nice place - it serves Chinese food.
Milica - Is it in the centre?
CRIS - HK Fusion is a nice place. Their address is 21 Burner Street city centre.
Milica - Can you tell me their phone number?
CRIS - HK Fusion is a nice place. Their phone number is 01223 355 909.
Milica - Thank you. Goodbye.
CRIS - Thank you. Goodbye.
Graihagh - You can very easily see that to and fro. That conversation has the context of what you're asking there. So it's learning every time someone dials and asks for a specific food (whether that's Chinese or pizza), but what will it be learning from that conversation, and how will it be learning from it?
Milica - So, from time to time it would ask somebody to enter their feedback. At other times, it tries to estimate the feedback on its own, and then it analyses which actions it took and what feedback it got. One thing it wants to maximise is the chance of success. When it provides all the information that the user has asked for, that is counted as a successful dialogue, but that is not the only component it is trying to optimise. It also tries to offer as much information as possible in as few turns as possible, because users generally don't like to hang around and talk to dialogue systems forever. So it tries to adjust its actions so that it optimises these two objectives.
Graihagh - So it sort of goes away and reflects, not unlike a human, on what was good and what was bad about that conversation?
Milica - Yes exactly, that's a very good comparison.
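The two objectives described above, success and few turns, can be folded into a single numerical reward. The Python sketch below shows one common way of doing that; the function name and the specific values (a bonus of 20 for success, a penalty of 1 per turn) are assumptions for illustration and are not stated in the interview.

```python
def dialogue_reward(successful: bool, turns: int,
                    success_bonus: float = 20.0,
                    turn_penalty: float = 1.0) -> float:
    """Score one finished dialogue: reward success, penalise every extra turn.

    The default bonus and penalty are illustrative assumptions, not values
    taken from the system discussed here.
    """
    bonus = success_bonus if successful else 0.0
    return bonus - turn_penalty * turns


short_success = dialogue_reward(True, 4)    # 20 - 4  = 16.0
long_success = dialogue_reward(True, 10)    # 20 - 10 = 10.0
short_failure = dialogue_reward(False, 4)   # 0 - 4   = -4.0
```

With these numbers a short successful dialogue scores highest, a long successful one scores less, and an unsuccessful one scores lowest, so a system maximising this reward is pushed towards giving the user what they asked for quickly.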
Graihagh - So in the future, do you envisage this being much broader than just ordering Chinese food in the city centre?
Milica - Yes. My goal is to model richer conversations. In particular, one idea that I have is to build a dialogue system that can be used for the prevention of mental illness. The idea would be to develop a dialogue system that everybody could access on their phone whenever they like, so that whenever they have a problem they could get anonymous, instant support. I think that would certainly have a huge impact, but also, from a scientific point of view, these dialogues would be much richer. It wouldn't be about ordering Chinese food but rather about trying to model real conversation.