Personality testing: no wrong answers?

The artificial intelligence standing between you and your dream job...
29 November 2022
Presented by James Tytko
Production by James Tytko.



If you’ve recently applied for a job, you may have been asked to fill out a personality test. From banks and consultancy firms to fast-food outlets, they’re increasingly being used as a way to improve efficiency and perceived fairness in recruitment.

The most common tests used for these purposes are based on the so-called ‘big 5’ personality traits that psychologists have settled on as providing a good indication of just what makes us tick.

Such personality tests take the form of a questionnaire in which the participant indicates their alignment with statements concerning human behaviour, usually on a scale of ‘strongly agree’ to ‘strongly disagree.’

But is this a shining light of equality and recruitment virtue, or does all that glitters sometimes not turn out to be golden? Does the unpredictability of human nature mean we could be missing a trick by filtering everyone with a “one size fits all” algorithm...

In this episode


01:14 - Taking the Big 5 Personality Test

The traits which supposedly determine your professional preferences...

Taking the Big 5 Personality Test
Josephine Andresen, UCL

Keen to find out what I could, I got in touch with Cambridge University’s Psychometrics Centre where they study psychological assessment. They gave me one of these tests to have a go at myself…

James - The report explains the likely consequences of one's standing on five broad personality domains. The following pages contain phrases describing people's behaviours. Please use the rating scale next to each phrase to describe how accurately each statement describes you. Number one, 'worry about things': very inaccurate, moderately inaccurate, neither accurate nor inaccurate, moderately accurate, very accurate. I'd say I'm pretty middle of the road on that, so I'm going to go with neither accurate nor inaccurate for that one. 'Have a vivid imagination.' Moderately accurate...

Josephine - Hi, I'm Josephine Andresen and I work at the Psychometrics Centre at Cambridge Judge Business School as a business development associate.

James - Josephine, I've come here today because you guys have sent me a personality test. I've got five scores in five different categories.

Josephine - Yeah, it's a big five personality questionnaire, which is made up of extraversion, then agreeableness, then conscientiousness (would you start right away with the task when you get it? Would you continuously work on it?), then neuroticism (how many negative emotions do you have? Anxiety, depression, but also anger) and openness to experience. Through a lot of research, scientists have boiled personality traits down to these traits that are stable throughout life. So you score quite highly on agreeableness - 80 - and quite low on neuroticism. So I feel like the highest and lowest scores always give you quite an indication of what the person might be like. So for agreeableness, for example, it means that you don't seek conflict with other people, which might make you a good team partner, but also you might like to have a culture that's a bit more friendly and not that competitive. Because there are also disadvantages of being quite agreeable.

James - The big five personality test is potentially something quite a few people will have heard of. What is it that has made that the sort of standard?

Josephine - It is so prominent within science, but also within organisations because, first of all, it's not biased, for example by race or gender. Usually it's very important for these questionnaires to have questions that would account for that. For example, 'I like to play or go to football matches' could be a measure for extraversion, but only if you like sports. It's also largely valid compared to other tests, for example the Myers Briggs, which is quite popular but not very scientifically valid. In the big five personality test, when you score high on extraversion, it actually shows with experiments, real life studies, it would actually predict certain kinds of behaviour - that you actually do go to more parties or, for openness to experience, you do actually engage in these behaviours. And then it's also reliable because if you do it a lot of times it always shows the same results.

James - Why might filling out this sort of questionnaire give a good indication of what job I might like to do?

Josephine - Usually, for different jobs, but also for different companies, they have different cultures. Different jobs have different levels of stress. So, for example, you are very low on neuroticism, so you might actually fare quite well in a stressful job. Also, a company that has a very friendly culture, you might fit in there very well because you score high on agreeableness. But if it's another company, like consulting, that's a bit more competitive as I have understood it so far, you might not like that that much. A company would actually know if you fit them very well.

James - Okay. That's interesting. I suppose the question then becomes: consultancy jobs for example are paid very well, could I use what you've taught me just now, use that to apply to a consultancy firm in the future? Maybe I might think, "okay, I want to score lower than I usually would in an agreeableness score."

Josephine - Yeah, you can definitely cheat on these tests. A lot of employers like conscientious people who get to work right away, who actually do the task that they said they would do. But there are also things like social desirability tests that would ask you, "do you ever lie?" And people who might answer in a very socially desirable way say, "no, I never lie", even though everyone lies. Or, "do you ever break the law?" Everyone sometimes breaks the law - I mean I obviously never do. But then you also need to ask yourself, as a person, if you pretend you're very low on neuroticism, then you get into a job that's very stressful and then in the end you don't actually fit into it.

James - You're only hurting yourself by lying on these tests in the long run. From what you're saying, I'm getting the sense that you are quite supportive of these as pretty useful tools almost for removing bias. Perhaps an interview where you walk in and meet someone and speak to them one on one, there's lots of scope for them to make unconscious judgements about you that then contribute to whether they progress your application or whatever.

Josephine - Definitely. Compared to an interview, it can remove biases. For example, if a woman aspires to be in a leadership position, some people might have the assumption "women are a little bit more emotional." But actually there's not much of a difference when you compare men and women, and if you see the test and she doesn't score that, then it might remove that bias. But then again, I think you shouldn't only use the questions, but also the interview. Because there can also be, for example, the reference group effect: maybe you have a lot of extroverted friends and then when you do the questionnaire you compare yourself to them and think you are an introvert. But then when you go somewhere else, like to an introverted group, compared to them you're actually quite extroverted.

James - You mentioned earlier about how the big five personality traits are meant to be quite consistent across your life as you grow older. But then one thing that struck me as I was answering some of the questions like: Do you like to go on binges? Do you love excitement? Do you jump into things without thinking? Those occurred to me as characteristics which might, as you get older, become less prominent.

Josephine - What I would say here is, for example, the question, 'Do you like to go to parties?' First of all, you might think, yeah, probably when you're 60 you wouldn't like to go to a big club or rave. But, then again, party means something different when you're 60. So it might be a dinner party and then an extroverted person might still like to go to a dinner party.


What makes our personalities unique?
Sam Gosling, University of Texas

It’s interesting to think, isn’t it, that a 20 minute questionnaire could tell you so much about who you are. Or can it? Somewhat unsatisfied, I reached out to Sam Gosling, professor of psychology at the University of Texas at Austin, to take a step back and tell me how sturdy our scientific understanding of personality really is…

Sam - Ordinary, lay conception of personality captures much more than what the scientists study. The scientists really focus on what you could call personality traits, which are these regularities in our behaviours and our thoughts and feelings. But some people in the field have really said, that's a very superficial take on somebody. Would you want to choose someone to marry, or to become your roommate purely on the basis of, say, their big five personality traits? And the answer is probably not, because you don't really get a sense of who that person is. And so you'd need to dig a little deeper to what some researchers call personal concerns. So that would be somebody's attitudes, their values, their goals, their roles. You take something like their values, it's like what's important to them? Do they value wisdom? Do they value power? Do they value becoming rich? And those sorts of things aren't the kinds of characteristics that will show up in a big five test. And then, if you really want to get a sense of who somebody is, you have to dig even deeper to what you could call identity. And you think of identity as the narrative story we tell about ourselves, about how we became the person we are today. It takes those events in the past and it's how we make sense of them to form this conception of the self, which also has implications for who we think we are going to be in the future too - those sort of deeper things, those values. The identity isn't captured by things like the Big Five and other dispositional constructs. And one of the reasons is they're much more difficult to measure.

James - I'm glad we've clarified that, but if I now bring us back to thinking about the Big Five, can you see the usefulness of personality testing as a way of determining what jobs we might be interested in?

Sam - Yeah, I think personality has a tremendous role to play in determining what jobs we might be good at. I think if you ask most people what would make somebody a good salesperson versus a good truck driver versus a good nurse versus a good teacher, then they're not going to just say intelligence. It's not that intelligent people are better at all of those things. It will be other things too. And so it might be how much they enjoy interacting with others. If you enjoy interacting with others, then being a salesperson is good, but being a truck driver isn't so good. Is somebody reliable? Are they trustworthy? Are they friendly? Are they curious? Those are all personality traits, so I think it makes good sense to try to assess those in some systematic way. By doing so, it in fact helps fairness too, because we're less likely, if we have these test scores, to rely so heavily on our stereotypes or our preconceptions of what somebody's likely to be like. I think a good example is, if somebody's introverted, they say less. And so then we get to learn less about their other qualities too.

James - Can I ask you, Sam, what you see as the main limitations of using personality tests to determine our potential roles?

Sam - I see the main limitation as being the fact that it's really focusing on such a small element of personality, that it's missing out on these deeper constructs like values and goals and our identity. And - I don't know this because, to my knowledge, the research hasn't been done - but I suspect there are some things (how good a teacher you are, or how good a CEO you are) where it is those values, or it is this sense of who you are that is actually where the gold is, where the action is in predicting how well you do that job.

James - Within reason, are people not quite capable... perhaps it's even healthy, that people are not entirely different in their personal and professional lives, but that they're capable of separating it and have a slightly different personality.

Sam - Yeah, that's quite possible. I think it's important to say that when we say that somebody has a certain personality, that doesn't mean that their behaviour is invariant. So an introvert and an extrovert will both be more talkative at a party than when they're at the library. But, in both of those contexts, at least in theory, the extrovert will be more talkative than the introvert. So I think it's important to understand that we're not saying behaviour is invariant. Now there has been some research that has tried to separate these things out. So there was some research which would essentially take a normal personality questionnaire, something like "I am talkative" or "I enjoy trying new things", those sorts of personality items, and what they did was they added to the end of those items "at work" or "at home." And what they found was you do get slightly different answers if you do that. And the answers to those tests do predict better performance at work, but the differences aren't very big.


14:05 - Using social media for personality profiling

How our digital footprint could soon be used by our employers to assess us...

Using social media for personality profiling
David Stillwell, University of Cambridge

We can accept that someone’s personality can give a fair indication of how good a fit they might be for a certain job: a complete introvert is unlikely to thrive as a radio presenter, for example. The controversy starts when we ask how these judgements are made. Personality tests are one thing, but what if companies were building a profile of us using metrics other than those we gave them in a personality test? What if they looked into our online activity as well? With us now is David Stillwell, Professor of Computational Social Science at the University of Cambridge, who is looking into this very possibility.

David - Personality tests have their problems. We've been looking at alternatives. Some companies do automated video interviewing, for example. This is where a computer does an interview with you, and then an algorithm tries to measure the quality of your answers. Other companies do gamified assessments, so you play games and then they use that to try to assess things. And what I've been looking into is using social media data. So you probably know when you apply for a job, quite often someone in HR will search you up on Google and see what information they can find. And there's actually data from Ghent University: they found that those who have an attractive profile picture get 38% more job interviews than those with a less attractive profile picture. So that demonstrates the biases of humans again. So what me and my team looked into is, instead of asking all these questions on a personality test, or instead of a human looking at your social media data, maybe an algorithm can look at your social media data and try to assess your personality. So instead of saying, do you like going to parties, we just look at the data. How many parties do you actually go to? Do you talk a lot on social media?

James - I can see the value, but how does this all square with people's rights of privacy if you're snooping around their social media profiles?

David - Some people might say it's just public data, so you should be able to go ahead and use it. I don't agree with that. I think companies should ask for permission before doing this kind of thing. When you apply for a job, they should tell you the kind of information they're going to look at. The other thing that a company should do is they should share what they learned or concluded from their analysis when you ask. Under GDPR, you've got the right to get data about you and companies should share. I think what really matters is in what context it's being used. For example, SAP, the massive German multinational, they came to us, they were redoing their recruitment and they said, "well, maybe we can use social media data." And we came to the conclusion that people wouldn't like it if you use social media data to decide if they get a job. What we created instead was a job recommendation app. So you shared your data, it predicted your personality, and then it said, "well, here's a role for someone like you inside this big company, SAP." And that's much lower stakes - people still have control over what jobs they can apply for.

James - Do you have evidence for this being more effective than a big five personality test?

David - In terms of reliability, using big data is definitely less reliable than using a test which is made to measure personality because the data is more messy. On the other hand, this kind of technique has the advantage that it's based on real behaviour, so it's what people are actually doing rather than what they say they do on the test and therefore we found that it predicts future behaviour better.

James - We've been suggesting how these sorts of techniques might actually be removing bias from the recruitment process. But then there must be examples of people who perhaps these algorithms aren't designed for. To account for people with a disability, for example, where the model that these systems are working to means that they get caught in the net.

David - Yeah, and as you said at the top of this, professional personality tests are quite expensive and it's easy to slap it together, a few questions. But the reason why they're expensive is, to really create a good test, you've got to collect a lot of data and a lot of evidence around it. And part of that is the test publisher should get data on what groups it works with, what groups does it not work with, they should also provide advice to the people they're selling the test to. To say, these are the accommodations you need to use when you're using this test. Maybe it's give people more time or just, it doesn't apply to this group, you need to use some other method.

James - And how reliably do personality traits built up through this data translate into job performance? Is there a strong correlation there?

David - The answer is we don't really have evidence yet as far as I know. So when we predicted future behaviour, we found it was at least as good as a traditional test. But I'm not aware of evidence on job behaviour. So some startups are starting to provide this kind of technology, but I'd say we're more at the proof of concept stage.

James - And David, do you generally feel quite positive about the future of this technology? It strikes me as a bit of a can of worms we're opening up here. The science is the science, but if this were to fall into the wrong hands, could this be used to do more damage than good?

David - Yeah. There are measures in place to stop this bad behaviour when it comes to classic personality testing. In order to administer a personality test, you have to get a certificate of occupational test use. The British Psychological Society, it does reviews of tests, and you can read those reviews and see how good the test is. We are relying on those professionals to use it in a positive way. But what we always have to bear in mind is what's the alternative and what's being done right now. I mentioned, attractive people get more job offers, or more interview offers. I just think we can do so much better. It's not about getting perfection, it's about getting better than what we've got now.


19:39 - Recruiting using AI: unbiased or unfair?

Can we afford to leave important decisions down to algorithms...

Recruiting using AI: unbiased or unfair?
Tomas Chamorro Premuzic, UCL

As uneasy as this might make some people feel, the proponents of introducing machine learning into the hiring process make some compelling points about the bias prevalent in the traditional alternatives (like interviews). I spoke with Professor of Business Psychology at UCL Tomas Chamorro Premuzic, who also works for staffing and human resources firm Manpower. At Manpower, they claim to help their clients build their workforce using science, and run studies on a whole range of recruitment technologies. Alongside personality tests, another technique gaining traction is the video call interview, where the interviewee responds to questions without a human being on the other end to receive them. Instead, AI analyses the tone and language used by the candidate to judge their performance. I asked Tomas whether these types of technology improve efficiency without improving fairness…

Tomas - We need to have the maturity and the rationality to distrust our instincts and to understand that when people say, "well, in my experience, this is biased" or "this doesn't work," or "this isn't very helpful," their experience is always based on an N of one and conflated with their preferences, etc. I mean, the point of scientific research is to provide evidence that comes from thousands if not millions of people. I think taking into account those studies and also understanding that this isn't rocket science. You are never going to be able to predict somebody's future job performance or somebody's fit to a team, group or organisation with a 100% degree of accuracy. The point is to do it as well as we can and as reliably as we can. It's possible to do this with, let's say, 70 or 75% degree of accuracy. And of course you can tell me, "but my cousin, she was really, really brilliant and she was unfairly rejected for this job by these recruiters." And perhaps you're right. But the point is that we want to minimise the number, or the incidence, of false positives and false negatives. And if you do that, more often than not you become a more meritocratic organisation and you become a more talent centric organisation. It's interesting to me that some of the same organisations that are championing diversity and inclusion are still looking for talent or trying to assess potential in the same old ways: looking at people's resumes and their qualifications and their educational credentials. And while it is absolutely possible for somebody who doesn't come from a high social class background to go to Cambridge, Oxford or Harvard and do really well, the vast majority of people that have these degrees are rich and they come from very affluent areas of society, whereas if you look at people's personality and you try to understand what they're like, how they differ from others, you can truly focus on diversity because we're all different.
And if you don't try to understand what makes us unique and how we differ from others, then you truly don't care about diversity. And also look at qualities that are not conflated with social class, with socioeconomic status, with privilege. You can be more or less curious, more or less creative, more or less extroverted, more or less conscientious, more or less ambitious, more or less likable, irrespective of your class.

James - The difficulty that comes with accountability when we become more reliant on data and algorithms, because it's perhaps easier to blame a recruiter who demonstrated some bias, but it becomes a bit more difficult. Who do we hold to account when people slip through the net and find it difficult to overcome the low score that the computer gives them?

Tomas - Here I disagree. I have to say there's two issues really at stake. We rightly worry and have been concerned and have been raising attention as to the potential consequences and drawbacks of so-called black box AI models: algorithms or systems scoring you high or low or rejecting you for jobs without any explanation. But the only truly black box algorithm is the human brain. The only decision that is impossible to unpack, decode, and reverse engineer is what humans do. If I'm interviewing you and I reject you, I can come up with the best possible explanation of why you weren't a good fit for that job. I can say you didn't seem confident or you don't have expertise, or you were rude or you didn't make eye contact. And sometimes I truly believe that; it's not like I'm deliberately trying to deceive others and look for excuses because I have a nepotistic candidate that I prefer. However, with AI, you can always reverse engineer the decision making that underpins the algorithm. Algorithms are basically like recipes. And the only thing that is novel about AI is that it's a self generating recipe. You give it data and then it can find out what the key ingredients are and identify patterns, and then influence or make decisions on the basis of those patterns. AI that is ethical by design has competent humans overseeing these algorithms, testing them for bias and adverse impact, and ideally still being involved in the decision making process. So I think it's very unlikely today that anybody is hired purely as a function of what a fully autonomous AI or algorithmic system does, which is also quite interesting because sometimes adding a human in the loop actually increases the bias and doesn't decrease it. I'll give you an example.
Some of the video interviewing software technologies that have been developed in the last 5 or 10 years can actually give us a sense of whether, for example, you are more confident, whether you're more narcissistic, whether you have a higher or a lower integrity score. And when these scores are confirmed or checked by humans that come in the loop and they look at the same videos of people, actually they don't become more accurate. They often become less accurate because the person is driven by a lot of signals that actually have to do with things like race or class or attractiveness. Humans are very good at learning, but very bad at unlearning. No matter how much unconscious or conscious bias training you undergo, you cannot suddenly forget that the person sitting in front of you is male or female, white or black, old or young, attractive, you know? And in fact, the more you try to suppress that information, the more prominent it becomes in your mind. In the near future, we're probably going to see humans enhanced by AI, including assessments of people's personalities, scored with machine learning and artificial intelligence enhanced by human expertise. And the combination of both will be better than one way or the other.
