I understand why you think that way, but there are reasons HR-related folks get involved in these types of things. Note that I cannot speak for this particular company or their processes, but I can speak generally.
Properly validated psychometric tests are valid predictors of job performance (by which I mean objectively and factually correlated with actual work performance), and a properly constructed, work-relevant test will, on the whole, be more valid than subjective judgments from most technical professionals. Of course there will be hits and misses, but consistent application is the key to its use. If you don't consistently apply it, it becomes worthless.
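To make "validity" concrete: in this literature it usually just means the correlation between test scores and some job-performance criterion. A minimal sketch in Python, with entirely made-up numbers, of how a validity coefficient would be computed:

    # Toy sketch: a "validity coefficient" is just the correlation between
    # test scores and a performance criterion (here, hypothetical
    # supervisor ratings). All numbers are invented for illustration.
    import statistics

    test_scores = [62, 71, 55, 80, 68, 90, 58, 75]
    perf_ratings = [3.1, 3.8, 2.9, 4.2, 3.5, 4.6, 2.7, 3.9]

    def pearson_r(xs, ys):
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    print(round(pearson_r(test_scores, perf_ratings), 2))  # validity coefficient

Published validation studies are essentially this computation at scale, usually with corrections for range restriction and criterion unreliability.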
But let's throw that out the window for a second, because in this particular anecdote I don't think it's relevant. The test they were using seems to be more of a cultural-fit test, specifically a person-organization fit test. These tests are less about predicting job performance than they are about predicting organizational commitment and ultimately tenure [1]. Surely we can agree that those things are important too.
But let's again say that this candidate's aptitude is so impressive that we don't mind an increased statistical chance that his or her tenure may be short. The real issue is that legally you are obligated to apply the same set of criteria to all of your employees for a given role (not necessarily the exact same position, but comparable positions). If the company applies cognitive testing, personality testing, drug testing, person-org fit testing, etc. to one group of applicants, it cannot then fail to apply those criteria to others; doing so is an invitation for legal liability, however spurious the claim may seem.
Having said that, it's important that your selection processes have been validated previously, or at least are in the process of being validated (which provides some protection). I also have a big concern with the linked situation because: 1) it's a culture-related test, which means the potential for cultural bias is high, and 2) they were unwilling to make language accommodations, which seems unnecessarily rigid when native-level English skills are likely not a core component of the job. Point #1 can be accounted for in some shape or form (at a minimum through a low cutoff score), but Point #2 is more problematic.
Anyway, there is a reason companies adopt "Big Corp" characteristics as they scale. The primary reason is legal compliance through standardization of process; the other is that the data support the validity of objective predictors of job success relative to subjective judgments.
I realize I may be opening myself up to some criticism and scorn from the HN crowd for seemingly representing "Big Corporate" or acting like a Bob from Office Space, but so be it. Despite what it may seem, there are often smart people using evidence-based methods behind the seemingly asinine HR processes that drive you crazy. And sometimes there aren't, and it's just a big pain in the ass for poor reasons.
The problem with psychometric tests is that they're easy to game by answering them with the values the company is almost certainly looking for. Any company issuing a psychometric test is looking for obedience, hard work, discipline, profit orientation, civility, non-threatening creativity, etc. etc. etc.
So what you get by discouraging a broad diversity of personalities is a mix of people who actually have those specific characteristics and sociopaths. The latter group ends up running the company, and of course they issue utterly sociopathic "psychometric evaluations", which are basically just encoded discrimination against entire classes of people, regardless of job fitness, because the people up top are such incompetent leaders that they need subservient workers to obey their crazy orders without question.
Actually, spotting and weeding out sociopaths is one of the points. I know that you may believe that you're a good judge of character, but actually a smart sociopath (not the norm, btw) is likely to be able to game you just as easily as the test.
There are three buckets of people you don't want to hire:
- helpless
- depressed
- jerks
In my subjective experience:
The top one has about a 50% shot of being a character flaw, i.e., no amount of confidence-building will help.
The middle one has the most hope, if the talent potential is significant. The best in the field can be painfully shy as well as depressed. The problem is that depression (not shyness) hurts morale. [1]
The last has a 75% chance of being a lost cause. Impossible to manage, as they are incredibly disruptive to morale and productivity. [2] Their survival is often linked to mastery of politics, which entrenches their position.
Your buckets are quite simplistic and I doubt your company will be successful. First, if this is the States, you are not allowed to discriminate against clinical depression, so (2) will just lead to a lawsuit eventually. As for (1), if they have the skills, are they really helpless? Jerks are quite deplorable, but it's a gradient, right? How much jerkiness do you tolerate, or do you want saints?
I agree. I am working with people who have all three of these types of characteristics. This 3-type categorization is way too simplistic to describe an individual. I find no problem working with them. People are interesting. They all have some kind of quirks and behavior patterns. It's important to be tolerant and adaptive. But people with strong technical depth are really hard to come by.
Duh. Value is holistic. We can find anything we want to find to support an argument.
When you have worked with someone that screams at their coworkers, can barely feed/clothe themselves or refuses to lift a finger to help you... then you will know how much time and emotional effort these behavioral traits can waste.
Toxic employees exist at the extremes of those buckets. Avoid hiring toxic employees, but the filters available to you might not be good enough to screen them out, so fire them quickly. During the interview, you might just be able to observe minor quirks which are hardly indicators either way.
I cringe whenever I read job ads that are aiming for a culture fit: "we only want to hire harmonious, independent, extroverted employees;" when in my experience, the best employees do not hold strongly to those ideals, but neither do they hang out at the other extremes.
All or nothing hiring is too risky. Gradual formality makes the selection from both sides cooler, you know. We don't do interviews either; hazing sure, but no stupid hypothetical questions to test loyalty. There is no loyalty. Greed is good. =)
There is another option. Make something people want and build it up slowly, on the side. If you do, you may free yourself from being at the beck and call of other masters (other than users). Others have, so it's possible. We are each masters of our own fate.
Well, my main objection is to hiring people on the basis of their character, whether it's with a test or not. Assess for whether or not the person can do the job and make you a profit.
Can you spot the person who interviews well, but is a skilled sociopath who will destroy the morale of your team, causing other skilled talent to leave?
That's the value of these tests (when properly applied): They allow an evaluator to make statistically accurate predictions of specific types of behavior. The key is the two words: statistically and specific. A single composite score will not tell you anything, but the idea that there's a 68% chance that the person you're about to hire is a schizophrenic kleptomaniac should give you pause.
> there's a 68% chance that the person you're about to hire is a schizophrenic kleptomaniac
Sorry, I don't believe "psychometric tests" can do that either. At least not at the level that we're talking about: distinguishing an otherwise qualified person from someone who's got some sort of psychopathology.
Then you didn't read the other replies in the original thread from actual research psychologists. They described the process and posted links to peer reviewed academic papers. Your baseless disbelief is equivalent to disavowing climate change or vaccines.
You can't catch psychopaths using tests like these. The whole point of being a psychopath is that you can change the perception of your personality at will.
So what you're letting in is the subgroup of people who pass, plus psychopaths.
It's not what it means, but it is something they are capable of. They can be charming, and they can be horrible. They decide what they are going to be like, to fit into an overall selfish strategy. If being shy is an advantage, they will become that. If being confident is an advantage, they will become that. They can gain your trust, then flip when it is an advantage to do so.
In these tests, they know what to say, what to do, to give a certain impression. These tests can't catch them, because they are not honest, and lack integrity (no consistent values, only ones that benefit them).
More specifically, the psychopaths that you're likely to run into in a tech company fall into this class.
There are certainly psychopaths who are hopeless at concealing the fact. These are typically the less smart ones, so you wouldn't meet them; they are more likely to be in and out of prison than your office.
There are also psychopaths who are just about perfect at pretending not to be, and have no intention of ever doing otherwise. You can work with one of those for decades and never notice a thing, unless something happens that's extreme enough to make 'acting normal' seem like a long-term liability.
Again, go read the links that the actual PhDs in psychology posted in replies to the parent thread. They provided documentation in the form of peer-reviewed studies that back the claims that this kind of personality is detectable despite your belief that they are not.
If one's character meaningfully predicts job performance on top of whatever other methods you are using, then assessing their character is the same as assessing whether or not they can do the job (or at least, how well they can do it). Integrity tests have demonstrated this in many contexts, even taking into account the possibility of faking.
In the case of the military it can be the opposite: they weed out applicants who they are not confident will pull the trigger every time they are asked to.
I don't remember where I got the number, but apparently something like two-thirds of the regular army still end up unwilling/unable to aim properly when a combat situation actually happens. That's from the Great War, though; procedures may have gotten better since.
>The problem with psychometric tests is that they're easy to game by answering them with the values the company is almost certainly looking for.
You'd think that, but a quality psychometric test can't be gamed if it was developed in-house and is only taken once by the candidate. This particular company might not have had one of those, however.
Professional personality tests can't be gamed - you also can't find these on the internet, because they're very strictly copyrighted.
Can you go into more detail? It's hard to imagine how a test could be devised that wouldn't be gameable by giving answers that would be expected of a person with a normal personality.
It's not particularly difficult to devise a personality exam that way - in the best case scenario, the candidate cannot game the test, and in the worst case scenario, they game it but the proctor is fully aware of the gaming, so you simply reject the test (or the candidate).
Instead of trying to get rid of questions that are gameable, the psychometrists design the test for statistical consistency. Something as obscure as, say, which color a candidate picks in a multiple-choice question out of orange, yellow, blue, and red can be correlated with other questions (not that question literally, but something equally obscure and seemingly innocuous to a candidate). If a candidate answers one question according to what they perceive to be the societal ideal, this will be exposed by other questions, where it's not possible to game cross-consistently unless you've read and thoroughly understood a manual of psychology or psychometrics.
Tests designed this way allow for a certain amount of "gaming" by candidates before it crosses a statistical threshold, at which point it essentially tells the proctor, "The answers are so inconsistent that the candidate wasn't being honest," and you have to throw out the whole test (which, in the context of hiring, means rejecting the candidate).
The reason why this works is because while people will answer "Would you consider yourself hard working?" with "Always" or some other unrealistically gamed superlative, they won't realize that other questions that are seemingly unrelated are highly correlated with that quality - if you answer yes to one and no to another, you probably lied on the transparent one, with high confidence.
Tests like this[1] have been around for so long that they consistently get better, though there are some valid criticisms of them being easier for non-minorities (for a variety of reasons). They are used in clinical and professional contexts, and while a quality personality test will theoretically be consistent across multiple takings (in other words, results are relatively immutable), in practice you should probably avoid giving one individual the exact same test twice.
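To illustrate the cross-consistency idea in the crudest possible terms (a toy sketch only; the item IDs, pairings, and cutoff below are all invented, and real instruments derive them from large norming samples):

    # Toy sketch of cross-consistency scoring. Everything here is
    # hypothetical. Answers are on a 1-5 agreement scale.
    CORRELATED_PAIRS = [
        ("q1_hardworking", "q17_oblique"),   # invented item IDs; each pair
        ("q4_honest", "q23_oblique"),        # is assumed to tap the same
        ("q9_punctual", "q31_oblique"),      # trait, one item transparently
    ]                                        # and one obliquely

    def inconsistency_score(answers, pairs, tolerance=1):
        # Count pairs whose two answers disagree by more than `tolerance`.
        # A high count suggests socially desirable responding on the
        # transparent items.
        return sum(1 for a, b in pairs
                   if abs(answers[a] - answers[b]) > tolerance)

    candidate = {"q1_hardworking": 5, "q17_oblique": 2,  # gamed the obvious item
                 "q4_honest": 5, "q23_oblique": 5,
                 "q9_punctual": 5, "q31_oblique": 3}

    if inconsistency_score(candidate, CORRELATED_PAIRS) >= 2:  # invented cutoff
        print("Response pattern too inconsistent to score; reject the test")

A real instrument does this across dozens of correlated item clusters, which is what makes consistent gaming so hard without the scoring manual.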
I'm a psychologist, and I game personality tests for fun.
The MMPI is harder than the Myers-Briggs, but not that difficult. Additionally, it's really only optimal for clinical samples, at which it is very good.
However, everyone in the field knows that these tests can be gamed, the only open question is how many people game them consistently.
You can probably estimate the proportion of social desirability exhibited in job interviews by comparing to non-job situations (such as a sample from the general population matched on all relevant covariates, whatever they are).
Nonetheless, believing in personality tests as an accurate indicator of personality is as misguided as believing that Facebook represents the social graph of all its daily active users accurately, i.e., somewhat misguided.
And I am aware of lie scales, and they are trivial if you actually read the question. Protip: if a question says always or never, it's probably designed to trip you up.
I do agree that personality tests are more accurate than this thread makes them out to be, but they are certainly not as useful as your comment implies.
Find the scoring manuals, do loads of personality tests, rinse, repeat. It's not particularly difficult.
Despite this, I ran many surveys back when I worked in academia. You can detect some of this stuff with Guttman errors, but these are not often used, and as long as you are consistent, it's very difficult to spot.
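For those unfamiliar with the term: a Guttman error is a response pattern that violates the expected item ordering, e.g., endorsing a rarely-endorsed item while rejecting a commonly-endorsed one. A toy sketch (item ordering and responses invented):

    # Toy sketch of counting Guttman errors. Items are ordered from most
    # to least commonly endorsed (an ordering you would get from norming
    # data; this one is invented). An error is endorsing a rarer item
    # while rejecting a more commonly endorsed one.
    items_by_popularity = ["i1", "i2", "i3", "i4", "i5"]  # hypothetical IDs

    def guttman_errors(responses, ordered_items):
        errors = 0
        for hi in range(len(ordered_items)):
            for lo in range(hi + 1, len(ordered_items)):
                easy, hard = ordered_items[hi], ordered_items[lo]
                if responses[hard] and not responses[easy]:
                    errors += 1
        return errors

    respondent = {"i1": True, "i2": False, "i3": True, "i4": False, "i5": True}
    print(guttman_errors(respondent, items_by_popularity))  # prints 3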
> The reason why this works is because while people will answer "Would you consider yourself hard working?" with "Always" or some other unrealistically gamed superlative, they won't realize that other questions that are seemingly unrelated are highly correlated with that quality
Can you provide a deeper example? I'm honestly curious - what sort of question is innocuous enough to be answered honestly, yet useful enough to provide information? (i.e., do slackers like the color yellow or something?)
This is actually part of the so-called "bogus pipeline." Test-takers are told that the test can detect any attempt to lie, causing them to answer the questions more honestly than they would otherwise. The MMPI does detect inconsistent and unrealistically extreme answers, but it's not quite as foolproof as claimed above.
Okay, that is the MMPI. That is not evidence that any test that is not the MMPI cannot be gamed. The MMPI (and I disagree with it in so many ways) was researched and tested for decades, and perhaps it can detect lying. Perhaps. I don't buy that any lesser test cannot be easily gamed absent a heck of a lot of experimental evidence for that specific test.
Often, at least in work contexts, it doesn't matter if they can be gamed. Validity testing is done with the effects of gaming built in, and there is some evidence that gaming is itself related to performance benefits.
An acquaintance of mine is finishing his dissertation on faking on personality and integrity tests, so I'll know more on that soon.
Yes, but in that context gaming is known and accounted for. My definition of gaming was manipulation without the proctor or the evaluation showing any statistically significant deviation, which is virtually impossible on modern personality tests. The "gaming" becomes transparent and is used to score the candidate, if it's present at all.
No, it can certainly be gamed. It's just so hard without prior knowledge and preparation as to be statistically negligible. This is also why I said you should avoid giving the same test twice. But a single, cold test administration should be very difficult to game.
You should also read upthread, what the actual psychologist said. It clarified my comment really well.
The following are very obviously the "correct" true or false answers to these questions from the MMPI-2:
T * My mother is a good woman.
F * Evil spirits possess me at times.
F * There seems to be a lump in my throat much of the time.
T * At times I feel like swearing.
T * My hands and feet are usually warm enough.
F * Ghosts or spirits can influence people for good or bad.
F * Someone has been trying to poison me.
F * Everything tastes the same.
F * Someone has been trying to rob me.
F * Bad words, often terrible words, come into my mind and I cannot get rid of them.
F * Often I feel as if there is a tight band around my head.
F * Peculiar odors come to me at times.
F * My soul sometimes leaves my body.
F * When a man is with a woman he is usually thinking about things related to her sex.
F * I often feel as if things are not real.
F * Someone has it in for me.
F * My neck spots with red often.
T * Once in awhile I laugh at a dirty joke.
F * I hear strange things when I am alone.
F * In walking I am very careful to step over sidewalk cracks.
F * At one or more times in my life, I felt that someone was making me do things by hypnotizing me.
I cut out a few because I couldn't see how true or false mattered.
Those are the obvious ones. I've taken the MMPI. There's a lot more than just those. None of these are the seemingly innocuous ones I was talking about.
Again, I'll reiterate: can they be gamed? Yes. Is it likely? Emphatically, no.
Okay, I didn't know you'd taken the MMPI, so fair enough. I would be surprised if answers to the innocuous questions were that important, though; they strike me more as luring you into a sense of complacency so that you end up answering honestly when it comes to the important ones.
Properly validated psychometric tests are valid predictors of job performance (by which I mean objectively and factually correlated with actual work performance), and a properly constructed, work-relevant test will, on the whole, be more valid than subjective judgments from most technical professionals.
Do you have any support for this where the employees are both technical and creative? I'm obviously thinking about software development, but it should apply to any kind of engineering or technical job where the employees need to have a deep technical knowledge, know how to apply that knowledge practically, and have to use that knowledge to create new solutions to new problems.
I've been personally involved in validity testing for graphic designers, and while the validity coefficients were reduced they were still of practical significance, and had incremental validity over cognitive ability testing (which is always the best predictor, but tends to show racial bias). I will see if I can find any published research as I've not seen any and am now curious myself.
If so, under which circumstances are they used? For a graphic designer, the natural "test" would be for fellow graphic designers and potential managers to look at an applicant's work samples, or to ask them to produce one. This method directly tests the applicant's ability to do the type of job, although there is no objective metric. You are relying on people's subjective assessment. How do cognitive ability tests compare?
Yes, in essence. When I refer to CATs I'm talking about measures testing g (http://en.wikipedia.org/wiki/G_intelligence). And I know it's hard to believe, but g-centric tests like cognitive ability tests do a better job than other seemingly more relevant selection measures like work sample tests and assessment centers. The benefit of work sample tests, assessment centers, integrity tests, etc. is that their validity is decent and that a significant portion of it is independent of g-centric measures.
The difficulty I have with that article is that I don't know how the jobs in these studies map to engineering or research jobs. I'm thinking of jobs where one has accumulated close to a decade's worth of knowledge before starting.
Related to this, I think it's important to consider that this correlation between g and job performance is conditioned on the fact that the person applied for the job. That sounds trivially true at first, but it means that the applicant felt like they were competent for the job (in the best case; in the worst case, it meant they felt that they had a chance of appearing competent at the job). In other words, what we're saying is, "Of the people who thought they could do the job, the smartest ones tended to do the best."
But if our candidate pool was everyone, I'm skeptical that g would still hold as a good predictor. I think I'm a bright guy, but I'm pretty sure I'd make a terrible nuclear engineer. And with that in mind, we may need to keep the non-g related selection around to prevent such a situation.
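A toy simulation makes the worry concrete (everything below is invented for illustration): if performance requires both g and domain knowledge, and only the knowledgeable self-select into applying, then g predicts strongly among applicants but only weakly in the population at large.

    # Toy simulation: performance requires both g and domain knowledge,
    # and only those with the knowledge apply. All parameters invented.
    import random
    random.seed(0)

    population = []
    for _ in range(10_000):
        g = random.gauss(0, 1)
        knowledge = random.random() < 0.05   # few people have the training
        perf = (g + random.gauss(0, 0.5)) if knowledge else random.gauss(-2, 0.5)
        population.append((g, knowledge, perf))

    def corr(pairs):
        xs, ys = zip(*pairs)
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        cov = sum((x - mx) * (y - my) for x, y in pairs)
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    applicants = [(g, p) for g, k, p in population if k]   # self-selected
    everyone = [(g, p) for g, k, p in population]

    print(round(corr(applicants), 2))  # high: g predicts well here
    print(round(corr(everyone), 2))    # near zero in the full population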
You've made a good comment. I can't specifically point you to a study with research or engineering (though I know there have been some that involved academic research performance as an outcome). The finding tends to be that 1) g is more, not less, important for jobs with higher complexity, and 2) job knowledge acts as a mediator in the predicted relationship between g and work performance.
You are right to think that the results would be different if the test was just given on the general population. It's an academic consideration that tends to resolve itself in the field.
And you'll never hear me, or anyone else, suggest that g should be the only predictor, for just the reason you describe. Biographical data (e.g., years of experience, work history) is a much better first hurdle.
The Wikipedia article talks a lot about correlation between performance in different subjects in school.
I've also read that two of the strongest predictors of performance in school are your parents' performance in school, and (after controlling for that) your parents' income.
Do studies of general intelligence usually control for the impact of parents' education and income?
> cognitive ability testing (which is always the best predictor, but tends to show racial bias).
I'm intrigued: is the "bias" because the test is unfair to one or more racial sub-groups, or because the test is "fair" and that is how they actually perform, or is it a language thing?
I do reasonably well on standardised IQ tests but I suspect if I did a German one I might struggle.
It's a bit of a mystery, truly, and bias can mean different things (e.g., slope vs. intercept predictive bias). Note also that predictive bias is not the same thing as mean subgroup differences (e.g., mean score differences of White vs. Black candidates).
An interesting thing is that the predictive bias is reduced for open-form response questions vs. select-one tests. It's an indication that there is more at work than just subgroup differences.
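For anyone unfamiliar with the slope/intercept distinction: predictive bias is usually checked by regressing performance on test score within each subgroup and comparing the fitted lines. A toy sketch with invented data:

    # Toy sketch: check predictive bias by fitting performance-on-score
    # regressions per subgroup and comparing the lines. All data invented.
    def ols(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                 / sum((x - mx) ** 2 for x in xs))
        return slope, my - slope * mx   # (slope, intercept)

    # hypothetical (test score, performance rating) pairs for two groups
    group_a = [(50, 2.9), (60, 3.4), (70, 3.8), (80, 4.3)]
    group_b = [(50, 2.5), (60, 3.0), (70, 3.5), (80, 3.9)]

    slope_a, int_a = ols(*zip(*group_a))
    slope_b, int_b = ols(*zip(*group_b))
    # Near-equal slopes with different intercepts suggest intercept bias:
    # the same score predicts different performance depending on group.
    print(round(slope_a, 3), round(int_a, 2))
    print(round(slope_b, 3), round(int_b, 2))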
When my mother was doing sociology at Uni I read one of her texts (as you do), and it had an example of an IQ test being flawed. They gave the same test to two different groups of children and found that the poorer (working class, as they were grouped in the study) children consistently performed less well.
One of the people looking at the results then looked at the breakdown of questions by group and noticed immediately that questions like "The cup goes on the a) saucer b) floor c) table d) shelf" were consistently "wrong" (the correct answer was a) for the poorer group, at which point he realised that working-class children drank tea from mugs, and saucers were a middle-class thing.
The story might be apocryphal, but it's stuck with me since I was 12-13 whenever I run into any kind of standardised testing/results.
As I'm sure you can tell, this isn't specifically for technical and creative employees, but in general it appears that structured interviews, IQ tests and work samples significantly outperform unstructured interviews.
> If you don't consistently apply it, it becomes worthless.
Why?
Having read Thinking, Fast and Slow and been convinced by this book's many useful views, I would agree that a simple questionnaire-based ranking is actually better than any subjective assessment of candidates. And it should be as neutral as possible, i.e., not letting one's "first impression" influence the ranking (because it is almost always based on irrelevant physical features).
But even then, there is no reason for this neutral evaluation to become worthless if not 100% consistently applied. There is the broken-leg case: if you want to evaluate the probability of someone going to see a movie tonight, you just base yourself on simple statistical facts (how frequently people go to the movies in this country), and should not try to infer more from subjective context, except if the guy in question broke his leg this morning.
These tests and evaluations help much in reducing systematic bias due to the halo effect, the framing effect, and even the time of day for the evaluator (it has been shown judges are more lenient after lunch!), but they are still only helpers; they do not need to be decision-blocking.
Well, legally, consistent application is very important. If getting an 80% is the cutoff for candidate A, and getting a 65% is acceptable for candidate B, then you're setting different standards for different candidates. What if those different standards align with something like gender, or religion? It may not happen on purpose, but it's a problem.
Statistically, you've created a model for consistent application, and if you don't consistently apply it, then the anticipated value of the tools you're using is in question, as the model now has a large source of error involved.
I don't mean to say subjective evaluations have no place in the process. Think of each part of the selection process as an independent stage. The candidates can pass the objective tests, perhaps with a minimum cutoff score, then pass the subjective judgment test. Or vice versa (though that may limit utility further).
> Anyway, there is a reason companies adopt "Big Corp" characteristics as they scale. The primary reason is legal compliance through standardization of process; the other is that the data support the validity of objective predictors of job success relative to subjective judgments.
I can't speak to your second point, but as to your first point, I should note that law firms themselves definitely do not use such HR-driven hiring processes.
If they discard the second point, the first point may become less relevant. Companies that choose not to follow evidence-based hiring practices, instead focusing more on pedigree or a few interviews, may do just fine as long as everyone is treated consistently and their processes don't show observable bias. They do so at their own detriment, but legally they can be OK.
Also worth noting: sometimes the less rigorous way is the better way. Developing validated tests doesn't work as well with smaller numbers, and can be expensive. There are also considerations to be made for the selection ratio (candidates selected over candidates applied). The utility isn't always worth it, and I shouldn't make it out to seem so.
Huh. Hewlett-Packard is as BigCorp as BigCorps come, and we don't use psychometric tests -- at least not here in the software division, and not anywhere else that I've ever heard of.
Here's a very oversimplified model of people that would allow a test to strongly correlate with job performance, but still be wrong.
Let's assume for a moment that when a person is told to do something, and they disagree, 90% of all employees are typically wrong when they disagree and 10% of all employees are typically right when they disagree.
We are considering now just engineering type jobs (since that's what the article is about).
Now consider anti-authoritarianism: 90% of the people who are anti-authoritarian will perform far worse than the typical population (since they will tend to go with their mistaken belief of what is right, or be obstructionist when people want to do it differently, etc.). They will be your least valuable employees.
The remaining 10% of those scoring high on anti-authoritarianism are among the most valuable of your employees, as they won't allow their teammates to go down a dead-end.
So anti-authoritarianism is highly correlated with poor performance (and even with being a trouble-maker), but rejecting all applicants based on it will keep you from getting some of your most valuable employees.
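This model is simple enough to simulate (a toy sketch; every parameter below is invented): the trait comes out strongly correlated with poor performance, yet the very top of the performance distribution is dominated by exactly the people a cutoff on that trait would have screened out.

    # Toy simulation of the model above. All parameters invented.
    import random
    random.seed(1)

    people = []
    for _ in range(10_000):
        anti_auth = random.random() < 0.2        # 20% are anti-authoritarian
        if anti_auth:
            right = random.random() < 0.1        # 10% of those are right
            perf = random.gauss(2.0, 0.3) if right else random.gauss(-1.0, 0.3)
        else:
            perf = random.gauss(0.0, 0.3)
        people.append((anti_auth, perf))

    aa = [p for a, p in people if a]
    rest = [p for a, p in people if not a]
    top100 = sorted(people, key=lambda t: -t[1])[:100]

    print(round(sum(aa) / len(aa), 2))       # well below the others' mean
    print(round(sum(rest) / len(rest), 2))
    print(sum(1 for a, _ in top100 if a))    # yet nearly all top performers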
>>> Properly validated psychometric tests are valid predictors of job performance
Here I have a problem: which job performance? Different jobs obviously demand different qualities; a highly creative but impatient person may be an asset in a job requiring instant creativity but a liability in a job requiring steady repetitive tasks and constant attention. So the test has to be matched to a position. But in the original article, not only did HR give the same test to everybody, the people who decide on positions don't even seem to have information or input on any correlation between position and test results. Given this, I highly doubt such an application of testing can have any meaningful correlation with job performance.
Job performance can be defined many different ways, and the most common in field studies is what is most common in the workplace: supervisory, and possibly peer, evaluations. It's not the perfect criterion, but the "perfect criterion" for many jobs exists only in theory. When appropriate, "work performance" is also defined as sales numbers, widgets produced, time-to-complete, etc. It just depends on the study.
> So the test has to be matched to a position.
In many cases you're right (e.g., work samples), but not always. g-centric tests have demonstrated high validity across jobs and contexts. Tests for cultural fit (in which the desired outcome is typically workplace satisfaction and tenure) are typically validated across a wide range of jobs, and so they are applied across those jobs. Things like integrity tests and personality tests can have meaningful validity across jobs, even without seemingly job-relevant content.
--
[1] PDF of meta-analytic findings about person-job/org fit: http://nreilly.asp.radford.edu/kristof-brown%20et%20al.pdf