Can We Really Measure Implicit Bias? Maybe Not
Harvard’s Project Implicit website has informed millions of visitors about their racial prejudices. It has also fueled a decade-long academic feud.
I harbor a moderate preference for white faces. You probably do, too: About 70 percent of people who take the race version of the Implicit Association Test show the same tendency — that is, they prefer faces with typically European-American features over those with African-American features. Since it first went online in 1998, millions have visited Harvard’s Project Implicit website, and the results have been cited in thousands of peer-reviewed papers. No other measure has been as influential in the conversation about unconscious bias.
That influence extends well beyond the academy. The findings come up often in discussions of police shootings of black men, and the concept of implicit bias circulated widely after Hillary Clinton mentioned it during the presidential campaign. The test provides scientific grounding for the idea that unacknowledged prejudice often lurks just below society’s surface. “When we relax our active efforts to be egalitarian, our implicit biases can lead to discriminatory behavior,” according to the Project Implicit website, “so it is critical to be mindful of this possibility if we want to avoid prejudice and discrimination.”
In other words, beware your inner bigot.
But the link between unconscious bias, as measured by the test, and biased behavior has long been debated among scholars, and a new analysis casts doubt on the supposed connection.
Researchers from the University of Wisconsin at Madison, Harvard, and the University of Virginia examined 499 studies, conducted over 20 years and involving 80,859 participants, that used the IAT and other, similar measures. They reached two conclusions. First, the correlation between implicit bias and discriminatory behavior appears weaker than previously thought. Second, there is very little evidence that changes in implicit bias have anything to do with changes in a person’s behavior. These findings, they write, “produce a challenge for this area of research.”
That’s putting it mildly. “When you actually look at the evidence we collected, there’s not necessarily strong evidence for the conclusions people have drawn,” says Patrick Forscher, a co-author of the paper, which is currently under review at Psychological Bulletin. The finding that changes in implicit bias don’t lead to changes in behavior, Forscher says, “should be stunning.”
Hart Blanton was not stunned. For the last decade, Blanton, a professor of psychology at the University of Connecticut, has been arguing that the Implicit Association Test isn’t all it’s cracked up to be. In a 2013 meta-analysis of papers, Blanton and his co-authors declared that, despite its frequent characterization as a window into the unconscious, “the IAT provides little insight into who will discriminate against whom, and provides no more insight than explicit measures of bias.” (By “explicit measures” they mean simply asking people if they are biased against a particular group.)
The test works by measuring how quickly people can, for instance, associate African-American faces with positive words versus European-American faces with those same positive words. In one round of the test, you’re instructed to press a particular key if a positive word like “pleasure” or “wonderful” flashes on the screen and to press that same key if a white face appears. Then, in another round, the program will tell you to press the same key for darker faces and positive words. It tracks how many mistakes you make and measures how quickly you press those keys, right down to fractions of a second. The site also offers tests that measure bias against other groups, including obese people, the disabled, and the elderly, though it’s the race results that tend to dominate the discussion.
It generally takes people longer to associate a positive word with an African-American face than a European-American face. What’s uncanny is that the test usually works even on people who, like me, know what’s being measured ahead of time and are doing their best to answer at the same speed so as not to be deemed biased.
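To make the mechanics concrete, here is a minimal sketch of the core comparison: the gap between a test taker's average response time in the two rounds. It is an illustration only, not Project Implicit's scoring procedure. The response times, the variable names, and the `mean_latency_gap` function are all hypothetical, and the sketch ignores error counts entirely.

```python
from statistics import mean

# Hypothetical response times in milliseconds for one test taker.
# Round A: positive words share a key with European-American faces.
# Round B: positive words share a key with African-American faces.
round_a_ms = [612, 587, 640, 598, 605]
round_b_ms = [701, 688, 735, 694, 710]


def mean_latency_gap(first_round, second_round):
    """Return mean(second_round) - mean(first_round), in milliseconds."""
    return mean(second_round) - mean(first_round)


gap = mean_latency_gap(round_a_ms, round_b_ms)

# A positive gap means the second pairing took longer on average.
print(f"Mean latency gap: {gap:.1f} ms")
```

A larger positive gap in this toy version would correspond to a slower average response when positive words are paired with African-American faces, which is the pattern the article describes.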
But those results, Blanton has been saying in paper after paper, year after year, don’t tell us much, if anything.
For the record, Blanton is a 49-year-old white guy who considers himself a liberal and became a psychologist because of an early interest in social justice. A journalist once referred to him as a “conservative intellectual,” which Blanton jokes is wrong on both counts.
Over coffee recently, he sketched out an analogy in his notebook. He drew a graph illustrating how high IQ scores tend to predict achievement, a claim backed up by reams of data. In contrast, the IAT — a sort of IQ test for bias — doesn’t reveal whether a person will tend to act in a biased manner, nor are the scores on the test consistent over time. It’s possible to be labeled “moderately biased” on your first test and “slightly biased” on the next. And even within those categories the numbers fluctuate in a way that, Blanton contends, undermines the test’s value. “The IAT isn’t even predicting the IAT two weeks later,” Blanton says. “How can a test predict behavior if it can’t even predict itself?”
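Blanton's point about the test failing to predict itself is a claim about test-retest reliability, which is commonly summarized as the correlation between scores from two sessions. The sketch below computes a plain Pearson correlation for that purpose. The scores are invented for illustration and are not drawn from any IAT study, and `pearson_r` is a hypothetical helper, not part of any published analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented scores for six people tested twice, two weeks apart.
first_session = [0.10, 0.45, 0.30, 0.62, 0.05, 0.51]
second_session = [0.35, 0.20, 0.41, 0.38, 0.22, 0.15]

r = pearson_r(first_session, second_session)
print(f"Test-retest correlation: {r:.2f}")
```

A value near 1 would mean people keep roughly the same relative standing from one session to the next. A value near 0 would mean the first score says little about the second.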
Anthony G. Greenwald doesn’t think much of Blanton’s critique — or, it seems, of Blanton himself. He is the co-author, along with Mahzarin R. Banaji, of the 2013 best seller Blind Spot: Hidden Biases of Good People (Delacorte), a book that’s based on the IAT, a test the two helped create. Greenwald, a psychology professor at the University of Washington, points to errors he found in a recent paper of Blanton’s as proof that his results are not to be trusted. Blanton, for his part, says that the mistakes were the result of a copy-editing error and that they didn’t affect the thrust of the article. The two engaged in a cordial email back-and-forth about the errors, though neither is impressed by the other’s rigor. “He’s not a great scientist,” Greenwald says.
Greenwald also raised the possibility that Blanton and other critics of the IAT, which Greenwald has overseen for two decades, are motivated by the paid work they do as consultants in legal cases involving discrimination and implicit bias. Blanton said he’s worked as an expert consultant on two IAT-related legal cases over the years, and maintains that “the idea I started doing this in 2003 because I thought there would be some payoff is ludicrous.” (Greenwald has taken consulting gigs as well, working on 20 or so such cases, he says.)
Banaji, a professor of psychology at Harvard, doesn’t question Blanton’s motives. She does, however, point to the multitude of papers that have made use of the implicit-bias measure, and the relatively few that have questioned its accuracy. In an email, she likened IAT doubters to climate-change deniers. “I’m sure Hart Blanton believes himself to be saving humanity from the dangers of the IAT,” she wrote, noting that Blanton has “dedicated so many precious years of his career to improving our work.”
You might forgive Banaji and Greenwald for sounding annoyed by Blanton’s quibbles and broadsides. They have been responding to them in peer-reviewed journals and with reporters since George W. Bush was in the White House. And, in fairness, Blanton has been known to toss off some barbed remarks about the test to which they’ve dedicated so many precious years of their own careers. When we met, Blanton compared the IAT to a Facebook quiz that tells you which Disney princess you’re most like — “though that at least has some marketing data behind it.”
What’s striking, though, is how, in some respects, their conclusions about the IAT don’t seem all that far apart. Greenwald acknowledges that a person’s score can vary significantly, depending on when the test is taken, and he doesn’t think it’s reliable enough to be used to, say, select bias-free juries. “We do not regard the IAT as diagnosing something that inevitably results in racist or prejudicial behavior,” he says.
Everyone agrees that the statistical effect linking bias to behavior is slight. They only disagree about how slight. Blanton’s 2013 meta-analysis found less of a link than a 2009 meta-analysis by Banaji and Greenwald. Blanton sees the correlation as so small as to be trivial. Banaji and Greenwald, in a 2015 paper, argue that “statistically small effects” can have “societally large effects.”
The new analysis seems to bolster Blanton’s less-sanguine take. It found that the correlation between implicit bias and behavior was even smaller than what Blanton had reported. That came as a surprise, the researchers write.
Another surprise is that one of the co-authors of the paper is Brian Nosek, who is — along with Greenwald and Banaji — one of the three founders of the IAT. Nosek, best known these days as the director of the Center for Open Science and an advocate for better research practices, is well aware that this paper will provide aid and comfort to critics of the test he helped create. “It sometimes shocks people when I say that the two people I have disagreed with most in my career are Mahzarin and Tony,” Nosek wrote in an email.
He does defend the IAT, noting that it’s engaged millions of people in a conversation about the science of bias. He points to the test’s successes, like experiments that show how it can predict who someone would favor in a presidential election by tracking their associations. But what he calls the “very weak overall” connection between implicit bias and discriminatory behavior should, he believes, put researchers on notice. “You would think that if you change the associations, and the associations predict behavior, then the behavior would change too,” Nosek says. “But the evidence is really limited on it.”
Patrick Forscher, who shares the title of first author of the paper with Calvin Lai, a Harvard postdoc, thinks that there’s been pressure on researchers over the years to make the science of implicit bias sound more definitive and relevant than the evidence justifies. “A lot of people want to know, How do we tackle these disparities?” says Forscher, a postdoc at the University of Wisconsin at Madison. “It makes us feel important to say, Aha, we have these measures that can tell us what the problem is, and, not only that, we can tell them how to fix the problem.”
That’s essentially Blanton’s argument as well. Public discussion about implicit bias has been based largely on the results from one particular test, and that test, in his view, has been falsely sold as solid science. “They have engaged the public in a way that has wrapped the feeling of science and weight around a lot of ‘cans’ and ‘maybes,’” Blanton says. “Most of your score on this test is noise, and what signal there is, we don’t know what it is or what it means.”
Blanton is not saying there’s no such thing as unconscious bias, nor is he arguing that racial discrimination isn’t a deep and abiding problem in American life (though at least one white-supremacist-friendly website has mentioned his research in an attempt to make that case — illustrating how such discussions can be misconstrued). He just thinks that scientists don’t know how to measure implicit bias with any confidence and that they shouldn’t pretend otherwise. “It is such an important problem that it deserves a stronger science,” he says.
Forscher hopes the discussion will move beyond the long-running, sometimes caustic back-and-forth over the IAT. He wants to focus on understanding the root causes of discrimination in order to combat its pernicious effects. As part of that mission, he has for several years helped train police officers in Madison on bias. He intends to continue that work while also trying to figure out how best to go about it.
“I see implicit bias as a potential means to an end, something that tells us what to do and some possible remedies for what we see in the world,” Forscher says. “So if there’s little evidence to show that changing implicit bias is a useful way of changing those behaviors, my next question is ‘What should we do?’”
Tom Bartlett is a senior writer at The Chronicle.