The implicit-association test (IAT) is a measure within social psychology designed to detect the strength of a person's automatic association between mental representations of objects (concepts) in memory. The IAT was introduced in the scientific literature in 1998 by Anthony Greenwald, Debbie McGhee, and Jordan Schwartz. The IAT is now widely used in social psychology research and, to some extent, in clinical, cognitive, and developmental psychology research. The IAT is the subject of much controversy regarding precisely what it measures, and the lack of reproducibility of many of its results.
In 1995, social psychology researchers Anthony Greenwald and Mahzarin Banaji asserted that the idea of implicit and explicit memory can apply to social constructs as well. If memories that are not accessible to awareness can influence our actions, associations can also influence our attitudes and behavior. Thus, measures that tap into individual differences in associations of concepts should be developed. This would allow researchers to understand attitudes that cannot be measured through explicit self-report methods due to lack of awareness or social-desirability bias. The first IAT article was published three years later in 1998.
A computer-based measure, the IAT requires that users rapidly categorize two target concepts with an attribute (e.g. the concepts "male" and "female" with the attribute "logical"), such that easier pairings (faster responses) are interpreted as more strongly associated in memory than more difficult pairings (slower responses).
The IAT is thought to measure implicit attitudes: "introspectively unidentified (or inaccurately identified) traces of past experience that mediate favorable or unfavorable feeling, thought, or action toward social objects." In research, the IAT has been used to develop theories to understand implicit cognition (i.e. cognitive processes of which a person has no conscious awareness). These processes may include memory, perception, attitudes, self-esteem, and stereotypes. Because the IAT requires that users make a series of rapid judgments, researchers believe that IAT scores may also reflect attitudes which people are unwilling to reveal publicly. The IAT may allow researchers to get around the difficult problem of social-desirability bias and for that reason it has been used extensively to assess people's attitudes towards commonly stigmatized groups.
Example of a typical IAT procedure:
Task 1 (practice): Press E to classify as Black or I to classify as White
Task 2 (practice): Press E to classify as Pleasant or I to classify as Unpleasant
Tasks 3 and 4 (data collection): Press E to classify as Black or Pleasant, or I to classify as White or Unpleasant
Task 5 (practice): Press E to classify as White or I to classify as Black
Tasks 6 and 7 (data collection): Press E to classify as White or Pleasant, or I to classify as Black or Unpleasant
A typical IAT procedure involves a series of seven tasks. In the first task, an individual is asked to categorize stimuli into two categories. For example, a person might be presented with a computer screen on which the word "Black" appears in the top left-hand corner and the word "White" appears in the top right-hand corner. In the middle of the screen appears a word, such as a first name, that is typically associated with either the "Black" or the "White" category. For each word that appears in the middle of the screen, the person is asked to sort the word into the appropriate category by pressing the appropriate left-hand or right-hand key. On the second task, the person would complete a similar sorting procedure with an attribute of some kind. For example, the word "Pleasant" might now appear in the top left-hand corner of the screen and the word "Unpleasant" in the top right-hand corner. In the middle of the screen would appear a word that is either pleasant or unpleasant. Once again, the person would be asked to sort each word as being either pleasant or unpleasant by pressing the appropriate key. On the third task, individuals are asked to complete a combined task that includes both the categories and attributes from the first two tasks. In this example, the words "Black/Pleasant" might appear in the top left-hand corner while the words "White/Unpleasant" would appear in the top right-hand corner. Individuals would then see a series of stimuli in the center of the screen, each consisting of either a name or a word. They would be asked to press the left-hand key if the name or word belongs to the "Black/Pleasant" category or the right-hand key if it belongs to the "White/Unpleasant" category. The fourth task is a repeat of the third task but with more repetitions of the names, words, or images.
The fifth task is a repeat of the first task with the exception that the position of the two target words would be reversed. For example, "Black" would now appear in the top right-hand corner of the screen and "White" in the top left-hand corner. The sixth task would be a repeat of the third, except that the categories and attributes would be paired in the opposite way from the previous trials. In this case, "Black/Unpleasant" would now appear in the top right-hand corner and "White/Pleasant" would now appear in the top left-hand corner. The seventh task is a repeat of the sixth task but with more repetitions of the names, words, or images. If the categories under study (e.g. Black or White) are associated with the presented attributes (e.g. Pleasant/Unpleasant) to differing degrees, the pairing reflecting the stronger association (or the "compatible" pairing) should be easier for the participant. In the Black/White-Pleasant/Unpleasant example, a participant will be able to categorize more quickly when White and Pleasant are paired together than when Black and Pleasant are paired if he or she has more positive associations with White people than with Black people (and vice versa if Black and Pleasant are categorized more quickly).
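The seven-task layout described above can be written down as a small lookup table. The following Python sketch is illustrative only: the names `BLOCKS`, `block`, and `correct_key` are hypothetical, and the "E"/"I" keys are assumed to stand for the left-hand and right-hand responses in the Black/White, Pleasant/Unpleasant example.

```python
def block(number, practice, left, right):
    """Describe one task: which labels go with the left (E) and right (I) key."""
    return {
        "number": number,
        "practice": practice,
        "E": frozenset(left),
        "I": frozenset(right),
    }


# Tasks 1, 2 and 5 are practice; tasks 3, 4, 6 and 7 are the combined tasks.
BLOCKS = [
    block(1, True, {"Black"}, {"White"}),
    block(2, True, {"Pleasant"}, {"Unpleasant"}),
    block(3, False, {"Black", "Pleasant"}, {"White", "Unpleasant"}),
    block(4, False, {"Black", "Pleasant"}, {"White", "Unpleasant"}),
    block(5, True, {"White"}, {"Black"}),
    block(6, False, {"White", "Pleasant"}, {"Black", "Unpleasant"}),
    block(7, False, {"White", "Pleasant"}, {"Black", "Unpleasant"}),
]


def correct_key(task_number, label):
    """Return the key ("E" or "I") that correctly sorts a label in a given task."""
    task = BLOCKS[task_number - 1]
    for key in ("E", "I"):
        if label in task[key]:
            return key
    raise ValueError(f"{label!r} is not sorted in task {task_number}")
```

For instance, `correct_key(3, "Black")` and `correct_key(6, "Black")` return different keys, which is the reversal that the comparison of response times relies on.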
Variations of the IAT include the Go/No-go Association Test (GNAT), the Brief-IAT and the Single-Category IAT. An idiographic approach using the IAT and the SC-IAT for measuring implicit anxiety showed that personalized stimulus selection did not affect the outcome, reliabilities and correlations to outside criteria.
Valence IATs measure associations between concepts and positive or negative valence. They are generally interpreted as a preference for one category over another. For example, the Race IAT shows that more than 70% of individuals have an implicit preference for Whites over Blacks. On the other hand, only half of Black individuals prefer Blacks over Whites (cf. the earlier "doll experiment" developed by psychologists Kenneth and Mamie Clark during the early civil rights era). Similarly, the Age IAT generally shows that most individuals have an implicit preference for young over old, regardless of the age of the person taking the IAT. Some other valence IATs include the Weight IAT, the Sexuality IAT, the Arab-Muslim IAT, and the Skin-tone IAT.
Stereotype IATs measure associations between concepts that often reflect the strength to which a person holds a particular societal stereotype. For example, the Gender-Science IAT reveals that most people associate women more strongly with liberal arts and men more strongly with science. Similarly, the Gender-Career IAT indicates that most people associate women more strongly with family and men more strongly with careers. The Asian IAT shows that many people more strongly associate Asian Americans with foreign landmarks and European Americans more strongly with American landmarks. Some other stereotype IATs include the Weapons IAT and the Native IAT.
The self-esteem IAT measures implicit self-esteem by pairing "self" and "other" words with words of positive and negative valence. Those who find it easier to pair "self" with positive words than negative words are purported to have higher implicit self-esteem. Generally, measures of implicit self-esteem, including the IAT, are not strongly related to one another and are not strongly related to explicit measures of self-esteem.
The Brief IAT (BIAT) uses a similar procedure to the standard IAT but requires fewer classifications. It involves approximately four to six tasks rather than seven, only uses combined tasks (corresponding most closely to tasks 3, 4, 6, and 7 on the standard IAT), and has fewer repetitions. Additionally, it requires specification of a focal concept in each task as well as a single attribute, instead of two. For example, although White, Black, Pleasant, and Unpleasant stimuli all appear, participants would press one key when White and Pleasant words appear and another key when "anything else" appears. Subsequently, participants would press one key when Black and Pleasant words appear and another key when "anything else" appears.
The Child IAT (Ch-IAT) allows for children as young as four years of age to take the IAT. Rather than words and pictures, the Ch-IAT uses sound and pictures. For example, positive and negative valence are indicated with smiling and frowning faces. Positive and negative words to be classified are voiced out loud to children.
Studies using the Ch-IAT have revealed that six-year-old White children, ten-year-old White children, and White adults have comparable implicit attitudes on the Race IAT.
According to Greenwald, the IAT provides a "window" into a level of mental operation that operates in unthinking (unconscious, automatic, implicit, impulsive, intuitive, etc.) fashion because associations operating without active thought (automatically) can help performance in one of the IAT's two "combined" tasks, while interfering with the other. Respondents to the IAT experience a higher (conscious, controlled, explicit, reflective, analytic, rational, etc.) level of mental operation, when they try to overcome the effects of the automatic associations. The IAT succeeds as a measure because the higher level fails to completely overcome the lower level.
The interpretation that the IAT provides a "window" to unconscious mental contents has been challenged by Hahn and colleagues, whose results indicated that people are highly accurate in predicting their own IAT scores for a variety of social groups.
In 1958, Fritz Heider proposed the balance theory, which stated that a system of liking and disliking relationships is balanced if the product of the valence of all relationships within the system is positive. In the theory, there are concepts and associations. Concepts are persons, groups, or attributes; and among attribute concepts, there are positive and negative valences. Associations are relations between pairs of concepts, and the strength of association is the potential for one concept to activate another, either by external stimuli or by excitation through their associations with other, already active, concepts. The theory followed the assumption of associative social knowledge: an important portion of social knowledge could be represented as a network of variable-strength associations among person concepts (including self and groups) and attributes (including valence).
When two unlinked or weakly linked nodes are linked to the same third node, the association between these two should strengthen. This is the principle of balance-congruity. The nodes in the principle of balance-congruity are equivalent to the concepts in Heider's balance theory, and the three involved nodes/concepts make up a system. Since every relationship within the system here is positively associated, this, according to a derivation of Heider's theory, also represents a balanced system where the product of the direction of all associations within the system is positive.
In 2002, Greenwald and his colleagues introduced the balanced-identity design as a method to test correlational predictions of Heider's balance theory. The balanced identity design incorporated Heider's theory, the balance-congruity principle, and the assumption of centrality of self. The assumption of centrality of self is that in an associative knowledge structure, the self's centrality can be represented by its being associated with many other concepts that are themselves highly connected in the structure. The concepts in a typical balanced identity design are the self, a social group/object, and either a valence attribute or nonvalence attribute. There are thus five important associations possible in a typical balanced identity design that connect these three categories of concepts. An attitude is the association of a social group/object with a valence attribute; a stereotype is the association of a social group with one or more nonvalence attribute(s); self-esteem is the association of the self with a valence attribute; a self-concept is the association of the self with one or more nonvalence attribute(s); and the last important association is between the self and a social group/object, which is called an identity. However, in a typical balanced identity design, only three of the five possible associations come into play, and they are usually either identity, self-concept, and stereotype or identity, self-esteem, and attitude. Researchers using a balanced identity design are the ones to determine the set of concepts they want to investigate, and each one of the associations within the system that the researchers created will then be tested and analyzed statistically with both implicit and explicit measures.
A balanced identity design typically shows that a group's identity is balanced, at least on implicit measures. According to a derivation of Heider's balance theory, since there are three concepts in a typical balanced identity design, the identity is balanced either when all three relations are positive or when one positive and two negative relations are present in the triad. Take the triad "me--male--being good at math" and its typical results on the Implicit Association Test (IAT) as an example. For male subjects, the three associations within the triad are usually all positive. For female subjects, the "me--male" association is usually negative, the "male--being good at math" association is usually positive, and the "me--being good at math" association is usually negative. In both cases the product of the three signs is positive, so the group identities of both male and female subjects are balanced.
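The balance condition for a triad reduces to a sign check: the product of the three association signs must be positive. A minimal Python sketch, assuming each association is coded as +1 (positive) or -1 (negative); the function name `is_balanced` and the dictionary keys are illustrative, not taken from the literature.

```python
from math import prod


def is_balanced(signs):
    """A system is balanced when the product of its association signs is positive."""
    return prod(signs) > 0


# Typical implicit results for the "me / male / being good at math" triad.
male_subjects = {"me-male": +1, "male-math": +1, "me-math": +1}
female_subjects = {"me-male": -1, "male-math": +1, "me-math": -1}

print(is_balanced(male_subjects.values()))    # all three positive
print(is_balanced(female_subjects.values()))  # one positive, two negative
```

Both calls report a balanced triad, whereas a triad with exactly one negative association, such as `[+1, +1, -1]`, does not.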
Self-report measures are also commonly used in a balanced identity design. Although self-reports do not necessarily reflect the consistency patterns predicted by Heider's theory, they are often compared with results from the Implicit Association Test (IAT). A discrepancy between self-report and IAT results for the same association in a balanced identity design can indicate an experience of conflict. The triad "me--male--being good at math" described above is a good example. For female subjects, whereas the IAT typically shows a stronger positive association of "male" and "being good at math," explicit self-report usually shows a weaker positive association, or even a weak negative association, of "male" and "being good at math." Also, whereas the IAT typically shows a stronger negative association of "me" and "being good at math" for the same female subjects, self-report usually shows a weaker negative, or even a weak positive, association of "me" and "being good at math." In this case, the female group is believed to be experiencing a conflict. A common explanation is that when a group is trying to change a long-standing societal stereotype, its members may believe that they have rejected the stereotype (as shown in explicit measures) while the stereotypical association persists (as shown in implicit measures), though perhaps less strongly than among people who endorse the stereotype. On this account, the conflict should fade as the stereotype itself gradually fades.
The IAT has been widely used as a measure in the balanced identity design because the consistency patterns predicted by Heider's theory were strongly apparent in implicit measures obtained with the IAT but not in parallel explicit measures obtained by self-report. The general explanation is that self-report measures can go astray when respondents are either unwilling or unable to report accurately, and these problems could be more than enough to obscure the operation of consistency processes. The approach still has limitations, however. For example, balanced identity IAT measures give only group-level results rather than individual results, which limits analyses that require individual-level data, such as how balanced one person's identity is relative to others'. Researchers working with the IAT are trying to overcome challenges such as this one.
The IAT has engendered some controversy in both the scientific literature and in the public sphere (e.g. in the Wall Street Journal). For example, it has been interpreted as assessing familiarity, perceptual salience asymmetries, or mere cultural knowledge irrespective of personal endorsement of that knowledge. A more recent critique argued that there is a lack of empirical research justifying the diagnostic statements that are given to the lay public. For instance, feedback may report that someone has a [slight/moderate/strong] automatic preference for [European Americans/African Americans]. Proponents of the IAT have responded to these charges, but the debate continues. According to The New York Times, "there isn't even that much consistency in the same person's scores if the test is taken again". In addition, researchers have recently claimed that results of the IAT might be biased by a participant's limited cognitive capacity to adjust to switching categories, thus biasing results in favor of the first category pairing (e.g. pairing "Asian" with positive stimuli first, instead of pairing "Asian" with negative stimuli first).
According to Jesse Singal, some of these issues have been settled in the research literature, but others continue to inspire debate among researchers and lay people alike.
Since the IAT's introduction into the scientific literature in 1998, a great deal of research has examined its psychometric properties and addressed other criticisms of its validity and reliability.
The IAT is purported to measure relative strength of associations. However, some researchers have asserted that the IAT may instead be measuring constructs such as salience of attributes or cultural knowledge.
A recent meta-analysis has concluded that the IAT has predictive validity independent of the predictive validity of explicit measures. However, a follow-up meta-analysis questioned some of these results, finding that implicit measures were only weakly predictive of behaviors and no better than explicit measures. Some research has found that the IAT tends to be a better predictor of behavior in socially sensitive contexts (e.g. discrimination and suicidal behaviour) than traditional "explicit" self-report methods, whereas explicit measures tend to be better predictors of behavior in less socially sensitive contexts (e.g. political preferences). Specifically, the IAT has been shown to predict voting behavior (e.g. ultimate candidate choice of undecided voters), mental health (e.g. a self-injury IAT differentiated between adolescents who injured themselves and those who did not), medical outcomes (e.g. medical recommendations by physicians), employment outcomes (e.g. interviewing Muslim-Arab versus Swedish job applicants), and education outcomes (e.g. gender-science stereotypes predict gender disparities in nations' science and math test scores).
In applied settings, the IAT has been used in marketing and industrial psychology. For example, in determining the predictors of risk-taking behaviour of pilots in general aviation, attitudes towards risky flight behaviour as measured through an IAT have been shown to be a more accurate predictor of risky flight behaviour than traditional explicit attitude or personality scales. The IAT has also been used in clinical psychology research to test the hypothesis that implicit associations may be a causal factor in the development of anxiety disorders.
Researchers have argued that the IAT may measure salience of concepts rather than associations. Whereas IAT proponents claim that faster response times when pairing concepts indicate stronger associations, critics claim that faster response times indicate that concepts are similar in salience (and slower response times indicate that concepts differ in salience). There is some support for this claim. For example, in an old-young IAT, old faces would be more salient than young faces. As a result, researchers created an old-young IAT that involved pairing young and old faces with neutral words (non-salient attribute) and non-words (salient attribute). Response times were faster when old faces (salient) were paired with non-words (salient) than when old faces (salient) were paired with neutral words (non-salient), supporting the assertion that faster response time can be facilitated by matching salience.
Although proponents of the IAT acknowledge that it may be influenced by salience asymmetry, they argue that this does not preclude interpreting the IAT as a measure of associations.
Another criticism of the IAT is that it may measure associations that are picked up from cultural knowledge rather than associations actually residing within a person. The counter-argument is that such associations may indeed arise from the culture, but they can nonetheless influence behavior.
To address the possibility that the IAT picks up on cultural knowledge rather than beliefs that are present in a person, some critics of the standard IAT created the personalized IAT. The primary difference between a standard valence IAT and the personalized IAT is that rather than using pleasant and unpleasant words as category labels, it uses "I like" and "I don't like" as category labels. Additionally, the Personalized IAT does not provide error feedback for an incorrect response as in the standard IAT. This form of the IAT is more strongly related to explicit self-report measures of bias.
Proponents of the standard IAT argue that the Personalized IAT increases the likelihood that those taking it will evaluate the concept rather than classify it. This would increase its relationship with explicit measures without necessarily removing the effect of cultural knowledge. In fact, some researchers have examined the relationship between perceptions of general American attitudes and Personalized IAT scores and have concluded that the relationship between the IAT and cultural knowledge is not decreased by personalizing it. However, it is important to note that there was no relationship between cultural knowledge and standard IAT scores either.
The IAT has also demonstrated a reasonable amount of resistance to social-desirability bias. Individuals asked to fake their responses on the IAT have demonstrated difficulty in doing so in some studies. For example, participants who were asked to present a positive impression of themselves were able to do so on a self-report measure of anxiety but not an IAT measuring anxiety. Nonetheless, faking is possible, and recent research indicates that the most effective method of faking the IAT is to intentionally slow down responses for pairings that should be relatively easy. Most subjects, however, do not discover this strategy on their own, so faking is relatively rare. An algorithm developed to estimate IAT faking can identify those who are faking with approximately 75% accuracy.
A recent study showed that participants can even speed up their responses during the relatively difficult response pairings in an autobiographical implicit association test, which aims to test the veracity of autobiographical statements. Specifically, participants who were instructed to speed up their responses to fake the test were able to do so, and the effect was larger when participants were trained in speeding up. Most importantly, guilty participants who sped up their responses during the difficult response pairing successfully beat the test and obtained an innocent result. In other words, participants can reverse their test outcome without being detected, which poses new challenges for the IAT.
Distinct from faking (the deliberate obscuring of a true association), some studies have shown that heightening awareness about the nature of the test can change the outcome, potentially by activating different fluencies and associations. For example, in one study, a simple reminder from the experimenter ("Please be careful not to stereotype on the next section of the task") was sufficient to significantly reduce the expression of biased associations on a race IAT. Notably, there was not a significant decrease in overall reaction time in this experiment, indicating that this "control" may also be implicit.
A common criticism of the IAT is that it may be difficult to associate positive attributes with less familiar concepts. For example, if a person has had less contact with members of a particular ethnic group, he or she may have a more difficult time associating members of that ethnic group with positive words simply because of this lack of familiarity. There is some evidence against the familiarity explanation, based on studies that have ensured equal familiarity with the African American and White names as well as the faces appearing on the Race IAT.
As the IAT relies on a comparison of response times in different tasks pairing concepts and attributes, researchers and others taking the IAT have speculated that the pairing on the first combined task may affect performance on the next combined task. For example, a participant who begins a gender stereotype IAT by pairing female names with family words may subsequently find the task of pairing female names with career words more difficult. Research has indeed shown a small effect of order. As a result, it is recommended to increase the number of classifications required in the fifth IAT task. This gives participants more practice before doing the second pairing, thus reducing the order effect. When studying groups of people, this effect can be countered by counterbalancing the order of the pairings across participants (e.g. half of participants pair female names with family words first, and the other half pair female names with career words first).
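Splitting the pairing order across participants can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a published procedure: the function name `assign_orders`, the order labels, and the use of a seeded shuffle are all hypothetical choices.

```python
import random


def assign_orders(participant_ids, seed=0):
    """Randomly give half of the participants each pairing order."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {
        pid: "family_first" if position < half else "career_first"
        for position, pid in enumerate(ids)
    }


orders = assign_orders(range(10))
```

With an even number of participants the two orders are used equally often; with an odd number, the second order receives one extra participant.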
The IAT is influenced by individual differences in average IAT response times such that those with slower overall response times tend to have more extreme IAT scores. Older subjects also tend to have more extreme IAT scores, and this may be related to cognitive fluency, or slower overall response times.
An improved scoring algorithm for the IAT, which reduces the effect of cognitive fluency on the IAT, has been introduced. A summary of the scoring algorithm can be found on Greenwald's webpage.
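The idea of reducing the influence of overall response speed can be illustrated with a standardized difference score: the difference between the mean latencies of the two combined pairings, divided by the standard deviation of all latencies from both pairings. The Python sketch below is a simplified illustration only. It omits the trial-exclusion and error-handling rules of the published algorithm, and the function name and sample latencies are hypothetical.

```python
from statistics import mean, stdev


def standardized_difference(latencies_a, latencies_b):
    """Mean latency difference (b minus a) in units of the pooled standard deviation."""
    pooled_sd = stdev(list(latencies_a) + list(latencies_b))
    return (mean(latencies_b) - mean(latencies_a)) / pooled_sd


# Hypothetical latencies in milliseconds for the two combined pairings.
easier_pairing = [600, 650, 700, 750]
harder_pairing = [800, 850, 900, 950]

score = standardized_difference(easier_pairing, harder_pairing)

# A participant who is uniformly twice as slow receives the same score.
slow_score = standardized_difference(
    [2 * t for t in easier_pairing], [2 * t for t in harder_pairing]
)
```

Because both the numerator and the denominator scale with overall speed, a uniformly slower responder is not automatically assigned a more extreme score, unlike with a raw difference in milliseconds.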
Repeated administrations of the IAT tend to decrease the magnitude of the effect for a particular person. This issue is somewhat ameliorated with the improved scoring algorithm. An additional safeguard to control for IAT experience is to include a different type of IAT as a comparison. This allows researchers to evaluate the degree of magnitude decrease when administering subsequent IATs.
The IAT's internal consistency varies, and its test-retest reliability stands at 0.60, a relatively weak level. IAT scores do seem to vary between multiple administrations, indicating that the test may measure a combination of trait (stable characteristics of people) and state (subject to variation based on situation-specific circumstances) characteristics. One example of a state effect is that scores on the Race IAT are known to be less biased against African Americans when those taking it imagine positive Black exemplars beforehand (e.g. Martin Luther King). Similarly, the Race IAT scores for an individual may indicate bias, but that bias is diminished on another IAT administered after associating with a mixed-race group. In fact, Race IAT scores can be changed even more easily; administering the IAT in different languages yields significantly different scores for bilingual individuals. For example, studies conducted with Moroccan participants fluent in both French and Arabic showed that participants are biased when completing an IAT in their native language; however, that bias is diminished when completing an IAT in another language. Similar results were found in the United States when administering an English and a Spanish IAT to bilingual Hispanic Americans. Another state characteristic that may influence IAT scores is the time of day at which a person completes the task: preference for one's own racial group has been found to be lowest in the morning and to increase over the course of the day and into the evening. However, this may have more to do with who completes the task at each time of day than with circadian rhythms.
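Test-retest reliability is commonly computed as the Pearson correlation between the scores that the same people obtain on two administrations. The Python sketch below shows that calculation. The function name `pearson_r` and the score lists are hypothetical, made up purely for illustration rather than drawn from any study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical scores for five people on a first and a second administration.
first_session = [0.10, 0.45, 0.30, 0.80, 0.55]
second_session = [0.25, 0.30, 0.50, 0.60, 0.35]

reliability = pearson_r(first_session, second_session)
```

A value near 1 would mean that people keep their rank order across sessions; lower values mean that scores shift between administrations, whether from measurement noise or from state effects.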
After establishing the IAT in the scientific literature, Greenwald, along with Mahzarin Banaji (Professor of Psychology at Harvard University) and Brian Nosek (Associate Professor of Psychology at the University of Virginia), co-founded Project Implicit, a virtual laboratory and educational outreach organization that facilitates research on implicit cognition.
The IAT has been profiled in major media outlets (e.g. in the Washington Post) and in the popular book Blink, where it was suggested that one could score better on the implicit racism test by visualizing respected black leaders such as Nelson Mandela. The IAT was also discussed in a 2006 episode of The Oprah Winfrey Show.
In the episode "Racist Dawg" on King of the Hill, Hank and Peggy take an IAT, colloquially referred to as the "racist test" to see if they prefer the company of white or black people.