3.1. Covariation illusions
In order to reason well about social matters, we need to be able to reliably
detect correlations. But in a classic series of studies, Chapman and
Chapman (1967, 1969) found that we can be quite bad at this on tasks that
represent the ordinary challenges facing us. We often don’t recognize
covariations that exist, particularly when they do not conform to our
background beliefs; and we often report covariations where there are none,
particularly when we expect there to be covariation. In the past, many
psychologists used Draw-a-Person (or DAP) tests to make initial diagnoses.
The Amazing Success of Statistical Prediction Rules 37
It was thought that patients’ disorders could be diagnosed from their
drawings of people. For example, it was thought that paranoid patients
would draw large eyes; the drawings of impotent patients would emphasize
male genitalia or would be particularly macho. By the mid-1960s, it was
well known that DAP tests were bunk. There are no such correlations. And
yet clinicians continued to use them. Chapman and Chapman (1967) asked
clinicians who used the DAP test to describe the features of patients’
drawings they thought were associated with six diagnoses. Once they had
these reports, Chapman and Chapman obtained 45 DAP drawings made by
patients in a state hospital and randomly paired those drawings with the six
diagnoses. Each drawing-diagnosis pair was then presented to introductory
psychology students for 30 seconds, and then the students were asked to
report which features of the drawings were most frequently associated with
each diagnosis. Even though there were no systematic relationships in the
data, subjects claimed to detect covariations. Further, they were virtually
the same covariations the clinicians claimed to find in real data! It is
plausible to suppose in this case that widely shared background assumptions
(or perhaps just thoughtless stereotypes) led both expert clinicians
and naïve subjects to “see” covariations in data that simply weren’t there.
Interestingly, when Chapman and Chapman built in massive negative covariations
between the features of the drawings and the diagnoses subjects
were likely to make, naïve subjects still reported positive covariations,
though somewhat reduced in magnitude.
In another fascinating study, Chapman and Chapman focused on the
famous Rorschach test. While most of the associations clinicians have
believed they detected in Rorschach tests are actually not present, it turns
out that two responses to the Rorschach test are correlated with male
homosexuality. However, these responses are not particularly “face valid”
(i.e., they do not strike most people as particularly intuitive). For example,
male homosexuals are not more likely to identify in the Rorschach blots
feminine clothing, anuses or genitalia, or humans with confused or uncertain
sexes. In fact, homosexual men more frequently report seeing monsters
on Card IV and a part-human-part-animal on Card V. (Again, Chapman
and Chapman found that clinicians of the day believed there was a significant
correlation between the “face valid” signs and homosexuality. Only 2
of the 32 clinicians they polled even listed one of the valid signs.) In the
1969 study, naïve subjects were given 30 cards with traits (homosexual or
nonhomosexual) on one side and Rorschach responses on the other (a valid sign,
an invalid but “face valid” sign, or a filler sign) and were given 60 seconds
to review each card. Even though the cards contained no correlations between
the traits and the Rorschach responses, subjects reported frequent
correlations between the “face valid” signs and homosexuality. This finding
essentially replicates the DAP test result.
Next, Chapman and Chapman changed the cards so that the valid signs
were associated more often with homosexuality than were the other signs.
Even when the valid signs were associated with homosexuality 100% of the
time, naïve observers failed to detect the covariation. So it’s not just that
subjects see correlations when there are none. In fact, we often don’t see
correlations that are actually there, and sometimes we see positive correlations
when in fact the correlations are negative.
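To make the notion of covariation between a sign and a trait concrete, here is a minimal sketch that computes the phi coefficient, a standard measure of association for a 2x2 table, over a deck of cards. The three 30-card decks are hypothetical and chosen only for illustration; they are not Chapman and Chapman's actual materials, and the function name `phi_coefficient` is our own.

```python
from math import sqrt


def phi_coefficient(cards):
    """Phi coefficient for a list of (sign_present, trait_present) pairs."""
    a = sum(1 for s, t in cards if s and t)          # sign and trait
    b = sum(1 for s, t in cards if s and not t)      # sign, no trait
    c = sum(1 for s, t in cards if not s and t)      # no sign, trait
    d = sum(1 for s, t in cards if not s and not t)  # neither
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column is empty")
    return (a * d - b * c) / denom


# Hypothetical decks (illustrative counts, not data from the studies).
# No association: the sign appears equally often with and without the trait.
no_association = ([(True, True)] * 5 + [(True, False)] * 5
                  + [(False, True)] * 10 + [(False, False)] * 10)

# The sign is paired with the trait 100% of the time it appears.
sign_always_with_trait = ([(True, True)] * 10
                          + [(False, True)] * 5 + [(False, False)] * 15)

# The sign is never paired with the trait (a negative covariation).
sign_never_with_trait = ([(True, False)] * 10
                         + [(False, True)] * 15 + [(False, False)] * 5)

print(round(phi_coefficient(no_association), 3))          # 0.0
print(round(phi_coefficient(sign_always_with_trait), 3))  # 0.707
print(round(phi_coefficient(sign_never_with_trait), 3))   # -0.707
```

A value near zero corresponds to the decks in which subjects nonetheless reported covariation. Strongly positive and strongly negative values correspond to the conditions in which subjects missed a real covariation or reported one of the wrong sign.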
It should be noted that Chapman and Chapman did not draw particularly
pessimistic conclusions from their experiments. Nor do we. In
fact, when Chapman and Chapman took out the misleading invalid signs,
subjects were capable of detecting the actual covariations in the data. Nisbett
and Ross (1980) draw the following conclusion from these experiments:
[R]eported covariation was shown to reflect true covariation far less than it
reflected theories or preconceptions of the nature of the associations that
“ought” to exist. Unexpected, true covariations can sometimes be detected
but they will be underestimated and are likely to be noticed only when the
covariation is very strong, and the relevant data set excludes “decoy features”
that bring into play popular but incorrect theories. (97)
When it comes to social judgment, the evidential situation is likely to be
quite complex, with many signs that are valid but counterintuitive and
other signs that are “face valid” but not predictive. In such an environment,
we are not likely to do a particularly good job of detecting covariations.
And so, unless the theories, background assumptions, and stereotypes
we bring to a particular prediction are accurate, we are not likely to be very
good at identifying what cues are most likely to covary with and so predict
our target property.