1.2. Bootstrapping models: Experts vs. virtual experts
A proper linear model assigns weights to cues so as to optimize the relationship
between those cues and the target property in a data set. An improper
linear model assigns its weights in some other way, so it does not best fit
the available data. Bootstrapping
models are perhaps the most fascinating kind of improper linear models.
These are proper linear models of a person’s judgments. Goldberg (1970)
constructed the classic example of a bootstrapping model. Many clinical
psychologists have years of training and experience in predicting whether a
psychiatric patient is neurotic or psychotic on the basis of a Minnesota
Multiphasic Personality Inventory (MMPI) profile. The MMPI profile
consists of 10 clinical (personality) scales and a number of validity scales.
Goldberg asked 29 clinical psychologists to judge, only on the basis of an
MMPI profile, whether a patient would be diagnosed as neurotic or psychotic.
Goldberg then constructed 29 proper linear models, one to mimic each
psychologist’s judgments. The predictor cues were the scores on the
scales of the MMPI profile; the target property was the psychologist’s predictions.
Weights were assigned to the cues so as to best fit the psychologist’s judgments
about whether the patient was neurotic or psychotic. So while a
bootstrapping model is a proper linear model of a human’s judgments,
it is an improper linear model of the target property—in this case, the
patient’s condition.
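To make the two-step construction concrete, here is a minimal sketch in Python (using numpy). The synthetic data, the scale scores, and the use of ordinary least squares are all illustrative assumptions, not Goldberg’s actual materials or procedure; what matters is the structure: the model is fitted to the expert’s judgments, and the true diagnoses never enter the fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins for the real inputs (illustrative, not Goldberg's
# data): 200 patients, 10 MMPI-style scale scores each, and an expert's
# neurotic/psychotic call for every patient.
X = rng.normal(size=(200, 10))                  # predictor cues
expert_calls = (X @ rng.normal(size=10)
                + rng.normal(size=200) > 0.0).astype(float)  # 1 = psychotic

# Step 1: fit a proper linear model OF THE EXPERT. Least-squares weights
# are chosen to best reproduce the expert's judgments; the patients'
# true conditions never enter the fit.
Xb = np.column_stack([X, np.ones(len(X))])      # add an intercept column
weights, *_ = np.linalg.lstsq(Xb, expert_calls, rcond=None)

# Step 2: the bootstrapping model now judges a new profile on its own.
new_profile = np.append(rng.normal(size=10), 1.0)
model_calls_psychotic = float(new_profile @ weights) > 0.5
print("model calls patient psychotic:", model_calls_psychotic)
```

Note how the sketch reflects the definitions above: the fitted model is proper with respect to the expert (its weights optimally fit the expert’s calls) but improper with respect to the target property (nothing was optimized against the patients’ actual conditions).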
One might expect that the bootstrapping model would predict reasonably
well. It is built to mimic a fairly reliable expert, so we might expect
it to do nearly as well as the expert. In fact, the mimic is more reliable than
the expert. Goldberg found that in 26 of the 29 cases, the bootstrapping
model was more reliable in its diagnoses than the psychologist on which it
was based! (For other studies with similar results, see Wiggins and Kohen
1971, Dawes 1971.) This is surprising. The bootstrapping model is built to
ape an expert’s predictions. And it will occasionally be wrong about the
expert. But when it is wrong about the expert, that is, when model and
expert disagree, it is the model that is more likely to be right about the
target property!
At this point, it is natural to wonder why the bootstrapping model is
more accurate than the person on which it is based. In fact, it seems
paradoxical that this could be true: If the bootstrapping model “learns” to
predict from an expert, how can the model “know” more than the expert?
This way of putting the finding makes it appear that the model is adding
some kind of knowledge to what it learns from the expert. But how on
earth can that be? The early hypothesis for the success of the bootstrapping
model was not that the model was adding something to the expert’s knowledge
(or reasoning competence), but that the model was adding something
to the expert’s reasoning performance. In particular, the hypothesis
was that the model did not fall victim to performance errors (errors that
were the result of lack of concentration or a failure to properly execute
some underlying predictive algorithm). The idea was that bootstrapping
models somehow capture the underlying reliable prediction strategy humans
use; but since the models are not subject to extraneous variables that
degrade human performance, the models are more accurate (Bowman
1963, Goldberg 1970, Dawes 1971). This is a relatively flattering hypothesis,
in that it grants us an underlying competence in making social judgments.
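A small simulation makes the noise-averaging idea behind this hypothesis concrete. Everything in it is an illustrative assumption: we simply posit an expert who applies a sound linear rule corrupted by transient noise, fit a linear model to those noisy judgments (numpy’s least squares standing in for the original fitting procedure), and check both against the target.

```python
import numpy as np

rng = np.random.default_rng(1)

# Posit an expert who really does apply a sound linear rule, but whose
# individual judgments are perturbed by transient execution noise.
n, k = 500, 10
X = rng.normal(size=(n, k))                 # cues for n patients
rule = rng.normal(size=k)                   # the expert's underlying rule
signal = X @ rule
diagnosis = signal > 0                      # the target property

# Each judgment = underlying rule + momentary noise.
judgments = signal + rng.normal(scale=2.0, size=n) > 0

# Regressing the judgments on the cues pools hundreds of cases, so the
# noise tends to cancel and the fitted weights approximate the rule.
w, *_ = np.linalg.lstsq(X, judgments.astype(float), rcond=None)
model = X @ w > 0.5

print("expert agrees with diagnosis:", (judgments == diagnosis).mean())
print("model agrees with diagnosis: ", (model == diagnosis).mean())
```

On most random seeds the fitted model agrees with the true diagnosis more often than the expert does: because the regression pools hundreds of judgments, the transient component tends to cancel, and the recovered weights track the posited rule more faithfully than any single noisy judgment can.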
Unfortunately, this flattering hypothesis soon came crashing down.