1.2. Bootstrapping models: Experts vs. virtual experts

A proper linear model assigns weights to cues so as to optimize the relationship between those cues and the target property in a data set. Improper linear models, by contrast, assign their weights in some other way, and so do not best fit the available data. Bootstrapping models are perhaps the most fascinating kind of improper linear model. These are proper linear models of a person’s judgments. Goldberg (1970) constructed the classic example of a bootstrapping model. Many clinical psychologists have years of training and experience in predicting whether a psychiatric patient is neurotic or psychotic on the basis of a Minnesota Multiphasic Personality Inventory (MMPI) profile. The MMPI profile consists of 10 clinical (personality) scales and a number of validity scales. Goldberg asked 29 clinical psychologists to judge, only on the basis of an MMPI profile, whether a patient would be diagnosed as neurotic or psychotic. Goldberg then constructed 29 proper linear models, one mimicking each psychologist’s judgments. The predictor cues consisted of the MMPI profile; the target property was the psychologist’s predictions. Weights were assigned to the cues so as to best fit the psychologist’s judgments about whether the patient was neurotic or psychotic. So while a bootstrapping model is a proper linear model of a human’s judgments, it is an improper linear model of the target property—in this case, the patient’s condition.
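
In modern terms, constructing a bootstrapping model amounts to an ordinary least-squares regression of the expert’s judgments on the cues. The sketch below shows the procedure in miniature; it is only an illustration, and every quantity in it (the patient profiles, the expert’s calls) is invented rather than drawn from Goldberg’s data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented stand-ins: 100 patients scored on the 10 MMPI clinical scales.
profiles = rng.normal(size=(100, 10))

# A hypothetical expert's judgments: 1 = "psychotic", 0 = "neurotic".
expert_says = rng.integers(0, 2, size=100)

# A bootstrapping model is a *proper* linear model of the expert:
# choose the weights (plus an intercept) that best fit the expert's
# judgments in the least-squares sense. Note that the true diagnoses
# play no role in the fit.
X = np.column_stack([profiles, np.ones(len(profiles))])
weights, *_ = np.linalg.lstsq(X, expert_says, rcond=None)

def model_judgment(profile):
    """The model's neurotic/psychotic call for a new MMPI profile."""
    score = profile @ weights[:-1] + weights[-1]
    return 1 if score >= 0.5 else 0
```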

One might expect that the bootstrapping model would predict reasonably well. It is built to mimic a fairly reliable expert, so we might expect it to do nearly as well as the expert. In fact, the mimic is more reliable than the expert. Goldberg found that in 26 of the 29 cases, the bootstrapping model was more reliable in its diagnoses than the psychologist on which it was based! (For other studies with similar results, see Wiggins and Kohen 1971, Dawes 1971.) This is surprising. The bootstrapping model is built to ape an expert’s predictions. And it will occasionally be wrong about the expert. But when it is wrong about the expert, it’s more likely to be right about the target property!

At this point, it is natural to wonder why the bootstrapping model is more accurate than the person on which it is based. In fact, it seems paradoxical that this could be true: If the bootstrapping model “learns” to predict from an expert, how can the model “know” more than the expert? This way of putting the finding makes it appear that the model is adding some kind of knowledge to what it learns from the expert. But how on earth can that be? The early hypothesis for the success of the bootstrapping model was not that the model was adding something to the expert’s knowledge (or reasoning competence), but that the model was adding something to the expert’s reasoning performance. In particular, the hypothesis was that the model did not fall victim to performance errors (errors that were the result of lack of concentration or a failure to properly execute some underlying predictive algorithm). The idea was that bootstrapping models somehow capture the underlying reliable prediction strategy humans use; but since the models are not subject to extraneous variables that degrade human performance, the models are more accurate (Bowman 1963, Goldberg 1970, Dawes 1971).
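
The performance-error hypothesis is easy to put in computational terms. In the toy simulation below, which assumes a world of our own invention rather than anything in the cited studies, the “expert” applies the correct linear strategy but with added execution noise; a linear model fitted to the expert’s noisy judgments recovers the strategy while averaging the noise away, and so tracks the target better than the expert it mimics.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_cues = 1000, 10

# Invented world: the target property is a (noisy) linear function of the cues.
cues = rng.normal(size=(n_cases, n_cues))
true_weights = rng.normal(size=n_cues)
target = cues @ true_weights + rng.normal(scale=1.0, size=n_cases)

# The expert applies the right strategy, degraded by performance error.
expert = cues @ true_weights + rng.normal(scale=2.0, size=n_cases)

# Bootstrapping model: fit the expert's judgments, then score the cases.
w, *_ = np.linalg.lstsq(cues, expert, rcond=None)
model = cues @ w

print("expert vs. target:", np.corrcoef(expert, target)[0, 1])
print("model  vs. target:", np.corrcoef(model, target)[0, 1])
# The fitted weights approximate true_weights, so the model's scores are
# essentially the expert's strategy minus the execution noise: the mimic
# correlates with the target better than the expert it was built from.
```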

This is a relatively flattering hypothesis, in that it grants us an underlying competence in making social judgments. Unfortunately, this flattering hypothesis soon came crashing down.
