1.1. Proper linear models


A particularly successful kind of SPR is the proper linear model (Dawes 1982, 391). Proper linear models have the following form:

P = w1c1 + w2c2 + w3c3 + w4c4

where cn is the value for the nth cue, and wn is the weight assigned to the nth cue. Our favorite proper linear model predicts the quality of the vintage for a red Bordeaux wine. For example, c1 reflects the age of the vintage, while c2, c3, and c4 reflect climatic features of the relevant Bordeaux region. Given a reasonably large set of data showing how these cues correlate with the target property (the market price of mature Bordeaux wines), weights are then chosen so as to best fit the data. This is what makes this SPR a proper linear model: the weights optimize the relationship between P (the weighted sum of the cues) and the target property as given in the data set. A wine-predicting SPR was developed by Ashenfelter, Ashmore, and Lalonde (1995). It has done a better job of predicting the price of mature Bordeaux red wines at auction (predicting 83% of the variance) than expert wine tasters. Reaction in the wine-tasting industry to such SPRs has been "somewhere between violent and hysterical" (Passell 1990).
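
To make the weight-fitting step concrete, here is a minimal sketch in Python. The four cues, their values, and the prices are invented for illustration, as is the choice of ordinary least squares without an intercept; this is not Ashenfelter, Ashmore, and Lalonde's actual data or model.

```python
import numpy as np

# Each row is one vintage; columns are the cue values c1..c4
# (e.g., age of the vintage and three climatic features).
# All numbers here are made up for the sake of the example.
cues = np.array([
    [31.0, 600.0, 17.1, 160.0],
    [30.0, 690.0, 16.7,  80.0],
    [28.0, 502.0, 17.2, 130.0],
    [26.0, 420.0, 16.1, 110.0],
    [25.0, 582.0, 17.6, 187.0],
    [24.0, 485.0, 16.9, 290.0],
])

# Target property: (hypothetical) market price of the mature wine for each vintage.
prices = np.array([4.1, 3.8, 4.0, 3.2, 4.3, 3.0])

# "Proper" weights are the ones that best fit the data: least squares picks
# w1..w4 to minimize the squared error between P = w1c1 + ... + w4c4 and the
# observed prices.
weights, *_ = np.linalg.lstsq(cues, prices, rcond=None)

# Predicting the target for a new vintage is just the weighted sum of its cues.
new_vintage = np.array([27.0, 550.0, 17.0, 120.0])
predicted_price = new_vintage @ weights
print("fitted weights:", weights)
print("predicted price:", predicted_price)
```

The fitted weights play exactly the role of w1 through w4 above: they are estimated once from the data set and then applied mechanically to each new case.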

Whining wine tasters might derive a small bit of comfort from the fact that they are not the only experts trounced by a mechanical formula. We have already introduced The Golden Rule of Predictive Modeling: When based on the same evidence, the predictions of SPRs are at least as reliable as, and are typically more reliable than, the predictions of human experts for problems of social prediction. The most definitive case for the Golden Rule has been made by Grove and Meehl (1996). They report on an exhaustive search for studies comparing human predictions to those of SPRs in which (a) the humans and SPRs made predictions about the same individual cases and (b) the SPRs never had more information than the humans (although the humans often had more information than the SPRs). They

found 136 studies which yielded 617 distinct comparisons between the two methods of prediction. These studies concerned a wide range of predictive criteria, including medical and mental health diagnosis, prognosis, treatment recommendations and treatment outcomes; personality description; success in training or employment; adjustment to institutional life (e.g., military, prison); socially relevant behaviors such as parole violation and violence; socially relevant behaviors in the aggregate, such as bankruptcy of firms; and many other predictive criteria. (1996, 297)

Of the 136 studies, 64 clearly favored the SPR, 64 showed approximately equivalent accuracy, and 8 clearly favored the clinician. The 8 studies that favored the clinician appeared to have no common characteristics; they "do not form a pocket of predictive excellence in which clinicians could profitably specialize" (299). What's more, Grove and Meehl argue plausibly that these 8 outliers are likely the result of random sampling errors (i.e., given 136 chances, the better reasoning strategy is bound to lose sometimes) "and the clinicians' informational advantage in being provided with more data than the actuarial formula" (298).

There is an intuitively plausible explanation for the success of proper linear models. Proper linear models are constructed so as to best fit a large set of (presumably accurate) data. But the typical human predictor does not have all the correlational data easily available; and even if he did, he couldn't perfectly calculate the complex correlations between the cues and the target property. As a result, we should not find it surprising that proper linear models are more accurate than (even expert) humans. While this explanation is intuitively satisfying, it is mistaken. To see why, let's look at the surprising but robust success of some improper linear models.
