1.5. SPRs vs. Humans: An unfair test?

Before we turn to an explanation for the success of SPRs, we should consider a common objection against the SPR findings described above. The objection proceeds as follows: "The real reason human experts do worse than SPRs is that they are restricted to the sort of objective information that can be plugged into a formula. So of course this tilts the playing field in favor of the formula. People can base their predictions on evidence that can't be quantified and put in a formula. By denying experts this kind of evidence, the above tests aren't fair. Indeed, we can be confident that human experts will defeat SPRs when they can use a wider range of real world, qualitative evidence."

There are three points to make against this argument. First, this argument offers no actual evidence that might justify the belief that human experts are handicapped by being unable to use qualitative evidence in the above examples. The argument offers only a speculation. Second, it is possible to quantitatively code virtually any kind of evidence. For example, consider an SPR that predicts the length of hospitalization for schizophrenic and manic-depressive patients (Dunham and Meltzer 1946). This SPR employs a rating of the patients' insight into their condition. Prima facie, this is a subjective, nonquantitative variable because it relies on a clinician's diagnosis of a patient's mental state. Yet clinicians are able to quantitatively code their diagnoses of the patient's insight into his or her condition. The clinician's quantitatively coded diagnosis is then used by the SPR to make more accurate predictions than the clinician.
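To make this second point concrete, here is a minimal sketch (in Python) of how a prima facie qualitative judgment can be quantitatively coded and then fed into a simple weighted linear rule of the kind most SPRs are. The insight scale, the predictor names, and the weights below are illustrative assumptions, not the actual variables or coefficients of Dunham and Meltzer's (1946) rule.

    # Illustrative sketch: coding a qualitative clinical judgment so that a
    # simple linear SPR can use it. The scale, predictors, and weights are
    # invented for illustration; they are not Dunham and Meltzer's model.

    # Step 1: the clinician's subjective judgment is coded on a fixed ordinal scale.
    INSIGHT_SCALE = {
        "no insight": 0,
        "slight insight": 1,
        "partial insight": 2,
        "good insight": 3,
        "full insight": 4,
    }

    def code_insight(clinician_judgment: str) -> int:
        """Turn the clinician's verbal rating into a number the formula can use."""
        return INSIGHT_SCALE[clinician_judgment]

    # Step 2: a weighted linear rule combines the coded predictors.
    # (Classic SPRs are typically just weighted sums of a few coded variables.)
    def predict_hospitalization_weeks(insight_rating: int,
                                      prior_admissions: int,
                                      age_at_onset: int) -> float:
        """Hypothetical linear SPR: predicted length of stay in weeks."""
        return 30.0 - 4.0 * insight_rating + 2.5 * prior_admissions - 0.2 * age_at_onset

    # Usage: the "subjective" input enters the formula as an ordinary number.
    rating = code_insight("partial insight")
    print(predict_hospitalization_weeks(rating, prior_admissions=1, age_at_onset=24))

The point of the sketch is simply that once the clinician's judgment is expressed on a fixed scale, nothing about its subjective origin prevents a formula from using it.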

Third, the speculation that humans armed with "extra" qualitative evidence can outperform SPRs has been tested and has failed repeatedly. One example of this failure is known as the interview effect: unstructured interviews degrade human reliability (Bloom and Brundage 1947, DeVaul et al. 1957, Oskamp 1965, Milstein et al. 1981). When gatekeepers (e.g., hiring and admissions officers, parole boards) make judgments about candidates on the basis of a dossier and an unstructured interview, their judgments come out worse than judgments based simply on the dossier (without the unstructured interview). So when human experts and SPRs are given the same evidence, and the humans then get more information in the form of unstructured interviews, clinical prediction is still less reliable than SPRs. In fact, as would be expected given the interview effect, giving humans the "extra" qualitative evidence actually makes it easier for SPRs to defeat the predictions of expert humans. To be fair, however, there are cases in which experts can defeat SPRs. We will discuss these exceptions below.
