1.2. The starting point of the philosophy of science approach to epistemology

We view epistemology as a branch of the philosophy of science. From our

perspective, epistemology begins with a branch of cognitive science that

investigates good reasoning. It includes work in psychology, statistics,

Laying Our Cards on the Table 11

machine learning, and Artificial Intelligence. Some of this work involves

“predictive modeling,” and it includes discussion of models such as linear

models, multiple regression formulas, neural networks, naïve Bayes classifiers,

Markov Chain Monte Carlo algorithms, decision tree models, and

support vector machines; but much of this work comes from traditional

psychology and includes the well-known heuristics and biases program

launched by Kahneman and Tversky (Kahneman, Slovic, and Tversky

1982). It will be useful to give this wide-ranging literature a name. We call

it Ameliorative Psychology. The essential feature of Ameliorative Psychology

is that it aims to give positive advice about how we can reason better.

We will introduce many findings of Ameliorative Psychology (particularly

in chapters 2 and 9). But it will be useful here to introduce some of its

noteworthy features.

In the course of this book, we will introduce a number of reason-guiding

prescriptions offered by Ameliorative Psychology. This advice

includes making statistical judgments in terms of frequencies rather than

probabilities, considering explanations for propositions one doesn’t believe,

ignoring certain kinds of evidence (e.g., certain selected cues that

improve accuracy only very moderately, and certain kinds of impressionistic

information, such as opinions gleaned from unstructured personal

interviews), and many others (Bishop 2000). These recommendations are

bluntly normative: They tell us how we ought to reason about certain sorts

of problems.

A particularly interesting branch of Ameliorative Psychology begins in

earnest in 1954 with the publication of Paul Meehl’s classic book Clinical

Versus Statistical Prediction: A Theoretical Analysis and a Review of the

Evidence. Meehl reported on twenty experiments that showed that very

simple prediction rules were more reliable predictors than human experts.

Since then, psychologists have developed many of these Statistical Prediction

Rules (or SPRs). (In fact, in the past decade or so, there has been

an explosion of predictive models in AI and machine learning.) There is

now considerable evidence for what we call The Golden Rule of Predictive

Modeling: When based on the same evidence, the predictions of SPRs are

at least as reliable, and are typically more reliable, than the predictions

of human experts. Except for an important qualification we will discuss

in chapter 2, section 4.2, the evidence in favor of the Golden Rule is

overwhelming (see Grove and Meehl 1996; Swets, Dawes, and Monahan

2000).
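
To make the notion of an SPR concrete, here is a minimal Python sketch of one common form such a rule can take: a weighted sum of cues compared against a cutoff. This is an illustration under stated assumptions, not the form of any particular rule discussed in this book. The function names, cue names, weights, and cutoff are all hypothetical.

```python
from typing import Mapping


def spr_score(cues: Mapping[str, float],
              weights: Mapping[str, float],
              intercept: float = 0.0) -> float:
    """Combine cue values into a single score using fixed weights.

    The rule only looks at cues that have a weight; any extra
    information in `cues` is ignored.
    """
    return intercept + sum(weights[name] * cues[name] for name in weights)


def spr_predict(cues: Mapping[str, float],
                weights: Mapping[str, float],
                cutoff: float,
                intercept: float = 0.0) -> bool:
    """Predict the target outcome when the score reaches the cutoff."""
    return spr_score(cues, weights, intercept) >= cutoff


if __name__ == "__main__":
    # Hypothetical cues and weights, chosen only for illustration.
    weights = {"cue_a": 2.0, "cue_b": -1.0}
    case = {"cue_a": 3.0, "cue_b": 1.0}
    print(spr_score(case, weights))         # 2.0*3.0 - 1.0*1.0 = 5.0
    print(spr_predict(case, weights, 4.0))  # True, since 5.0 >= 4.0
```

The point of the sketch is that the rule applies the same weights to every case and consults nothing beyond the listed cues.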

The Golden Rule of Predictive Modeling has been woefully neglected.

Perhaps a good way to begin to undo this state of affairs is to briefly describe ten of its instances. This will give the reader some idea of the

range and robustness of the Golden Rule.

1. An SPR that takes into account a patient’s marital status, length of psychotic

distress, and a rating of the patient’s insight into his or her condition

predicted the success of electroshock therapy more reliably than

a hospital’s medical and psychological staff members (Wittman 1941).

2. A model that used past criminal and prison records was more reliable

than expert criminologists in predicting criminal recidivism (Carroll

et al. 1988).

3. On the basis of a Minnesota Multiphasic Personality Inventory (MMPI)

profile, clinical psychologists were less reliable than an SPR in diagnosing

patients as either neurotic or psychotic. When psychologists were

given the SPR’s results before they made their predictions, they were

still less accurate than the SPR (Goldberg 1968).

4. A number of SPRs predict academic performance (measured by

graduation rates and GPA at graduation) better than admissions officers.

This is true even when the admissions officers are allowed to use

considerably more evidence than the models (DeVaul et al. 1957), and

it has been shown to be true at selective colleges, medical schools

(DeVaul et al. 1957), law schools (Swets, Dawes, and Monahan 2000,

18), and graduate school in psychology (Dawes 1971).

5. SPRs predict loan and credit risk better than bank officers. SPRs are

now standardly used by banks when they make loans and by credit

card companies when they approve and set credit limits for new

customers (Stillwell et al. 1983).

6. SPRs predict newborns at risk for Sudden Infant Death Syndrome

(SIDS) much better than human experts (Carpenter et al. 1977; Golding

et al. 1985).

7. Predicting the quality of the vintage for a red Bordeaux wine decades

in advance is done more reliably by an SPR than by expert wine tasters,

who swirl, smell, and taste the young wine (Ashenfelter, Ashmore, and

Lalonde 1995).

8. An SPR correctly diagnosed 83% of progressive brain dysfunction on

the basis of cues from intellectual tests. Groups of clinicians working

from the same data did no better than 63%. When clinicians were given

the results of the actuarial formula, clinicians still did worse than the

model, scoring no better than 75% (Leli and Filskov 1984).

9. In predicting the presence, location, and cause of brain damage, an

SPR outperformed experienced clinicians and a nationally prominent

neuropsychologist (Wedding 1983).

10. In legal settings, forensic psychologists often make predictions of violence.

One will be more reliable than forensic psychologists simply by

predicting that people will not be violent. Further, SPRs are more reliable

than forensic psychologists in predicting the relative likelihood of

violence, that is, who is more prone to violence (Faust and Ziskin 1988).

Upon reviewing this evidence in 1986, Paul Meehl said: “There is no

controversy in social science which shows such a large body of qualitatively

diverse studies coming out so uniformly in the same direction as this

one. When you are pushing [scores of] investigations, predicting everything

from the outcomes of football games to the diagnosis of liver disease

and when you can hardly come up with a half dozen studies showing even

a weak tendency in favor of the clinician, it is time to draw a practical

conclusion” (Meehl 1986, 372–73). Ameliorative Psychology has had

consistent success in recommending reasoning strategies in a wide variety

of important reasoning tasks. Such success is worth exploring.

The descriptive core of our approach to epistemology consists of the

empirical findings of Ameliorative Psychology. And yet, Ameliorative

Psychology is deeply normative in the sense that it makes (implicitly or

explicitly) evaluative “ought” claims that are intended to guide people’s

reasoning. Let’s look at three examples of the reason-guiding prescriptions

of Ameliorative Psychology.

A well-documented success of Ameliorative Psychology is the Goldberg

Rule (the third item on the above list). It predicts whether a psychiatric

patient is neurotic or psychotic on the basis of an MMPI profile.

Lewis Goldberg (1965) found that the following rule outperformed 29

clinical judges (where L is a validity scale and Pa, Sc, Hy, and Pt are clinical

scales of the MMPI):

x = (L + Pa + Sc) − (Hy + Pt)

If x < 45, diagnose patient as neurotic.

If x ≥ 45, diagnose patient as psychotic.

When tested on a set of 861 patients, the Goldberg Rule had a 70% hit rate;

clinicians’ hit rates varied from a low of 55% to a high of 67%. (13 of the 29

clinical judges in the above study were experienced Ph.D.s, while the other

16 were Ph.D. students. The Ph.D.s were no more accurate than the students.

This is consistent with the findings reported in Dawes 1994.) So here

we have a prediction rule that could literally turn a smart second-grader

into a better psychiatric diagnostician than highly credentialed, highly

experienced psychologists—at least for this diagnostic task. In fact, more

than 3 decades after the appearance of Goldberg’s results, making an initial diagnosis on the basis of an MMPI profile by using subjective judgment

rather than the Goldberg Rule would bespeak either willful irresponsibility

or deep ignorance. So here is a finding of Ameliorative Psychology: people

(in an epistemic sense) ought to use the Goldberg Rule in making preliminary

diagnoses of psychiatric patients.
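
To show how little machinery the Goldberg Rule requires, here is a minimal Python sketch. It assumes the rule reads x = (L + Pa + Sc) - (Hy + Pt) with a cutoff of 45, as stated above. The function names and the sample scale scores are hypothetical, chosen only for illustration, and are not drawn from Goldberg's data.

```python
def goldberg_index(L: float, Pa: float, Sc: float, Hy: float, Pt: float) -> float:
    """Compute x = (L + Pa + Sc) - (Hy + Pt) from five MMPI scale scores."""
    return (L + Pa + Sc) - (Hy + Pt)


def goldberg_diagnosis(L: float, Pa: float, Sc: float, Hy: float, Pt: float,
                       cutoff: float = 45) -> str:
    """Apply the cutoff: below it 'neurotic', at or above it 'psychotic'."""
    x = goldberg_index(L, Pa, Sc, Hy, Pt)
    return "neurotic" if x < cutoff else "psychotic"


if __name__ == "__main__":
    # Hypothetical scale scores, for illustration only.
    # x = (50 + 60 + 65) - (70 + 62) = 43, which is below the cutoff.
    print(goldberg_diagnosis(L=50, Pa=60, Sc=65, Hy=70, Pt=62))  # neurotic
    # x = (50 + 60 + 67) - (70 + 62) = 45, which is at the cutoff.
    print(goldberg_diagnosis(L=50, Pa=60, Sc=67, Hy=70, Pt=62))  # psychotic
```

The whole rule is three additions, one subtraction, and one comparison, which is why so little training is needed to apply it.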

Another example of Ameliorative Psychology making evaluative ought-claims

is a 1995 paper by Gigerenzer and Hoffrage entitled “How to

Improve Bayesian Reasoning Without Instruction: Frequency Formats”

(emphasis added). As the title of the paper suggests, Gigerenzer and Hoffrage

show how people charged with making high-stakes diagnoses (e.g.,

about cancer or HIV) can improve their reasoning. They suggest a reasoning

strategy that enhances reasoners’ ability to identify, on the basis of

medical tests, the likelihood that an individual will have cancer or HIV. We

will discuss these “frequency formats” in chapter 9, section 1. For now, it is

enough to note that a finding of Ameliorative Psychology is that people

ought to use frequency formats when diagnosing rare conditions on the

basis of well-understood diagnostic tests.
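
A brief Python sketch may help show what a frequency format amounts to. It restates a diagnostic problem given as probabilities in terms of counts of people, and then reads the answer off those counts. The numbers used (a 1% base rate, an 80% hit rate, a 9.6% false-positive rate, a reference population of 1,000) are illustrative assumptions, not figures reported in this section. The function names are likewise hypothetical.

```python
def natural_frequencies(base_rate: float, hit_rate: float,
                        false_positive_rate: float, population: int = 1000):
    """Restate probabilities as whole-number counts in a reference population.

    Returns (number with the condition, true positives, false positives).
    """
    sick = round(population * base_rate)
    healthy = population - sick
    true_pos = round(sick * hit_rate)
    false_pos = round(healthy * false_positive_rate)
    return sick, true_pos, false_pos


def posterior_from_counts(true_pos: int, false_pos: int) -> float:
    """Of all who test positive, what fraction actually has the condition?"""
    return true_pos / (true_pos + false_pos)


def bayes_posterior(base_rate: float, hit_rate: float,
                    false_positive_rate: float) -> float:
    """The same quantity computed from probabilities via Bayes' rule."""
    joint_pos_sick = base_rate * hit_rate
    joint_pos_healthy = (1 - base_rate) * false_positive_rate
    return joint_pos_sick / (joint_pos_sick + joint_pos_healthy)


if __name__ == "__main__":
    sick, tp, fp = natural_frequencies(0.01, 0.8, 0.096)
    # Frequency format: 10 of 1,000 people have the condition; 8 of those 10
    # test positive; 95 of the 990 without it also test positive.
    print(f"{tp} of {tp + fp} positive tests are true positives")
    print(round(posterior_from_counts(tp, fp), 3))
    print(round(bayes_posterior(0.01, 0.8, 0.096), 3))
```

Under these assumptions both routes give roughly 8%. The difference is that the counts make the answer visible as "8 out of 103", while the probability version requires applying Bayes' rule explicitly.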

Another particularly successful example of Ameliorative Psychology is

credit scoring (the fifth item on the above list). Many financial institutions

no longer rely primarily on financial officers to make credit decisions—

they now make credit decisions on the basis of simple SPRs developed as

the result of research by psychologists and statisticians (Lovie and Lovie

1986). Once again, this finding of Ameliorative Psychology seems to be

normative through and through: When it comes to making predictions

about someone’s creditworthiness, one ought to use a credit-scoring model.

Not only does Ameliorative Psychology recommend particular reasoning

strategies for tackling certain kinds of problems, it also suggests

generalizations about how people ought to reason. (See, for example, the

flat maximum principle, discussed in chapter 2, section 2.1.) On our view,

the goal of epistemology is to articulate the epistemic generalizations that

guide the prescriptions of Ameliorative Psychology. In this way, epistemology

is simply a branch of philosophy of science. Just as the philosopher

of biology might aim to uncover and articulate the metaphysical assumptions

of evolutionary theory, the epistemologist aims to uncover and articulate

the normative, epistemic principles behind the long and distinguished

tradition of Ameliorative Psychology. (There are two objections philosophers

are likely to immediately raise against our approach. We consider

them in the Appendix, sections 1 and 2.)

Ameliorative Psychology is normative in the sense that it yields explicit,

reason-guiding advice about how people ought to reason. Some

might fix us with a jaundiced eye and wonder whether the recommendations

of Ameliorative Psychology are really normative in the same way

as the recommendations of SAE are normative. Admittedly, there does

seem to be one telling difference. People outside academia have on occasion

actually changed the way they reason about significant matters as a

result of the recommendations of Ameliorative Psychology.
