3.1 The Hierarchical Organization of the Mammalian Visual System


In the first chapter, I established that an organism is a living entity, the

components of which are hierarchically organized in subsystems and processes

operating so as to achieve particularized and generalized homeostasis

(HOV). The subsystems and processes possess certain properties,

including abilities to exchange data flexibly, convert data to information

in a selection process, integrate information, and process information from

environments. In the second chapter, after using HOV and as-if realism to

give credence to nomological and representational emergence, respectively,

I argued that the Cummins organizational view and the Griffiths/Godfrey-Smith

view can be made compatible with one another in providing a

complete definition of biological function. We should expect that the traits

of an organism function the way they do both because such traits presently

contribute to the overall organization of the organism (Cummins) and because

they were selected for in the recent ancestry of the organism’s species

(Griffiths/Godfrey-Smith). The work of the first two chapters was accomplished for

the twofold purpose of giving further elucidation to Mayr’s description of

organisms as hierarchically organized systems that operate on the basis of

historically acquired programs of information, as well as ratifying Plotkin’s

claim that biological phenomena only make complete sense in light of

evolutionary theory.

In the next chapter, I deal with the evolution of the mammalian visual

system. In this chapter, building upon the work of the previous two chapters,

I show how the processes associated with vision in mammals comprise

a hierarchically organized system exhibiting the same kinds of properties

of information exchange, selectivity, and integration found in organisms

in general (also see Arp, 2005b). I restrict my analysis of the brain to the

primary processes and mechanisms associated with the mammalian visual system and visual cognition. I do this for three reasons. First, there is much

empirical evidence supporting our understanding of the mammalian visual

system’s structure and layout (van Essen, 1985, 1997; van Essen &

Maunsell, 1983; van Essen & Gallant, 1994; van Essen, Anderson, & Felleman,

1992; van Essen, Anderson, & Olshausen, 1994; van Essen et al.,

1998; Allman & Kaas, 1971; Desimone & Ungerleider, 1989; Mishkin,

Ungerleider, & Macko, 1983; Rueckl, Cave, & Kosslyn, 1989; Casagrande

& Kaas, 1994; Kosslyn & Koenig, 1995). Second, the visual system is present

in many kinds of vertebrate species thought to be homologous (i.e., having

evolved from a common ancestor) to human beings (Kaas, 1993, 1995,

1996; Northcutt & Kaas, 1995; Preuss, Qi, & Kaas, 1999; Harvey & Pagel,

1991; Desimone, 1992; Desimone, Albright, Gross, & Bruce, 1984; Karten

& Shimazu, 1989; Butler & Hodos, 1996; Tyler et al., 1998). Finally, I restrict

my scope to the visual system because it plays a central role in the evolutionary

account I give of the progression from noncognitive visual processing

to conscious cognitive visual processing in terms of scenario

visualization. I fortify what thinkers like Barton (1998), Crick (1994),

Carruthers (2002), and Allman (1977, 1982, 2000) have maintained,

namely, that visual processing is an important factor in the evolution of

conscious behavior, including creative problem solving.

The mammalian visual system is situated within the vertebrate nervous

system, while at the same time it is composed of neurons that are specialized

in their own processes. Recalling the schematization of triangles in

figure 1.1 in the first chapter, the visual system is like a medium-sized triangle

made up of smaller triangles (the neuronal processes), existing in a

larger triangle (the nervous system) along with other systems like the auditory,

olfactory, and so forth. All of these triangles exist within the largest

triangle (the organism) as is schematized in figure 3.1. There is an elegant

consistency in the hierarchical organization exhibited from the microlevel

of the neuron to the macrolevel of the vertebrate nervous system. This

consistency is echoed in Bear, Connors, & Paradiso’s (2001, p. 161) claim

that the “signaling network within a single neuron resembles in some ways

the neural networks of the brain itself.” Hierarchies exist within hierarchies,

and, as we will see, the visual system is one of those hierarchies that

functions so as to aid in producing the architectonic organization of the

nervous system of an animal.

In the first chapter, I proposed that these hierarchies are able to interact

with one another because of internal–hierarchical data exchange, whereby

data (raw material of the kind that has the potential to

become useful for a process or operation) are exchanged between and

among the processes and subsystems at various levels of operation in an

organism. In their textbook devoted to the principles of neuroscience,

Kandel et al. (2000, p. 353) describe the processes associated with perception

in the cerebral cortex using a hierarchical model: “Sensory information

is first received and interpreted by the primary sensory areas, then

sent to unimodal association areas, and finally to the multimodal sensory

areas. At each successive stage of this stream more complex analysis is

achieved, culminating eventually, as with vision, for example, in object

and pattern recognition in the inferotemporal cortex.”

Kandel et al.’s text is a standard work in neuroscience, and I use it as my

primary reference throughout this book. Kandel et al. actually divvy up

the hierarchy of sensory systems into four parts, namely, (1) the primary

sensory areas, (2) the unimodal areas, (3) the unimodal association areas,

and (4) the multimodal association areas.

The primary sensory areas act as base levels, and they refer to the parts

and processes associated with information that is initially communicated

to the spinal cord and/or brain through one of the five sensory modalities,

namely, touch, hearing, taste, smell, and vision. For example, in the

visual system the primary sensory area comprises the eye, the lateral

geniculate nucleus (LGN), and the primary visual cortex located in the

occipital lobe of the brain. The unimodal areas build upon the data
received from some prior particular primary sensory area and refer to the
parts and processes associated with a higher level integration of the data
received from one of the primary sensory areas. In the visual system,
there are two primary unimodal areas that process information concerning
where an object is and what an object is, located along trajectories
between the occipital lobe and parietal and temporal regions, respectively.
The unimodal association areas, in turn, refer to parts and processes associated
with an even higher level integration of the data received from
two or more unimodal areas. In the visual system, the unimodal association
area integrates data about the color, motion, and form of objects
and is located in the occiptotemporal (also called occipitotemporal) area of
the brain. Finally, the multimodal association areas refer to parts and
processes associated with integrating the data received from the unimodal
association areas and, depending upon the sensory modality, process this
information in either the parietotemporal, parietal, temporal, and/or
frontal areas of the brain.

Figure 3.1
The visual system hierarchy in relation to the organism. (Labels, from the outermost triangle inward: The Organism; The Vertebrate Nervous System; The Auditory System, The Visual System, and The Olfactory System; Neuronal Processes.)

Having given this general overview of the hierarchy concerning perception

in the cerebral cortex and related areas, we now can give a

more specified description of the visual hierarchy, along with its components

and processes. The components of the visual hierarchy are

composed of groups of specialized neurons that “fire” according to certain

external and internal stimulus cues, and the various processes of the

visual hierarchy are active when an object “comes into view,” as it were,

namely, when an object is recognized as present in a mammal’s visual

field. In essence, what follows is a description of the neural wiring and

functioning associated with mammalian object recognition in the visual

system.

The primary sensory area of the visual system comprises the pathway

that starts with the retina of the eye and projects through the LGN of

the thalamus to the primary visual cortex of the occipital lobe (V1 or

Brodmann area 17). Photons of light are transduced into electrical signals

by the photoreceptor neurons, known as rods and cones, that lie in the outer

layers of the retina. Rods are sensitive to dim light, while cones are

sensitive to brighter light. The photoreceptors make synapses with other

kinds of neurons known as horizontal and bipolar cells. The horizontal

cells primarily are responsible for the center-surround organization of the

receptive field of the bipolar cell. The bipolar cells receive synapses from

photoreceptors, horizontal cells, and other neurons known as amacrine

cells and relay data from the photoreceptors to the ganglion cells, which

send their axons to the brain via the optic nerve. The ganglion cells project to a number of sites, including several cortical areas through the thalamus,

the hypothalamus, and midbrain. The major cortical projection is via the

LGN of the thalamus to the primary visual cortex in the occipital lobe

(Kandel et al., 2000; Zigmond, Bloom, Landis, Roberts, Squire, & Wooley,

1999; Bear et al., 2001).
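The center-surround organization mentioned above can be pictured as a cell that compares the light falling on the center of its receptive field with the light falling on the surrounding ring. The following is a minimal sketch under stated assumptions: the function name, the one-dimensional layout, and the simple "center minus mean surround" rule are illustrative choices made here, not a model drawn from Kandel et al. or the other sources cited.

```python
# Toy 1-D "on-center" receptive field (hypothetical simplification).
# Response = light on the center minus the mean light on the surround.

def center_surround_response(light, center_index, surround_radius=2):
    """Return center intensity minus the mean intensity of the surround."""
    lo = max(0, center_index - surround_radius)
    hi = min(len(light), center_index + surround_radius + 1)
    surround = [light[i] for i in range(lo, hi) if i != center_index]
    return light[center_index] - sum(surround) / len(surround)


uniform = [1.0] * 7                          # diffuse light over the whole field
spot = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   # small spot on the center only
ring = [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0]   # light on the surround only

print(center_surround_response(uniform, 3))  # 0.0
print(center_surround_response(spot, 3))     # 1.0
print(center_surround_response(ring, 3))     # -1.0
```

In this sketch, diffuse light produces no net response, whereas a spot confined to the center produces the largest one, which is the qualitative point of a center-surround arrangement.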

The LGN consists of six layers in primates. The inner two layers, with

their large neurons, form the magnocellular laminae (literally, big-celled

layers); while the remaining four layers, with their smaller neurons, constitute

the parvocellular laminae (small-celled layers). Intercalated between

these principal laminae are the koniocellular neurons (K cells). In their

firing responses, the magnocellular neurons (M cells) are sensitive to

motion especially, while the parvocellular neurons (P cells) are responsive

to color.

Like the LGN, the primary visual cortex in the occipital lobe (again,

known as V1 or Brodmann area 17) is made up of six primary layers in

primates. The LGN mainly projects to layer IV of V1, and to a lesser extent

to layer VI, with the M and P channels having different synaptic targets

within these laminae. There is also a projection from cells in the intralaminar

part of the LGN directly to layers II and III of V1. The layer IV

neurons project on to adjacent neurons in such a way as to form what are

known as orientation-specific columns, ocular-dominance columns, and

blobs. Orientation-specific columns are responsible for the decomposition

of objects of the visual field into short line segments of varying

orientation. Ocular-dominance columns are responsible for the combination of

input from the two eyes so as to perceive the depth associated with an

object and its background. Blobs are responsible for processing wavelength

information, which ultimately contributes to the recognition of various

colors of objects.
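The idea that orientation-specific columns decompose the visual field into short line segments of varying orientation can be illustrated with a toy example. The sketch below is only an analogy under stated assumptions: the 3x3 patch, the four templates, and the overlap score are hypothetical simplifications introduced here and are not a description of actual cortical circuitry.

```python
# Toy illustration: which of four orientations best matches a 3x3 patch?
# Each template lists the (row, column) cells that lie on a line through
# the middle of the patch. Templates and scoring are hypothetical.

TEMPLATES = {
    "horizontal": [(1, 0), (1, 1), (1, 2)],
    "vertical": [(0, 1), (1, 1), (2, 1)],
    "diagonal_down": [(0, 0), (1, 1), (2, 2)],
    "diagonal_up": [(2, 0), (1, 1), (0, 2)],
}


def preferred_orientation(patch):
    """Return the template name whose three cells overlap the patch most."""
    scores = {
        name: sum(patch[r][c] for r, c in cells)
        for name, cells in TEMPLATES.items()
    }
    return max(scores, key=scores.get)


vertical_bar = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
print(preferred_orientation(vertical_bar))  # vertical
```

Each template plays the role of a group of cells "tuned" to one orientation; a larger image would be covered by many such patches, each reporting its best-matching segment.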

The occipital lobe is split into many visual-related areas, each processing

an aspect of an object in the visual field. V1 is responsible for initial visual

processing and can be subdivided into different subregions, each containing

a full representation of the visual field for the contralateral world.

However, after the initial processing in V1, the processing that takes place

in other regions of the occipital lobe is more specialized: V2 is responsible

for stereo vision, V3 for distance, V4 for color, V5 for motion, and V6 for

object position. Van Essen et al. (1992) have recorded more than thirty

primary visual areas in the macaque monkey. Through positron-emission

tomography (PET) and functional magnetic resonance imaging (fMRI)

scans, Zeki, Watson, Lueck, Friston, Kennard, & Frackowiak (1991) and

Sereno et al. (1995) have demonstrated that there are multiple visual areas

in humans devoted to specific analysis of the properties of an object in the

visual field.

So far, I have described what Kandel et al. would call the visual primary

sensory area of the visual hierarchy. From this area, another level is added

to the hierarchy as cortical projections are laid out along two visual unimodal

areas, namely, the M cell and P cell pathways. The M cell pathway

is also known as the parietal or dorsal pathway, and it consists of visual

areas laid out along a trajectory from the occipital region, through V1, V2,

V3, V5, and V6, to the parietal region of the brain. Research suggests that

the M cell pathway is responsible for guiding our actions in our visual

environment, since depth, motion, and object position—that is, where an

object is, independent of what the object is—appear to be processed along

its stream. Conversely, the P cell pathway is also known as the temporal,

or ventral, pathway, and it consists of visual areas laid out along a trajectory

from the occipital region, through V1, V2, and V4, to the temporal

region of the brain. Research suggests that the P cell pathway is responsible

for the color and form recognition of an object—that is, what an object is,

independent of where the object is (see Desimone & Ungerleider, 1989;

Ungerleider & Mishkin, 1982; Ungerleider & Haxby, 1994; Goodale et al.,

1994).

Yet another level is added to the visual hierarchy as neurons from

the parietal and temporal visual unimodal areas project to the visual

unimodal association area in the occiptotemporal cortex of the brain. This

area is considered more complex than the unimodal areas because neurons

there are involved in integrating the processed data received from the

parietal and temporal unimodal areas concerning color, motion, depth,

form, distance, and the like. Research has shown that there is a division

of labor concerning an animal’s abilities to distinguish what an object

is from where an object is. However, there are times when an animal

must perform both of these tasks, and given the neuronal projections

from the parietal and temporal areas to this common site in the occiptotemporal

cortex, it makes sense that an animal be able to integrate

visual information about what and where an object is in its visual field

at the same time.

At the highest level of the visual hierarchy, the visual unimodal association

area projects to the multimodal association areas of the prefrontal,

parietotemporal, and limbic cortices. This is the level at which visual information

is integrated with other sensory information, as well as where

motor planning, attention, emotion, language production, and judgment

take place.
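The four levels of the visual hierarchy described so far can be summarized as a small data structure. This is a sketch for orientation only: the sites are taken from the description in this section, but the class name, field names, and the helper `feeds_into` are illustrative inventions, and the structure captures only the bottom-up ordering of levels, not every projection mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    rank: int      # 1 = lowest level of the hierarchy
    name: str
    sites: tuple   # anatomical sites named in this section


# Sites follow the text; the representation itself is a hypothetical summary.
VISUAL_HIERARCHY = (
    Level(1, "primary sensory area", ("retina", "LGN", "V1")),
    Level(2, "unimodal areas", (
        "dorsal 'where' pathway (occipital to parietal)",
        "ventral 'what' pathway (occipital to temporal)",
    )),
    Level(3, "unimodal association area", ("occiptotemporal cortex",)),
    Level(4, "multimodal association areas", (
        "prefrontal cortex",
        "parietotemporal cortex",
        "limbic cortex",
    )),
)


def feeds_into(level):
    """Return the next level up, or None at the top of the hierarchy."""
    higher = [other for other in VISUAL_HIERARCHY if other.rank == level.rank + 1]
    return higher[0] if higher else None


for level in VISUAL_HIERARCHY:
    nxt = feeds_into(level)
    print(level.name, "->", nxt.name if nxt else "(top)")
```

Each level builds upon the one below it, which is why the helper simply looks up the level with the next higher rank.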

There are a number of other parts of the central nervous system (CNS)

that exchange data with the visual system, including the posterior parietal

cortex, the subcortical structures of the hypothalamus, and upper brainstem.

The neurons in the posterior parietal cortex respond to stimuli of

interest and are probably involved in visual fixation and tracking. The

superior colliculus in the midbrain is a multilayered structure wherein the

outer layers are involved in mapping the visual field, the intermediate areas

are involved in saccadic eye movements, and the deeper layers are involved

with more complex sensory integration involving visual, auditory, and

somatosensory stimuli. There is a projection from the optic tract to the

pretectal nuclei of the midbrain that, in turn, projects to the Edinger–

Westphal nucleus. This projection provides the parasympathetic (i.e., autonomic)

input to the pupil of the eye, allowing it to constrict. Also, there

is a direct retinal input to the suprachiasmatic nucleus of the hypothalamus

that is important in the generation and control of circadian

rhythms.

Other extrastriate cortical areas receive projections from the LGN, as well

as the pulvinar region of the thalamus. An area in the inferotemporal (IT)

cortex has been found to respond selectively to faces. This area has been

described as face-selective without implying that when one recognizes a

face, only cells in this area participate in this recognition (Tovee &

Cohen-Tovee, 1993; Tovee, 1998). IT cortex also has been shown to be

involved in storing visual memories, as well as encoding the properties of

objects (Kosslyn & Koenig, 1995). Finally, the visual unimodal and multimodal

association areas are linked up with other areas in the frontal, temporal,

and parietal lobes and the hippocampus so that visual images can

be made, stored, recalled, inspected, and possibly utilized in planning,

judging, feeling, and other complex voluntary activity.

As has been said, the systematic distributions of parts and processes in

the visual system, and related systems, are hierarchically organized. The

multimodal areas build upon information received from the unimodal

areas, and the unimodal areas build upon information received from the

primary sensory area, as is schematized in figure 3.2. It should be clear

from the aforementioned description of the neural wiring and functioning

associated with the mammalian visual system that there must be a massive

coordination and organization of processes in the CNS for a seemingly

simple activity—like recognizing an object—to occur. When I visited him

in his lab, Michael Ariel, a neuroscientist at Saint Louis University in St.

Louis who specializes in the visual systems of turtles, affirmed this point
(see, e.g., Martin, Kogo, Fan, & Ariel, 2003). The realization that the
nervous system is such a grandiose architectonic has caused Gray (1999,
p. 31) to maintain, “The inescapable conclusion is that sensory, cognitive,
and motor processes result from parallel interactions among large populations
of neurons distributed among multiple cortical and subcortical
structures.”

Figure 3.2
The primary sensory, unimodal, and multimodal areas and pathways of the visual system. Panel “The Primary Visual Area and Pathways” (labels: Eye; LGN; V1 in the Occipital Cortex). Panel “The Unimodal Visual Area and Pathways” (labels: Eye; Occipital Cortex; Where System; What System; Occiptotemporal Cortex).

Data are exchanged at the various levels of the visual system and CNS

and, because of these exchanges, an animal is able to form a coherent

picture of an object in its visual field. However, a final qualification must

be made about the hierarchical processes of the visual system. We must

draw a distinction between a serial hierarchy and a dynamic, or interactive,

hierarchy. In a serial hierarchy, information flows in a one-way direction

from the lowest level to the highest level of the hierarchy. Conversely, in

an interactive hierarchy, information flows bidirectionally among and

between the lower and higher levels of the hierarchy.

Consider a small, fictitious corporation consisting of a worker, a manager,

and a CEO. The worker is the lowest member of the hierarchy, the manager

is one step above the worker, and the CEO is at the top level of the hierarchy.

The worker communicates two ideas to the manager who, in turn,

communicates these two ideas plus two more of his own ideas to the CEO.

Once the first two ideas are communicated from worker to manager, there
is no further contact between the two people; likewise, once the four ideas
are communicated from manager to CEO, there is no further contact
between those two people. This worker → manager → CEO setup would
be an example of one-way information flow in a serial hierarchy.

Figure 3.2 (continued)
Panel “The Multimodal Visual Area and Pathways” (labels: Eye; Occiptotemporal Cortex; Parietotemporal Cortices; Limbic Cortex; Prefrontal Cortex).

There is only one sense in which the visual system can be considered as

a serial hierarchy; otherwise it is most appropriately envisioned as an

interactive hierarchy. The information flow from retina through LGN to

V1 occurs in a one-way direction, like the flow of information from worker

to manager to CEO in the small corporation. There is no information

feedback from V1 to the retina, just as there is no information feedback

from CEO to worker in our fictitious corporation. This makes sense, since

the inputs of primary sensory areas themselves are passive and automatic

in-takers of information (see Sekuler & Blake, 2002).

Unlike the one-way flow of information between retina and V1, there is

the possibility for a dynamic, interactive, two-way flow of information

between and among the primary visual, unimodal, and multimodal areas

of the visual hierarchy. For example, Kosslyn & Koenig (1995) present evidence

that emotions (present at a higher level in the hierarchy) can affect

the visual system’s performance in terms of visual priming and coded

anticipation of certain visual scenes (present at a lower level in the visual

hierarchy).

Also, in their experiments with monkeys and humans, Sigala & Logothetis

(2002) show that visual categorization determines, in many ways,

what specific features of a face will be focused upon. In effect, the experiments

show that if a face is categorized generally in a certain way as expressive

of some particular emotion, then this categorization will influence

what particular features of a face—for example, basic lines, symmetries, and

the like—the animal subsequently will focus upon. The neural correlates

of visual categorization are found higher up in the visual hierarchy associated

with the occiptotemporal and parietotemporal cortices, while the

neural correlates concerning the processing of particular features of the

face, in terms of lines and symmetries, are found at a more basic spot in

the visual hierarchy associated with the trajectory between the temporal

and occipital cortices of the what system.

Here, we have an instance of a dynamic, interactive flow of information

in the visual hierarchy because, in addition to information regarding a

face’s particular features flowing from the what system to the occiptotemporal

and parietotemporal cortices in facial categorization, there is flow of

information back from these cortices to the what system in terms of this

categorization’s determination of the particular features of a face that are

focused upon by an animal. Referring to the fictitious corporation, this

kind of information flow would be analogous to the CEO’s being in some

form of dialogue with the manager about his or her ideas, such that the

CEO has influence upon the manager’s ideas, and vice versa.
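The contrast between a serial and an interactive hierarchy can be made concrete with the corporation analogy. The following is a minimal sketch, assuming a deliberately crude notion of feedback: the function names and the rule that the CEO's feedback filters the manager's ideas are hypothetical devices chosen here to show information moving back down the hierarchy, not claims about how feedback operates in the visual system.

```python
def serial_flow(worker_ideas, manager_own_ideas):
    """One-way flow: worker -> manager -> CEO, with no feedback."""
    manager_ideas = worker_ideas + manager_own_ideas
    ceo_receives = list(manager_ideas)
    return ceo_receives


def interactive_flow(worker_ideas, manager_own_ideas, ceo_feedback):
    """Two-way flow: what the CEO thinks changes what the manager holds."""
    manager_ideas = worker_ideas + manager_own_ideas
    ceo_receives = list(manager_ideas)
    # Feedback step (hypothetical rule): ideas the CEO does not endorse
    # are dropped at the manager's level.
    manager_after_feedback = [idea for idea in manager_ideas if ceo_feedback(idea)]
    return ceo_receives, manager_after_feedback


worker = ["w1", "w2"]    # the worker's two ideas
manager = ["m1", "m2"]   # the manager's two additional ideas

print(serial_flow(worker, manager))
# ['w1', 'w2', 'm1', 'm2']

print(interactive_flow(worker, manager, lambda idea: idea != "m2"))
# (['w1', 'w2', 'm1', 'm2'], ['w1', 'w2', 'm1'])
```

In the serial case the lower levels are never altered by what happens above them; in the interactive case the state of the lower level depends on the higher level's response, which parallels the way categorization influences which facial features are subsequently attended to.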

In the fi rst chapter, I established that an organism is a living entity, the

components of which are hierarchically organized in subsystems and processes

operating so as to achieve particularized and generalized homeostasis

(HOV). The subsystems and processes possess certain properties,

including abilities to exchange data fl exibly, convert data to information

in a selection process, integrate information, and process information from

environments. In the second chapter, after using HOV and as-if realism to

give credence to nomological and representational emergence, respectively,

I argued that the Cummins organizational view and the Griffi ths/Godfrey-

Smith view can be made compatible with one another in providing a

complete defi nition of biological function. We should expect that the traits

of an organism function the way they do because such traits presently

contribute to the overall organization of the organism (Cummins) as well

as were selected for in the organism’s species’ recent ancestry (Griffi ths/

Godfrey-Smith). The work of the fi rst two chapters was accomplished for

the twofold purpose of giving further elucidation to Mayr’s description of

organisms as hierarchically organized systems that operate on the basis of

historically acquired programs of information, as well as ratifying Plotkin’s

claim that biological phenomena only make complete sense in light of

evolutionary theory.

In the next chapter, I deal with the evolution of the mammalian visual

system. In this chapter, building upon the work of the previous two chapters,

I show how the processes associated with vision in mammals comprise

a hierarchically organized system exhibiting the same kinds of properties

of information exchange, selectivity, and integration found in organisms

in general (also see Arp, 2005b). I restrict my analysis of the brain to the

primary processes and mechanisms associated with the mammalian visual system and visual cognition. I do this for three reasons. First, there is much

empirical evidence supporting our understanding of the mammalian visual

system’s structure and layout (van Essen, 1985, 1997; van Essen &

Maunsell, 1983; van Essen & Gallant, 1994; van Essen, Anderson, & Felleman,

1992; van Essen, Anderson, & Olshausen, 1994; van Essen et al.,

1998; Allman & Kaas, 1971; Desimone & Ungerleider, 1989; Mishkin,

Ungerleider, & Macko, 1983; Rueckl, Cave, & Kosslyn, 1989; Casagrande

& Kaas, 1994; Kosslyn & Koenig, 1995). Second, the visual system is present

in many kinds of vertebrate species thought to be homologous (i.e., having

evolved from a common ancestor) to human beings (Kaas, 1993, 1995,

1996; Northcutt & Kaas, 1995; Preuss, Qi, & Kaas, 1999; Harvey & Pagel,

1991; Desimone, 1992; Desimone, Albright, Gross, & Bruce, 1984; Karten

& Shimazu, 1989; Butler & Hodos, 1996; Tyler et al., 1998). Finally, I restrict

my scope to the visual system because it plays a central role in the evolutionary

account I give of the progression from noncognitive visual processing

to conscious cognitive visual processing in terms of scenario

visualization. I fortify what thinkers like Barton (1998), Crick (1994),

Carruthers (2002), and Allman (1977, 1982, 2000) have maintained,

namely, that visual processing is an important factor in the evolution of

conscious behavior, including creative problem solving.

The mammalian visual system is situated within the vertebrate nervous

system, while at the same time it is composed of neurons that are specialized

in their own processes. Recalling the schematization of triangles in

fi gure 1.1 in the fi rst chapter, the visual system is like a medium-sized triangle

made up of smaller triangles (the neuronal processes), existing in a

larger triangle (the nervous system) along with other systems like the auditory,

olfactory, and so forth. All of these triangles exist within the largest

triangle (the organism) as is schematized in fi gure 3.1. There is an elegant

consistency in the hierarchical organization exhibited from the microlevel

of the neuron to the macrolevel of the vertebrate nervous system. This

consistency is echoed in Bear, Connors, & Pardiso’s (2001, p. 161) claim

that the “signaling network within a single neuron resembles in some ways

the neural networks of the brain itself.” Hierarchies exist within hierarchies,

and, as we will see, the visual system is one of those hierarchies that

functions so as to aid in producing the architectonic organization of the

nervous system of an animal.

In the fi rst chapter, I proposed that these hierarchies are able to interact

with one another because of internal–hierarchical data exchange, whereby

data—the raw material that are of the kind that have the potential to

become useful for a process or operation—are exchanged between and

among the processes and subsystems at various levels of operation in an

organism. In their textbook devoted to the principles of neuroscience,

Kandel et al. (2000, p. 353) describe the processes associated with perception

in the cerebral cortex using a hierarchical model: “Sensory information

is fi rst received and interpreted by the primary sensory areas, then

sent to unimodal association areas, and fi nally to the multimodal sensory

areas. At each successive stage of this stream more complex analysis is

achieved, culminating eventually, as with vision, for example, in object

and pattern recognition in the inferotemporal cortex.”

Kandel et al.’s text is a standard work in neuroscience, and I use it as my

primary reference throughout this book. Kandel et al. actually divvy up

the hierarchy of sensory systems into four parts, namely, (1) the primary

sensory areas, (2) the unimodal areas, (3) the unimodal association areas,

and (4) the multimodal association areas.

The primary sensory areas act as base levels, and they refer to the parts

and processes associated with information that is initially communicated

to the spinal cord and/or brain through one of the fi ve sensory modalities,

namely, touch, hearing, taste, smell, and vision. For example, in the

visual system the primary sensory area comprises the eye, the lateral

geniculate nucleus (LGN), and the primary visual cortex located in the

occipital lobe of the brain. The unimodal areas build upon the data

The Organism

The Auditory System

The Visual System

The Olfactory System

The Vertebrate Nervous System

Neuronal Processes

Figure 3.1

The visual system hierarchy in relation to the organism received from some prior particular primary sensory area and refer to the

parts and processes associated with a higher level integration of the data

received from one of the primary sensory areas. In the visual system,

there are two primary unimodal areas that process information concerning

where an object is and what an object is, located along trajectories

between the occipital lobe and parietal and temporal regions, respectively.

The unimodal association areas, in turn, refer to parts and processes associated

with an even higher level integration of the data received from

two or more unimodal areas. In the visual system, the unimodal association

area integrates data about the color, motion, and form of objects

and is located in the occiptotemporal (also called occipitotemporal) area of

the brain. Finally, the multimodal association areas refer to parts and

processes associated with integrating the data received from the unimodal

association areas and, depending upon the sensory modality, process this

information in either the parietotemporal, parietal, temporal, and/or

frontal areas of the brain.

Having given this general overview of the hierarchy concerning perception

in the cerebral cortex and related areas, we now can give a

more specifi ed description of the visual hierarchy, along with its components

and processes. The components of the visual hierarchy are

comprised of groups of specialized neurons that “fi re” according to certain

external and internal stimulus cues, and the various processes of the

visual hierarchy are active when an object “comes into view,” as it were,

namely, when an object is recognized as present in a mammal’s visual

fi eld. In essence, what follows is a description of the neural wiring and

functioning associated with mammalian object recognition in the visual

system.

The primary sensory area of the visual system comprises the pathway

that starts with the retina of the eye and projects through the LGN of

the thalamus to the primary visual cortex of the occipital lobe (V1 or

Brodmann area 17). Photons of light are transduced into electrical signals

by the photoreceptor neurons that lie on the innermost layer of the retina

known as rods and cones. Rods are sensitive to dim light, while cones are

sensitive to brighter light. The photoreceptors make synapses with other

kinds of neurons known as horizontal and bipolar cells. The horizontal

cells primarily are responsible for the center-surround organization of the

receptive field of the bipolar cell. The bipolar cells receive synapses from

photoreceptors, horizontal cells, and other neurons known as amacrine

cells and relay data from the photoreceptors to the ganglion cells, which

send their axons to the brain via the optic nerve. The ganglion cells project

to a number of sites, including several cortical areas (by way of the thalamus),

the hypothalamus, and the midbrain. The major cortical projection is via the

LGN of the thalamus to the primary visual cortex in the occipital lobe

(Kandel et al., 2000; Zigmond, Bloom, Landis, Roberts, Squire, & Wooley,

1999; Bear et al., 2001).

The LGN consists of six layers in primates. The inner two layers, with

their large neurons, form the magnocellular laminae (literally, big-celled

layers); while the remaining four layers, with their smaller neurons, constitute

the parvocellular laminae (small-celled layers). Intercalated between

these principal laminae are the koniocellular neurons (K cells). In their

firing responses, the magnocellular neurons (M cells) are sensitive to

motion especially, while the parvocellular neurons (P cells) are responsive

to color.

Like the LGN, the primary visual cortex in the occipital lobe (again,

known as V1 or Brodmann area 17) is made up of six primary layers in

primates. The LGN mainly projects to layer IV of V1, and to a lesser extent

to layer VI, with the M and P channels having different synaptic targets

within these laminae. There is also a projection from cells in the interlaminar

part of the LGN directly to layers II and III of V1. The layer IV

neurons project on to adjacent neurons in such a way as to form what are

known as orientation-specific columns, ocular-dominance columns, and

blobs. Orientation-specific columns are responsible for the decomposition

of objects in the visual field into short line segments of varying

orientation. Ocular-dominance columns are responsible for the combination of

input from the two eyes so as to perceive the depth associated with an

object and its background. Blobs are responsible for processing wavelength

information, which ultimately contributes to the recognition of various

colors of objects.

The occipital lobe is split into many visual-related areas, each processing

an aspect of an object in the visual field. V1 is responsible for initial visual

processing and can be subdivided into different subregions, each containing

a full representation of the contralateral visual field.

However, after the initial processing in V1, the processing that takes place

in other regions of the occipital lobe is more specialized: V2 is responsible

for stereo vision, V3 for distance, V4 for color, V5 for motion, and V6 for

object position. Van Essen et al. (1992) have identified more than thirty

distinct visual areas in the macaque monkey. Through positron-emission

tomography (PET) and functional magnetic resonance imaging (fMRI)

scans, Zeki, Watson, Lueck, Friston, Kennard, & Frackowiak (1991) and

Sereno et al. (1995) have demonstrated that there are multiple visual areas

in humans devoted to specific analysis of the properties of an object in the

visual field.

So far, I have described what Kandel et al. would call the primary

sensory area of the visual hierarchy. From this area, another level is added

to the hierarchy as cortical projections are laid out along two visual unimodal

areas, namely, the M cell and P cell pathways. The M cell pathway

is also known as the parietal or dorsal pathway, and it consists of visual

areas laid out along a trajectory from the occipital region, through V1, V2,

V3, V5, and V6, to the parietal region of the brain. Research suggests that

the M cell pathway is responsible for guiding our actions in our visual

environment, since depth, motion, and object position—that is, where an

object is, independent of what the object is—appear to be processed along

its stream. Conversely, the P cell pathway is also known as the temporal,

or ventral, pathway, and it consists of visual areas laid out along a trajectory

from the occipital region, through V1, V2, and V4, to the temporal

region of the brain. Research suggests that the P cell pathway is responsible

for the color and form recognition of an object—that is, what an object is,

independent of where the object is (see Desimone & Ungerleider, 1989;

Ungerleider & Mishkin, 1982; Ungerleider & Haxby, 1994; Goodale et al.,

1994).
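
The division of labor just described can be summarized in a small data structure. What follows is only an illustrative sketch in Python: the area orderings and the features assigned to each pathway are copied from the paragraph above, while the names `STREAMS`, `shared_areas`, and `stream_for` are hypothetical conveniences and not part of any neuroscience library.

```python
# Illustrative sketch: area orderings and features are copied from the text;
# the names STREAMS, shared_areas, and stream_for are hypothetical.
STREAMS = {
    "dorsal": {   # M cell ("where") pathway
        "areas": ["V1", "V2", "V3", "V5", "V6"],
        "target": "parietal",
        "features": {"depth", "motion", "object position"},
    },
    "ventral": {  # P cell ("what") pathway
        "areas": ["V1", "V2", "V4"],
        "target": "temporal",
        "features": {"color", "form"},
    },
}


def shared_areas(streams):
    """Return the areas every stream passes through, in the order of the first stream."""
    first, *rest = (s["areas"] for s in streams.values())
    return [area for area in first if all(area in other for other in rest)]


def stream_for(feature, streams):
    """Return the name of the stream said to process a feature, or None."""
    for name, stream in streams.items():
        if feature in stream["features"]:
            return name
    return None


if __name__ == "__main__":
    print(shared_areas(STREAMS))          # areas common to both streams
    print(stream_for("motion", STREAMS))  # stream said to handle motion
```

On this encoding the two pathways share their earliest stations and diverge afterward, which is one way of picturing how a single primary sensory area can feed two unimodal areas.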

Yet another level is added to the visual hierarchy as neurons from

the parietal and temporal visual unimodal areas project to the visual

unimodal association area in the occipitotemporal cortex of the brain. This

area is considered more complex than the unimodal areas because neurons

there are involved in integrating the processed data received from the

parietal and temporal unimodal areas concerning color, motion, depth,

form, distance, and the like. Research has shown that there is a division

of labor concerning an animal’s abilities to distinguish what an object

is from where an object is. However, there are times when an animal

must perform both of these tasks, and given the neuronal projections

from the parietal and temporal areas to this common site in the occipitotemporal

cortex, it makes sense that an animal be able to integrate

visual information about what and where an object is in its visual field

at the same time.

At the highest level of the visual hierarchy, the visual unimodal association

area projects to the multimodal association areas of the prefrontal,

parietotemporal, and limbic cortices. This is the level at which visual information

is integrated with other sensory information, as well as where

motor planning, attention, emotion, language production, and judgment

take place.

There are a number of other parts of the central nervous system (CNS)

that exchange data with the visual system, including the posterior parietal

cortex, the subcortical structures of the hypothalamus, and the upper brainstem.

The neurons in the posterior parietal cortex respond to stimuli of

interest and are probably involved in visual fixation and tracking. The

superior colliculus in the midbrain is a multilayered structure wherein the

outer layers are involved in mapping the visual field, the intermediate areas

are involved in saccadic eye movements, and the deeper layers are involved

with more complex sensory integration involving visual, auditory, and

somatosensory stimuli. There is a projection from the optic tract to the

pretectal nuclei of the midbrain that, in turn, projects to the Edinger–

Westphal nucleus. This projection provides the parasympathetic (autonomic)

input to the pupil of the eye, allowing it to constrict. Also, there

is a direct retinal input to the suprachiasmatic nucleus of the hypothalamus

that is important in the generation and control of circadian

rhythms.

Other extrastriate cortical areas receive projections from the LGN, as well

as from the pulvinar region of the thalamus. An area in the inferotemporal (IT)

cortex has been found to respond selectively to faces. This area has been

described as face-selective without implying that when one recognizes a

face, only cells in this area participate in this recognition (Tovee &

Cohen-Tovee, 1993; Tovee, 1998). IT cortex also has been shown to be

involved in storing visual memories, as well as encoding the properties of

objects (Kosslyn & Koenig, 1995). Finally, the visual unimodal and multimodal

association areas are linked up with other areas in the frontal, temporal,

and parietal lobes and the hippocampus so that visual images can

be made, stored, recalled, inspected, and possibly utilized in planning,

judging, feeling, and other complex voluntary activity.

As has been said, the systematic distributions of parts and processes in

the visual system, and related systems, are hierarchically organized. The

multimodal areas build upon information received from the unimodal

areas, and the unimodal areas build upon information received from the

primary sensory area, as is schematized in figure 3.2. It should be clear

from the aforementioned description of the neural wiring and functioning

associated with the mammalian visual system that there must be a massive

coordination and organization of processes in the CNS for a seemingly

simple activity—like recognizing an object—to occur. When I visited him

in his lab, Michael Ariel, a neuroscientist at Saint Louis University in St.

Louis who specializes in the visual systems of turtles, affirmed this point

(see, e.g., Martin, Kogo, Fan, & Ariel, 2003). The realization that the

nervous system is such a grandiose architectonic has caused Gray (1999,

p. 31) to maintain, “The inescapable conclusion is that sensory, cognitive,

and motor processes result from parallel interactions among large populations

of neurons distributed among multiple cortical and subcortical

structures.”

Figure 3.2

The primary sensory, unimodal, and multimodal areas and pathways of the visual system. [Panel “The Primary Visual Area and Pathways”: Eye, LGN, V1 in the Occipital Cortex. Panel “The Unimodal Visual Area and Pathways”: Eye, Occipital Cortex, Where System, What System, Occipitotemporal Cortex.]

Data are exchanged at the various levels of the visual system and CNS

and, because of these exchanges, an animal is able to form a coherent

picture of an object in its visual field. However, a final qualification must

be made about the hierarchical processes of the visual system. We must

draw a distinction between a serial hierarchy and a dynamic, or interactive,

hierarchy. In a serial hierarchy, information flows in a one-way direction

from the lowest level to the highest level of the hierarchy. Conversely, in

an interactive hierarchy, information flows bidirectionally among and

between the lower and higher levels of the hierarchy.

Consider a small, fictitious corporation consisting of a worker, a manager,

and a CEO. The worker is the lowest member of the hierarchy, the manager

is one step above the worker, and the CEO is at the top level of the hierarchy.

The worker communicates two ideas to the manager who, in turn,

communicates these two ideas plus two more of his own ideas to the CEO.

Once the first two ideas are communicated from worker to manager, there

is no further contact between the two people; likewise, once the four ideas

are communicated from manager to CEO, there is no further contact

between those two people. This worker → manager → CEO setup would

be an example of one-way information flow in a serial hierarchy.

Figure 3.2 (continued)

[Panel “The Multimodal Visual Area and Pathways”: Eye, Occipitotemporal Cortex, Parietotemporal Cortices, Limbic Cortex, Prefrontal Cortex.]

There is only one sense in which the visual system can be considered as

a serial hierarchy; otherwise it is most appropriately envisioned as an

interactive hierarchy. The information flow from retina through LGN to

V1 occurs in a one-way direction, like the flow of information from worker

to manager to CEO in the small corporation. There is no information

feedback from V1 to the retina, just as there is no information feedback

from CEO to worker in our fictitious corporation. This makes sense, since

the inputs of primary sensory areas themselves are passive and automatic

in-takers of information (see Sekuler & Blake, 2002).

Unlike the one-way flow of information between retina and V1, there is

the possibility for a dynamic, interactive, two-way flow of information

between and among the primary visual, unimodal, and multimodal areas

of the visual hierarchy. For example, Kosslyn & Koenig (1995) present evidence

that emotions (present at a higher level in the hierarchy) can affect

the visual system’s performance in terms of visual priming and coded

anticipation of certain visual scenes (present at a lower level in the visual

hierarchy).

Also, in their experiments with monkeys and humans, Sigala & Logothetis

(2002) show that visual categorization determines, in many ways,

what specific features of a face will be focused upon. In effect, the experiments

show that if a face is categorized generally in a certain way as expressive

of some particular emotion, then this categorization will influence

what particular features of a face—for example, basic lines, symmetries, and

the like—the animal subsequently will focus upon. The neural correlates

of visual categorization are found higher up in the visual hierarchy associated

with the occipitotemporal and parietotemporal cortices, while the

neural correlates concerning the processing of particular features of the

face, in terms of lines and symmetries, are found at a more basic spot in

the visual hierarchy associated with the trajectory between the temporal

and occipital cortices of the what system.

Here, we have an instance of a dynamic, interactive flow of information

in the visual hierarchy because, in addition to information regarding a

face’s particular features flowing from the what system to the occipitotemporal

and parietotemporal cortices in facial categorization, there is flow of

information back from these cortices to the what system in terms of this

categorization’s determination of the particular features of a face that are

focused upon by an animal. Referring to the fictitious corporation, this

kind of information flow would be analogous to the CEO’s being in some

form of dialogue with the manager about his or her ideas, such that the

CEO has influence upon the manager’s ideas, and vice versa.
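
The distinction between a serial and an interactive hierarchy can also be stated as a simple test on the direction of information flow. The following Python sketch is offered only as an illustration, under the assumption that a hierarchy can be encoded as an ordered list of levels plus a list of (source, destination) flows. The level names come from the discussion above, whereas the function name `classify_hierarchy` and the tuple encoding are hypothetical.

```python
# Illustrative sketch: level names come from the text; classify_hierarchy
# and the (source, destination) tuple encoding are hypothetical.

def classify_hierarchy(levels, flows):
    """Label a hierarchy 'serial' or 'interactive'.

    levels: names ordered from lowest to highest.
    flows:  (source, destination) pairs saying who sends information to whom.
    A hierarchy counts as interactive as soon as any flow runs from a higher
    level back down to a lower one.
    """
    rank = {name: position for position, name in enumerate(levels)}
    has_feedback = any(rank[src] > rank[dst] for src, dst in flows)
    return "interactive" if has_feedback else "serial"


# The fictitious corporation, first without and then with CEO-manager dialogue.
corporation = ["worker", "manager", "CEO"]
one_way = [("worker", "manager"), ("manager", "CEO")]
with_dialogue = one_way + [("CEO", "manager")]

# The retina-to-V1 pathway as described in the text (no feedback to the retina).
early_vision = ["retina", "LGN", "V1"]
early_flows = [("retina", "LGN"), ("LGN", "V1")]

# Facial features and facial categorization, each informing the other.
face_levels = ["what system", "occipitotemporal and parietotemporal cortices"]
face_flows = [
    ("what system", "occipitotemporal and parietotemporal cortices"),  # features feed categorization
    ("occipitotemporal and parietotemporal cortices", "what system"),  # categorization guides features
]

if __name__ == "__main__":
    print(classify_hierarchy(corporation, one_way))
    print(classify_hierarchy(corporation, with_dialogue))
    print(classify_hierarchy(early_vision, early_flows))
    print(classify_hierarchy(face_levels, face_flows))
```

On this encoding a single downward flow is enough to make a hierarchy interactive, which matches the point that the CEO need only be in dialogue with the manager for the corporation to cease being a serial hierarchy.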