3.1 The Hierarchical Organization of the Mammalian Visual System
In the first chapter, I established that an organism is a living entity, the
components of which are hierarchically organized in subsystems and processes
operating so as to achieve particularized and generalized homeostasis
(HOV). The subsystems and processes possess certain properties,
including abilities to exchange data flexibly, convert data to information
in a selection process, integrate information, and process information from
environments. In the second chapter, after using HOV and as-if realism to
give credence to nomological and representational emergence, respectively,
I argued that the Cummins organizational view and the Griffiths/Godfrey-Smith
view can be made compatible with one another in providing a
complete definition of biological function. We should expect that the traits
of an organism function the way they do because such traits presently
contribute to the overall organization of the organism (Cummins) as well
as were selected for in the organism’s species’ recent ancestry
(Griffiths/Godfrey-Smith). The work of the first two chapters was accomplished for
the twofold purpose of giving further elucidation to Mayr’s description of
organisms as hierarchically organized systems that operate on the basis of
historically acquired programs of information, as well as ratifying Plotkin’s
claim that biological phenomena only make complete sense in light of
evolutionary theory.
In the next chapter, I deal with the evolution of the mammalian visual
system. In this chapter, building upon the work of the previous two chapters,
I show how the processes associated with vision in mammals comprise
a hierarchically organized system exhibiting the same kinds of properties
of information exchange, selectivity, and integration found in organisms
in general (also see Arp, 2005b). I restrict my analysis of the brain to the
primary processes and mechanisms associated with the mammalian visual system and visual cognition. I do this for three reasons. First, there is much
empirical evidence supporting our understanding of the mammalian visual
system’s structure and layout (van Essen, 1985, 1997; van Essen &
Maunsell, 1983; van Essen & Gallant, 1994; van Essen, Anderson, & Felleman,
1992; van Essen, Anderson, & Olshausen, 1994; van Essen et al.,
1998; Allman & Kaas, 1971; Desimone & Ungerleider, 1989; Mishkin,
Ungerleider, & Macko, 1983; Rueckl, Cave, & Kosslyn, 1989; Casagrande
& Kaas, 1994; Kosslyn & Koenig, 1995). Second, the visual system is present
in many kinds of vertebrate species thought to be homologous (i.e., having
evolved from a common ancestor) to human beings (Kaas, 1993, 1995,
1996; Northcutt & Kaas, 1995; Preuss, Qi, & Kaas, 1999; Harvey & Pagel,
1991; Desimone, 1992; Desimone, Albright, Gross, & Bruce, 1984; Karten
& Shimazu, 1989; Butler & Hodos, 1996; Tyler et al., 1998). Finally, I restrict
my scope to the visual system because it plays a central role in the evolutionary
account I give of the progression from noncognitive visual processing
to conscious cognitive visual processing in terms of scenario
visualization. I fortify what thinkers like Barton (1998), Crick (1994),
Carruthers (2002), and Allman (1977, 1982, 2000) have maintained,
namely, that visual processing is an important factor in the evolution of
conscious behavior, including creative problem solving.
The mammalian visual system is situated within the vertebrate nervous
system, while at the same time it is composed of neurons that are specialized
in their own processes. Recalling the schematization of triangles in
figure 1.1 in the first chapter, the visual system is like a medium-sized triangle
made up of smaller triangles (the neuronal processes), existing in a
larger triangle (the nervous system) along with other systems like the auditory,
olfactory, and so forth. All of these triangles exist within the largest
triangle (the organism) as is schematized in figure 3.1. There is an elegant
consistency in the hierarchical organization exhibited from the microlevel
of the neuron to the macrolevel of the vertebrate nervous system. This
consistency is echoed in Bear, Connors, & Paradiso’s (2001, p. 161) claim
that the “signaling network within a single neuron resembles in some ways
the neural networks of the brain itself.” Hierarchies exist within hierarchies,
and, as we will see, the visual system is one of those hierarchies that
functions so as to aid in producing the architectonic organization of the
nervous system of an animal.
In the first chapter, I proposed that these hierarchies are able to interact
with one another because of internal–hierarchical data exchange, whereby
data (raw material with the potential to become useful for a process or
operation) are exchanged between and
among the processes and subsystems at various levels of operation in an
organism. In their textbook devoted to the principles of neuroscience,
Kandel et al. (2000, p. 353) describe the processes associated with perception
in the cerebral cortex using a hierarchical model: “Sensory information
is first received and interpreted by the primary sensory areas, then
sent to unimodal association areas, and finally to the multimodal sensory
areas. At each successive stage of this stream more complex analysis is
achieved, culminating eventually, as with vision, for example, in object
and pattern recognition in the inferotemporal cortex.”
Kandel et al.’s text is a standard work in neuroscience, and I use it as my
primary reference throughout this book. Kandel et al. actually divvy up
the hierarchy of sensory systems into four parts, namely, (1) the primary
sensory areas, (2) the unimodal areas, (3) the unimodal association areas,
and (4) the multimodal association areas.
[Figure 3.1. The visual system hierarchy in relation to the organism. Diagram labels: The Organism; The Vertebrate Nervous System; The Auditory System; The Visual System; The Olfactory System; Neuronal Processes.]
The primary sensory areas act as base levels, and they refer to the parts
and processes associated with information that is initially communicated
to the spinal cord and/or brain through one of the five sensory modalities,
namely, touch, hearing, taste, smell, and vision. For example, in the
visual system the primary sensory area comprises the eye, the lateral
geniculate nucleus (LGN), and the primary visual cortex located in the
occipital lobe of the brain. The unimodal areas build upon the data
received from some prior particular primary sensory area and refer to the
parts and processes associated with a higher level integration of the data
received from one of the primary sensory areas. In the visual system,
there are two primary unimodal areas that process information concerning
where an object is and what an object is, located along trajectories
between the occipital lobe and parietal and temporal regions, respectively.
The unimodal association areas, in turn, refer to parts and processes associated
with an even higher level integration of the data received from
two or more unimodal areas. In the visual system, the unimodal association
area integrates data about the color, motion, and form of objects
and is located in the occipitotemporal area of
the brain. Finally, the multimodal association areas refer to parts and
processes associated with integrating the data received from the unimodal
association areas and, depending upon the sensory modality, process this
information in either the parietotemporal, parietal, temporal, and/or
frontal areas of the brain.
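The four-level scheme just described can be written down as a small ordered data structure. The following is only an illustrative sketch, not a model drawn from Kandel et al.: the names `Level`, `VISUAL_HIERARCHY`, and `next_level` are hypothetical, and the region lists simply restate the areas named in the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Level:
    """One level of the sensory hierarchy, as applied to vision (hypothetical type)."""
    name: str
    regions: Tuple[str, ...]
    builds_on: str


# Ordered from the base level to the top level, following the text.
VISUAL_HIERARCHY = (
    Level("primary sensory area",
          ("eye", "lateral geniculate nucleus (LGN)", "primary visual cortex"),
          "input arriving through the visual modality"),
    Level("unimodal areas",
          ("where pathway (occipital to parietal)",
           "what pathway (occipital to temporal)"),
          "data from the primary sensory area"),
    Level("unimodal association area",
          ("occipitotemporal area",),
          "color, motion, and form data from the unimodal areas"),
    Level("multimodal association areas",
          ("parietotemporal", "parietal", "temporal", "frontal"),
          "data from the unimodal association areas"),
)


def next_level(index: int) -> Optional[str]:
    """Return the name of the level that builds on level `index`, or None at the top."""
    if 0 <= index < len(VISUAL_HIERARCHY) - 1:
        return VISUAL_HIERARCHY[index + 1].name
    return None


if __name__ == "__main__":
    for i, level in enumerate(VISUAL_HIERARCHY):
        print(f"{i + 1}. {level.name}: builds on {level.builds_on}")
```

The tuple ordering carries the hierarchical claim: each level is defined by what it receives from the level beneath it, which is all that `next_level` encodes.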
Having given this general overview of the hierarchy concerning perception
in the cerebral cortex and related areas, we now can give a
more specified description of the visual hierarchy, along with its components
and processes. The components of the visual hierarchy are
comprised of groups of specialized neurons that “fire” according to certain
external and internal stimulus cues, and the various processes of the
visual hierarchy are active when an object “comes into view,” as it were,
namely, when an object is recognized as present in a mammal’s visual
field. In essence, what follows is a description of the neural wiring and
functioning associated with mammalian object recognition in the visual
system.
The primary sensory area of the visual system comprises the pathway
that starts with the retina of the eye and projects through the LGN of
the thalamus to the primary visual cortex of the occipital lobe (V1 or
Brodmann area 17). Photons of light are transduced into electrical signals
by the photoreceptor neurons, known as rods and cones, that lie in the
outermost layer of the retina. Rods are sensitive to dim light, while cones are
sensitive to brighter light. The photoreceptors make synapses with other
kinds of neurons known as horizontal and bipolar cells. The horizontal
cells primarily are responsible for the center-surround organization of the
receptive field of the bipolar cell. The bipolar cells receive synapses from
photoreceptors, horizontal cells, and other neurons known as amacrine
cells and relay data from the photoreceptors to the ganglion cells, which
send their axons to the brain via the optic nerve. The ganglion cells project to a number of sites, including several cortical areas through the thalamus,
the hypothalamus, and midbrain. The major cortical projection is via the
LGN of the thalamus to the primary visual cortex in the occipital lobe
(Kandel et al., 2000; Zigmond, Bloom, Landis, Roberts, Squire, & Wooley,
1999; Bear et al., 2001).
The LGN consists of six layers in primates. The inner two layers, with
their large neurons, form the magnocellular laminae (literally, big-celled
layers); while the remaining four layers, with their smaller neurons, constitute
the parvocellular laminae (small-celled layers). Intercalated between
these principal laminae are the koniocellular neurons (K cells). In their
firing responses, the magnocellular neurons (M cells) are sensitive to
motion especially, while the parvocellular neurons (P cells) are responsive
to color.
Like the LGN, the primary visual cortex in the occipital lobe (again,
known as V1 or Brodmann area 17) is made up of six primary layers in
primates. The LGN mainly projects to layer IV of V1, and to a lesser extent
to layer VI, with the M and P channels having different synaptic targets
within these laminae. There is also a projection from cells in the intralaminar
part of the LGN directly to layers II and III of V1. The layer IV
neurons project on to adjacent neurons in such a way as to form what are
known as orientation-specific columns, ocular-dominance columns, and
blobs. Orientation-specific columns are responsible for the decomposition
of objects of the visual field into short line segments of varying orientation
form. Ocular-dominance columns are responsible for the combination of
input from the two eyes so as to perceive the depth associated with an
object and its background. Blobs are responsible for processing wavelength
information, which ultimately contributes to the recognition of various
colors of objects.
The occipital lobe is split into many visual-related areas, each processing
an aspect of an object in the visual field. V1 is responsible for initial visual
processing and can be subdivided into different subregions, each containing
a full representation of the visual field for the contralateral world.
However, after the initial processing in V1, the processing that takes place
in other regions of the occipital lobe is more specialized: V2 is responsible
for stereo vision, V3 for distance, V4 for color, V5 for motion, and V6 for
object position. Van Essen et al. (1992) have recorded more than thirty
primary visual areas in the macaque monkey. Through positron-emission
tomography (PET) and functional magnetic resonance imaging (fMRI)
scans, Zeki, Watson, Weck, Friston, Kennard, & Frackowiak (1991) and
Sereno et al. (1995) have demonstrated that there are multiple visual areas
in humans devoted to specific analysis of the properties of an object in the
visual field.
So far, I have described what Kandel et al. would call the visual primary
sensory area of the visual hierarchy. From this area, another level is added
to the hierarchy as cortical projections are laid out along two visual unimodal
areas, namely, the M cell and P cell pathways. The M cell pathway
is also known as the parietal or dorsal pathway, and it consists of visual
areas laid out along a trajectory from the occipital region, through V1, V2,
V3, V5, and V6, to the parietal region of the brain. Research suggests that
the M cell pathway is responsible for guiding our actions in our visual
environment, since depth, motion, and object position—that is, where an
object is, independent of what the object is—appear to be processed along
its stream. Conversely, the P cell pathway is also known as the temporal,
or ventral, pathway, and it consists of visual areas laid out along a trajectory
from the occipital region, through V1, V2, and V4, to the temporal
region of the brain. Research suggests that the P cell pathway is responsible
for the color and form recognition of an object—that is, what an object is,
independent of where the object is (see Desimone & Ungerleider, 1989;
Ungerleider & Mishkin, 1982; Ungerleider & Haxby, 1994; Goodale et al.,
1994).
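The two unimodal pathways can be compared directly by listing the areas each one passes through. This is a minimal sketch that restates the text rather than any anatomical database: the names `DORSAL`, `VENTRAL`, `AREA_ROLE`, and `pathways_through` are hypothetical, and the role assigned to each area is the one given in this chapter.

```python
# Areas along each trajectory, in the order given in the text.
DORSAL = ("V1", "V2", "V3", "V5", "V6")   # M cell / parietal pathway: "where"
VENTRAL = ("V1", "V2", "V4")              # P cell / temporal pathway: "what"

# Specializations as stated in this chapter.
AREA_ROLE = {
    "V1": "initial visual processing",
    "V2": "stereo vision",
    "V3": "distance",
    "V4": "color",
    "V5": "motion",
    "V6": "object position",
}


def pathways_through(area):
    """Return the names of the pathways whose trajectory includes `area`."""
    found = []
    if area in DORSAL:
        found.append("dorsal")
    if area in VENTRAL:
        found.append("ventral")
    return found


# Areas that both streams share before they diverge.
SHARED = [area for area in DORSAL if area in VENTRAL]

if __name__ == "__main__":
    for area, role in AREA_ROLE.items():
        print(area, role, pathways_through(area))
```

Laid out this way, the division of labor is visible at a glance: the streams share V1 and V2, after which motion, distance, and position are handled on the dorsal side and color on the ventral side.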
Yet another level is added to the visual hierarchy as neurons from
the parietal and temporal visual unimodal areas project to the visual
unimodal association area in the occipitotemporal cortex of the brain. This
area is considered more complex than the unimodal areas because neurons
there are involved in integrating the processed data received from the
parietal and temporal unimodal areas concerning color, motion, depth,
form, distance, and the like. Research has shown that there is a division
of labor concerning an animal’s abilities to distinguish what an object
is from where an object is. However, there are times when an animal
must perform both of these tasks, and given the neuronal projections
from the parietal and temporal areas to this common site in the occipitotemporal
cortex, it makes sense that an animal be able to integrate
visual information about what and where an object is in its visual field
at the same time.
At the highest level of the visual hierarchy, the visual unimodal association
area projects to the multimodal association areas of the prefrontal,
parietotemporal, and limbic cortices. This is the level at which visual information
is integrated with other sensory information, as well as where
motor planning, attention, emotion, language production, and judgment
take place.
There are a number of other parts of the central nervous system (CNS)
that exchange data with the visual system, including the posterior parietal
cortex, the subcortical structures of the hypothalamus, and upper brainstem.
The neurons in the posterior parietal cortex respond to stimuli of
interest and are probably involved in visual fixation and tracking. The
superior colliculus in the midbrain is a multilayered structure wherein the
outer layers are involved in mapping the visual field, the intermediate areas
are involved in saccadic eye movements, and the deeper layers are involved
with more complex sensory integration involving visual, auditory, and
somatosensory stimuli. There is a projection from the optic tract to the
pretectal nuclei of the midbrain that, in turn, projects to the Edinger–
Westphal nucleus. This projection provides the parasympathetic (i.e., autonomic)
input to the pupil of the eye, allowing it to constrict. Also, there
is a direct retinal input to the suprachiasmatic nucleus of the hypothalamus
that is important in the generation and control of circadian
rhythms.
Other extrastriate cortical areas receive projections from the LGN, as well
as the pulvinar region of the thalamus. An area in the inferotemporal (IT)
cortex has been found to respond selectively to faces. This area has been
described as face-selective without implying that when one recognizes a
face, only cells in this area participate in this recognition (Tovee &
Cohen-Tovee, 1993; Tovee, 1998). IT cortex also has been shown to be
involved in storing visual memories, as well as encoding the properties of
objects (Kosslyn & Koenig, 1995). Finally, the visual unimodal and multimodal
association areas are linked up with other areas in the frontal, temporal,
and parietal lobes and the hippocampus so that visual images can
be made, stored, recalled, inspected, and possibly utilized in planning,
judging, feeling, and other complex voluntary activity.
As has been said, the systematic distributions of parts and processes in
the visual system, and related systems, are hierarchically organized. The
multimodal areas build upon information received from the unimodal
areas, and the unimodal areas build upon information received from the
primary sensory area, as is schematized in figure 3.2. It should be clear
from the aforementioned description of the neural wiring and functioning
associated with the mammalian visual system that there must be a massive
coordination and organization of processes in the CNS for a seemingly
simple activity, like recognizing an object, to occur. When I visited him
in his lab, Michael Ariel, a neuroscientist at Saint Louis University in St.
Louis who specializes in the visual systems of turtles, affirmed this point
(see, e.g., Martin, Kogo, Fan, & Ariel, 2003). The realization that the
nervous system is such a grandiose architectonic has caused Gray (1999,
p. 31) to maintain, “The inescapable conclusion is that sensory, cognitive,
and motor processes result from parallel interactions among large populations
of neurons distributed among multiple cortical and subcortical
structures.”
[Figure 3.2. The primary sensory, unimodal, and multimodal areas and pathways of the visual system. Panel "The Primary Visual Area and Pathways": Eye; LGN; V1 in the Occipital Cortex. Panel "The Unimodal Visual Area and Pathways": Eye; Occipital Cortex; Where System; What System; Occipitotemporal Cortex.]
Data are exchanged at the various levels of the visual system and CNS
and, because of these exchanges, an animal is able to form a coherent
picture of an object in its visual field. However, a final qualification must
be made about the hierarchical processes of the visual system. We must
draw a distinction between a serial hierarchy and a dynamic, or interactive,
hierarchy. In a serial hierarchy, information flows in a one-way direction
from the lowest level to the highest level of the hierarchy. Conversely, in
an interactive hierarchy, information flows bidirectionally among and
between the lower and higher levels of the hierarchy.
Consider a small, fictitious corporation consisting of a worker, a manager,
and a CEO. The worker is the lowest member of the hierarchy, the manager
is one step above the worker, and the CEO is at the top level of the hierarchy.
The worker communicates two ideas to the manager who, in turn,
communicates these two ideas plus two more of his own ideas to the CEO.
Once the first two ideas are communicated from worker to manager, there
is no further contact between the two people; likewise, once the four ideas
are communicated from manager to CEO, there is no further contact
between those two people. This worker → manager → CEO setup would
be an example of one-way information flow in a serial hierarchy.
[Figure 3.2 (continued). Panel "The Multimodal Visual Area and Pathways": Eye; Occipitotemporal Cortex; Parietotemporal Cortices; Limbic Cortex; Prefrontal Cortex.]
There is only one sense in which the visual system can be considered as
a serial hierarchy; otherwise it is most appropriately envisioned as an
interactive hierarchy. The information fl ow from retina through LGN to
V1 occurs in a one-way direction, like the flow of information from worker
to manager to CEO in the small corporation. There is no information
feedback from V1 to the retina, just as there is no information feedback
from CEO to worker in our fictitious corporation. This makes sense, since
the inputs of primary sensory areas themselves are passive and automatic
in-takers of information (see Sekuler & Blake, 2002).
Unlike the one-way flow of information between retina and V1, there is
the possibility for a dynamic, interactive, two-way flow of information
between and among the primary visual, unimodal, and multimodal areas
of the visual hierarchy. For example, Kosslyn & Koenig (1995) present evidence
that emotions (present at a higher level in the hierarchy) can affect
the visual system’s performance in terms of visual priming and coded
anticipation of certain visual scenes (present at a lower level in the visual
hierarchy).
Also, in their experiments with monkeys and humans, Sigala & Logothetis
(2002) show that visual categorization determines, in many ways,
what specific features of a face will be focused upon. In effect, the experiments
show that if a face is categorized generally in a certain way as expressive
of some particular emotion, then this categorization will influence
what particular features of a face (for example, basic lines, symmetries, and
the like) the animal subsequently will focus upon. The neural correlates
of visual categorization are found higher up in the visual hierarchy associated
with the occipitotemporal and parietotemporal cortices, while the
neural correlates concerning the processing of particular features of the
face, in terms of lines and symmetries, are found at a more basic spot in
the visual hierarchy associated with the trajectory between the temporal
and occipital cortices of the what system.
Here, we have an instance of a dynamic, interactive flow of information
in the visual hierarchy because, in addition to information regarding a
face’s particular features flowing from the what system to the occipitotemporal
and parietotemporal cortices in facial categorization, there is flow of
information back from these cortices to the what system in terms of this
categorization’s determination of the particular features of a face that are
focused upon by an animal. Referring to the fictitious corporation, this
kind of information flow would be analogous to the CEO’s being in some
form of dialogue with the manager about his or her ideas, such that the
CEO has influence upon the manager’s ideas, and vice versa.
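The distinction between a serial and an interactive hierarchy can be stated precisely by treating each hierarchy as a set of directed links between ranked levels. The sketch below is an assumption-laden illustration, not a neural model: the function `is_serial`, the numeric ranks, and the link lists are hypothetical encodings of the corporation analogy and of the retina-to-V1 and what-system examples discussed above.

```python
def is_serial(links, rank):
    """Return True if every link carries information from a lower to a higher level.

    `links` is a list of (source, target) pairs; `rank` maps each member of the
    hierarchy to its level, with 0 as the lowest level.
    """
    return all(rank[source] < rank[target] for source, target in links)


# The fictitious corporation: ideas pass upward only.
CORP_RANK = {"worker": 0, "manager": 1, "CEO": 2}
CORP_SERIAL = [("worker", "manager"), ("manager", "CEO")]

# The same corporation once the CEO is in dialogue with the manager.
CORP_INTERACTIVE = CORP_SERIAL + [("CEO", "manager")]

# Retina through LGN to V1: the one sense in which vision is serial.
EARLY_RANK = {"retina": 0, "LGN": 1, "V1": 2}
EARLY_LINKS = [("retina", "LGN"), ("LGN", "V1")]

# Facial categorization: feature data flow upward, and categorization flows back down.
FACE_RANK = {"what system": 0, "occipitotemporal and parietotemporal cortices": 1}
FACE_LINKS = [
    ("what system", "occipitotemporal and parietotemporal cortices"),
    ("occipitotemporal and parietotemporal cortices", "what system"),
]

if __name__ == "__main__":
    print("corporation, one-way:", is_serial(CORP_SERIAL, CORP_RANK))
    print("corporation, dialogue:", is_serial(CORP_INTERACTIVE, CORP_RANK))
    print("retina to V1:", is_serial(EARLY_LINKS, EARLY_RANK))
    print("facial categorization:", is_serial(FACE_LINKS, FACE_RANK))
```

On this encoding a single downward link is enough to make a hierarchy interactive, which matches the way the text treats feedback from categorization to feature processing.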
In the fi rst chapter, I established that an organism is a living entity, the
components of which are hierarchically organized in subsystems and processes
operating so as to achieve particularized and generalized homeostasis
(HOV). The subsystems and processes possess certain properties,
including abilities to exchange data fl exibly, convert data to information
in a selection process, integrate information, and process information from
environments. In the second chapter, after using HOV and as-if realism to
give credence to nomological and representational emergence, respectively,
I argued that the Cummins organizational view and the Griffi ths/Godfrey-
Smith view can be made compatible with one another in providing a
complete defi nition of biological function. We should expect that the traits
of an organism function the way they do because such traits presently
contribute to the overall organization of the organism (Cummins) as well
as were selected for in the organism’s species’ recent ancestry (Griffi ths/
Godfrey-Smith). The work of the fi rst two chapters was accomplished for
the twofold purpose of giving further elucidation to Mayr’s description of
organisms as hierarchically organized systems that operate on the basis of
historically acquired programs of information, as well as ratifying Plotkin’s
claim that biological phenomena only make complete sense in light of
evolutionary theory.
In the next chapter, I deal with the evolution of the mammalian visual
system. In this chapter, building upon the work of the previous two chapters,
I show how the processes associated with vision in mammals comprise
a hierarchically organized system exhibiting the same kinds of properties
of information exchange, selectivity, and integration found in organisms
in general (also see Arp, 2005b). I restrict my analysis of the brain to the
primary processes and mechanisms associated with the mammalian visual system and visual cognition. I do this for three reasons. First, there is much
empirical evidence supporting our understanding of the mammalian visual
system’s structure and layout (van Essen, 1985, 1997; van Essen &
Maunsell, 1983; van Essen & Gallant, 1994; van Essen, Anderson, & Felleman,
1992; van Essen, Anderson, & Olshausen, 1994; van Essen et al.,
1998; Allman & Kaas, 1971; Desimone & Ungerleider, 1989; Mishkin,
Ungerleider, & Macko, 1983; Rueckl, Cave, & Kosslyn, 1989; Casagrande
& Kaas, 1994; Kosslyn & Koenig, 1995). Second, the visual system is present
in many kinds of vertebrate species thought to be homologous (i.e., having
evolved from a common ancestor) to human beings (Kaas, 1993, 1995,
1996; Northcutt & Kaas, 1995; Preuss, Qi, & Kaas, 1999; Harvey & Pagel,
1991; Desimone, 1992; Desimone, Albright, Gross, & Bruce, 1984; Karten
& Shimazu, 1989; Butler & Hodos, 1996; Tyler et al., 1998). Finally, I restrict
my scope to the visual system because it plays a central role in the evolutionary
account I give of the progression from noncognitive visual processing
to conscious cognitive visual processing in terms of scenario
visualization. I fortify what thinkers like Barton (1998), Crick (1994),
Carruthers (2002), and Allman (1977, 1982, 2000) have maintained,
namely, that visual processing is an important factor in the evolution of
conscious behavior, including creative problem solving.
The mammalian visual system is situated within the vertebrate nervous
system, while at the same time it is composed of neurons that are specialized
in their own processes. Recalling the schematization of triangles in
fi gure 1.1 in the fi rst chapter, the visual system is like a medium-sized triangle
made up of smaller triangles (the neuronal processes), existing in a
larger triangle (the nervous system) along with other systems like the auditory,
olfactory, and so forth. All of these triangles exist within the largest
triangle (the organism) as is schematized in fi gure 3.1. There is an elegant
consistency in the hierarchical organization exhibited from the microlevel
of the neuron to the macrolevel of the vertebrate nervous system. This
consistency is echoed in Bear, Connors, & Pardiso’s (2001, p. 161) claim
that the “signaling network within a single neuron resembles in some ways
the neural networks of the brain itself.” Hierarchies exist within hierarchies,
and, as we will see, the visual system is one of those hierarchies that
functions so as to aid in producing the architectonic organization of the
nervous system of an animal.
In the fi rst chapter, I proposed that these hierarchies are able to interact
with one another because of internal–hierarchical data exchange, whereby
data—the raw material that are of the kind that have the potential to
become useful for a process or operation—are exchanged between and
among the processes and subsystems at various levels of operation in an
organism. In their textbook devoted to the principles of neuroscience,
Kandel et al. (2000, p. 353) describe the processes associated with perception
in the cerebral cortex using a hierarchical model: “Sensory information
is fi rst received and interpreted by the primary sensory areas, then
sent to unimodal association areas, and fi nally to the multimodal sensory
areas. At each successive stage of this stream more complex analysis is
achieved, culminating eventually, as with vision, for example, in object
and pattern recognition in the inferotemporal cortex.”
Kandel et al.’s text is a standard work in neuroscience, and I use it as my
primary reference throughout this book. Kandel et al. actually divvy up
the hierarchy of sensory systems into four parts, namely, (1) the primary
sensory areas, (2) the unimodal areas, (3) the unimodal association areas,
and (4) the multimodal association areas.
The primary sensory areas act as base levels, and they refer to the parts
and processes associated with information that is initially communicated
to the spinal cord and/or brain through one of the fi ve sensory modalities,
namely, touch, hearing, taste, smell, and vision. For example, in the
visual system the primary sensory area comprises the eye, the lateral
geniculate nucleus (LGN), and the primary visual cortex located in the
occipital lobe of the brain. The unimodal areas build upon the data
The Organism
The Auditory System
The Visual System
The Olfactory System
The Vertebrate Nervous System
Neuronal Processes
Figure 3.1
The visual system hierarchy in relation to the organism received from some prior particular primary sensory area and refer to the
parts and processes associated with a higher level integration of the data
received from one of the primary sensory areas. In the visual system,
there are two primary unimodal areas that process information concerning
where an object is and what an object is, located along trajectories
between the occipital lobe and parietal and temporal regions, respectively.
The unimodal association areas, in turn, refer to parts and processes associated
with an even higher level integration of the data received from
two or more unimodal areas. In the visual system, the unimodal association
area integrates data about the color, motion, and form of objects
and is located in the occiptotemporal (also called occipitotemporal) area of
the brain. Finally, the multimodal association areas refer to parts and
processes associated with integrating the data received from the unimodal
association areas and, depending upon the sensory modality, process this
information in either the parietotemporal, parietal, temporal, and/or
frontal areas of the brain.
Having given this general overview of the hierarchy concerning perception
in the cerebral cortex and related areas, we now can give a
more specifi ed description of the visual hierarchy, along with its components
and processes. The components of the visual hierarchy are
comprised of groups of specialized neurons that “fi re” according to certain
external and internal stimulus cues, and the various processes of the
visual hierarchy are active when an object “comes into view,” as it were,
namely, when an object is recognized as present in a mammal’s visual
fi eld. In essence, what follows is a description of the neural wiring and
functioning associated with mammalian object recognition in the visual
system.
The primary sensory area of the visual system comprises the pathway
that starts with the retina of the eye and projects through the LGN of
the thalamus to the primary visual cortex of the occipital lobe (V1 or
Brodmann area 17). Photons of light are transduced into electrical signals
by the photoreceptor neurons that lie on the innermost layer of the retina
known as rods and cones. Rods are sensitive to dim light, while cones are
sensitive to brighter light. The photoreceptors make synapses with other
kinds of neurons known as horizontal and bipolar cells. The horizontal
cells primarily are responsible for the center-surround organization of the
receptive fi eld of the bipolar cell. The bipolar cells receive synapses from
photoreceptors, horizontal cells, and other neurons known as amacrine
cells and relay data from the photoreceptors to the ganglion cells, which
send their axons to the brain via the optic nerve. The ganglion cells project to a number of sites, including several cortical areas through the thalamus,
the hypothalamus, and midbrain. The major cortical projection is via the
LGN of the thalamus to the primary visual cortex in the occipital lobe
(Kandel et al., 2000; Zigmond, Bloom, Landis, Roberts, Squire, & Wooley,
1999; Bear et al., 2001).
The LGN consists of six layers in primates. The inner two layers, with
their large neurons, form the magnocellular laminae (literally, big-celled
layers); while the remaining four layers, with their smaller neurons, constitute
the parvocellular laminae (small-celled layers). Intercalated between
these principal laminae are the koniocellular neurons (K cells). In their
firing responses, the magnocellular neurons (M cells) are sensitive to
motion especially, while the parvocellular neurons (P cells) are responsive
to color.
Like the LGN, the primary visual cortex in the occipital lobe (again,
known as V1 or Brodmann area 17) is made up of six primary layers in
primates. The LGN mainly projects to layer IV of V1, and to a lesser extent
to layer VI, with the M and P channels having different synaptic targets
within these laminae. There is also a projection from cells in the intralaminar
part of the LGN directly to layers II and III of V1. The layer IV
neurons project on to adjacent neurons in such a way as to form what are
known as orientation-specific columns, ocular-dominance columns, and
blobs. Orientation-specific columns are responsible for the decomposition
of objects of the visual field into short line segments of varying
orientation. Ocular-dominance columns are responsible for the combination of
input from the two eyes so as to perceive the depth associated with an
object and its background. Blobs are responsible for processing wavelength
information, which ultimately contributes to the recognition of various
colors of objects.
The occipital lobe is split into many visual-related areas, each processing
an aspect of an object in the visual field. V1 is responsible for initial visual
processing and can be subdivided into different subregions, each containing
a full representation of the contralateral visual field.
However, after the initial processing in V1, the processing that takes place
in other regions of the occipital lobe is more specialized: V2 is responsible
for stereo vision, V3 for distance, V4 for color, V5 for motion, and V6 for
object position. Van Essen et al. (1992) have recorded more than thirty
primary visual areas in the macaque monkey. Through positron-emission
tomography (PET) and functional magnetic resonance imaging (fMRI)
scans, Zeki, Watson, Weck, Friston, Kennard, & Frackowiak (1991) and
Sereno et al. (1995) have demonstrated that there are multiple visual areas
in humans devoted to specific analysis of the properties of an object in the
visual field.
So far, I have described what Kandel et al. would call the visual primary
sensory area of the visual hierarchy. From this area, another level is added
to the hierarchy as cortical projections are laid out along two visual unimodal
areas, namely, the M cell and P cell pathways. The M cell pathway
is also known as the parietal or dorsal pathway, and it consists of visual
areas laid out along a trajectory from the occipital region, through V1, V2,
V3, V5, and V6, to the parietal region of the brain. Research suggests that
the M cell pathway is responsible for guiding our actions in our visual
environment, since depth, motion, and object position—that is, where an
object is, independent of what the object is—appear to be processed along
its stream. Conversely, the P cell pathway is also known as the temporal,
or ventral, pathway, and it consists of visual areas laid out along a trajectory
from the occipital region, through V1, V2, and V4, to the temporal
region of the brain. Research suggests that the P cell pathway is responsible
for the color and form recognition of an object—that is, what an object is,
independent of where the object is (see Desimone & Ungerleider, 1989;
Ungerleider & Mishkin, 1982; Ungerleider & Haxby, 1994; Goodale et al.,
1994).
Yet another level is added to the visual hierarchy as neurons from
the parietal and temporal visual unimodal areas project to the visual
unimodal association area in the occipitotemporal cortex of the brain. This
area is considered more complex than the unimodal areas because neurons
there are involved in integrating the processed data received from the
parietal and temporal unimodal areas concerning color, motion, depth,
form, distance, and the like. Research has shown that there is a division
of labor concerning an animal’s abilities to distinguish what an object
is from where an object is. However, there are times when an animal
must perform both of these tasks, and given the neuronal projections
from the parietal and temporal areas to this common site in the occipitotemporal
cortex, it makes sense that an animal be able to integrate
visual information about what and where an object is in its visual field
at the same time.
At the highest level of the visual hierarchy, the visual unimodal association
area projects to the multimodal association areas of the prefrontal,
parietotemporal, and limbic cortices. This is the level at which visual information
is integrated with other sensory information, as well as where
motor planning, attention, emotion, language production, and judgment
take place.
There are a number of other parts of the central nervous system (CNS)
that exchange data with the visual system, including the posterior parietal
cortex, the subcortical structures of the hypothalamus, and upper brainstem.
The neurons in the posterior parietal cortex respond to stimuli of
interest and are probably involved in visual fixation and tracking. The
superior colliculus in the midbrain is a multilayered structure wherein the
outer layers are involved in mapping the visual field, the intermediate areas
are involved in saccadic eye movements, and the deeper layers are involved
with more complex sensory integration involving visual, auditory, and
somatosensory stimuli. There is a projection from the optic tract to the
pretectal nuclei of the midbrain that, in turn, projects to the Edinger–
Westphal nucleus. This projection provides the parasympathetic (i.e., autonomic)
input to the pupil of the eye, allowing it to constrict. Also, there
is a direct retinal input to the suprachiasmatic nucleus of the hypothalamus
that is important in the generation and control of circadian
rhythms.
Other extrastriate cortical areas receive projections from the LGN, as well
as the pulvinar region of the thalamus. An area in the inferotemporal (IT)
cortex has been found to respond selectively to faces. This area has been
described as face-selective without implying that when one recognizes a
face, only cells in this area participate in this recognition (Tovee &
Cohen-Tovee, 1993; Tovee, 1998). IT cortex also has been shown to be
involved in storing visual memories, as well as encoding the properties of
objects (Kosslyn & Koenig, 1995). Finally, the visual unimodal and multimodal
association areas are linked up with other areas in the frontal, temporal,
and parietal lobes and the hippocampus so that visual images can
be made, stored, recalled, inspected, and possibly utilized in planning,
judging, feeling, and other complex voluntary activity.
As has been said, the systematic distributions of parts and processes in
the visual system, and related systems, are hierarchically organized. The
multimodal areas build upon information received from the unimodal
areas, and the unimodal areas build upon information received from the
primary sensory area, as is schematized in figure 3.2. It should be clear
from the aforementioned description of the neural wiring and functioning
associated with the mammalian visual system that there must be a massive
coordination and organization of processes in the CNS for a seemingly
simple activity—like recognizing an object—to occur. When I visited him
in his lab, Michael Ariel, a neuroscientist at Saint Louis University in St.
Louis who specializes in the visual systems of turtles, affirmed this point
(see, e.g., Martin, Kogo, Fan, & Ariel, 2003). The realization that the
nervous system is such a grandiose architectonic has caused Gray (1999,
p. 31) to maintain, “The inescapable conclusion is that sensory, cognitive,
and motor processes result from parallel interactions among large populations
of neurons distributed among multiple cortical and subcortical
structures.”
[Figure 3.2. The primary sensory, unimodal, and multimodal areas and pathways
of the visual system. Panel “The Primary Visual Area and Pathways”: Eye, LGN,
V1 in the Occipital Cortex. Panel “The Unimodal Visual Area and Pathways”: Eye,
Occipital Cortex, Where System, What System, Occipitotemporal Cortex.]
Data are exchanged at the various levels of the visual system and CNS
and, because of these exchanges, an animal is able to form a coherent
picture of an object in its visual field. However, a final qualification must
be made about the hierarchical processes of the visual system. We must
draw a distinction between a serial hierarchy and a dynamic, or interactive,
hierarchy. In a serial hierarchy, information flows in a one-way direction
from the lowest level to the highest level of the hierarchy. Conversely, in
an interactive hierarchy, information flows bidirectionally among and
between the lower and higher levels of the hierarchy.
Consider a small, fictitious corporation consisting of a worker, a manager,
and a CEO. The worker is the lowest member of the hierarchy, the manager
is one step above the worker, and the CEO is at the top level of the hierarchy.
The worker communicates two ideas to the manager who, in turn,
communicates these two ideas plus two more of his own ideas to the CEO.
Once the first two ideas are communicated from worker to manager, there
is no further contact between the two people; likewise, once the four ideas
are communicated from manager to CEO, there is no further contact
between those two people. This worker → manager → CEO setup would
be an example of one-way information flow in a serial hierarchy.
[Figure 3.2 (continued). Panel “The Multimodal Visual Area and Pathways”: Eye,
Occipitotemporal Cortex, Prefrontal Cortex, Parietotemporal Cortices, Limbic
Cortex.]
There is only one sense in which the visual system can be considered as
a serial hierarchy; otherwise it is most appropriately envisioned as an
interactive hierarchy. The information flow from retina through LGN to
V1 occurs in a one-way direction, like the flow of information from worker
to manager to CEO in the small corporation. There is no information
feedback from V1 to the retina, just as there is no information feedback
from CEO to worker in our fictitious corporation. This makes sense, since
the inputs of primary sensory areas themselves are passive and automatic
in-takers of information (see Sekuler & Blake, 2002).
Unlike the one-way flow of information between retina and V1, there is
the possibility for a dynamic, interactive, two-way flow of information
between and among the primary visual, unimodal, and multimodal areas
of the visual hierarchy. For example, Kosslyn & Koenig (1995) present evidence
that emotions (present at a higher level in the hierarchy) can affect
the visual system’s performance in terms of visual priming and coded
anticipation of certain visual scenes (present at a lower level in the visual
hierarchy).
Also, in their experiments with monkeys and humans, Sigala & Logothetis
(2002) show that visual categorization determines, in many ways,
what specific features of a face will be focused upon. In effect, the experiments
show that if a face is categorized generally in a certain way as expressive
of some particular emotion, then this categorization will influence
what particular features of a face—for example, basic lines, symmetries, and
the like—the animal subsequently will focus upon. The neural correlates
of visual categorization are found higher up in the visual hierarchy associated
with the occipitotemporal and parietotemporal cortices, while the
neural correlates concerning the processing of particular features of the
face, in terms of lines and symmetries, are found at a more basic spot in
the visual hierarchy associated with the trajectory between the temporal
and occipital cortices of the what system.
Here, we have an instance of a dynamic, interactive flow of information
in the visual hierarchy because, in addition to information regarding a
face’s particular features flowing from the what system to the occipitotemporal
and parietotemporal cortices in facial categorization, there is flow of
information back from these cortices to the what system in terms of this
categorization’s determination of the particular features of a face that are
focused upon by an animal. Referring to the fictitious corporation, this
kind of information flow would be analogous to the CEO’s being in some
form of dialogue with the manager about his or her ideas, such that the
CEO has influence upon the manager’s ideas, and vice versa.
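
The distinction between a serial and an interactive hierarchy can be made concrete with a small sketch in Python. It is offered only as an illustration under stated assumptions, not as a model of actual neural connectivity: the level numbers, the area labels, and the names LEVEL, feedback_edges, and is_serial are all hypothetical and chosen for this example. Each area is assigned a level, each information flow is a directed edge, and a hierarchy counts as serial when no edge runs from a higher level back to a lower one.

```python
# Hypothetical level assignments, loosely following the hierarchy described
# in the text (primary sensory -> unimodal -> unimodal association -> multimodal).
LEVEL = {
    "retina": 0,
    "LGN": 1,
    "V1": 2,
    "unimodal": 3,
    "unimodal_association": 4,
    "multimodal": 5,
}


def feedback_edges(edges):
    """Return the edges that run from a higher level back to a lower level."""
    return [(src, dst) for src, dst in edges if LEVEL[src] > LEVEL[dst]]


def is_serial(edges):
    """A hierarchy is serial when information flows only upward."""
    return not feedback_edges(edges)


# One-way flow, as in worker -> manager -> CEO: retina -> LGN -> V1.
EARLY_PATH = [("retina", "LGN"), ("LGN", "V1")]

# Illustrative two-way flow among the higher areas; the specific feedback
# edges listed here are assumptions made for the sake of the example.
UPPER_PATH = [
    ("V1", "unimodal"),
    ("unimodal", "unimodal_association"),
    ("unimodal_association", "multimodal"),
    ("multimodal", "unimodal_association"),  # feedback
    ("unimodal_association", "unimodal"),    # feedback
]

if __name__ == "__main__":
    print("early path serial:", is_serial(EARLY_PATH))
    print("upper path serial:", is_serial(UPPER_PATH))
    print("feedback edges:", feedback_edges(UPPER_PATH))
```

On this toy representation, the retina-to-V1 path is classified as serial, whereas the higher areas form an interactive hierarchy as soon as a single feedback edge is present, which parallels the CEO entering into dialogue with the manager.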