Object Vision and Spatial Vision
Two different streams for object vision and spatial vision were first inferred by contrasting the effects of inferior temporal and posterior parietal lesions in macaque monkeys (Ungerleider and Mishkin, 1982).
Since this early work, multiple visual cortical areas in both monkeys and humans have been shown to be organised into two functionally and anatomically distinct streams - a ventral stream projecting to the inferior temporal cortex for object perception, and a dorsal stream connected to the parietal cortex for visuospatial control of movement. As an admittedly simplified summary of the clinical and experimental data, one may attribute the processing of "what" to the ventral stream and the processing of "where" and "how" to the dorsal stream.
Neuropsychologists readily accepted the Ungerleider and Mishkin proposal. Indeed, since the beginning of the 20th century, the neuropsychological literature has distinguished syndromes related to occipito-parietal lesions (Balint's syndrome, Holmes syndrome) from those related to occipito-temporal lesions (visual agnosia, prosopagnosia, alexia).
The relative independence of these two postulated streams raises some questions: is object vision possible without any spatial vision? In other words, how can complex shapes or objects be identified without analysing the spatial relationships between their different parts? Functional neuroimaging evidence, in any case, indicates that visual object processing takes place in both the dorsal and the ventral streams (e.g. Kraut, 1997).
Milner and Goodale (1993) have argued that the “What-Where” model cannot account for some new behavioural data. Their neuropsychological investigations have shown that a patient with visual form agnosia remained able to grasp objects of different sizes and orientations precisely, while perceptually the patient could appreciate neither size nor orientation. This patient was able to process “what” to a limited degree, but only when acting. In Goodale and Milner’s proposal (1992), both streams are assumed to use information about objects and their locations, but each stream uses this information in a different way.
According to these authors, the ventral stream carries out transformations concerning the characteristics of objects and their internal spatial relations, thereby allowing the formation of the long-term perceptual representations that are necessary to identify and recognise objects. In contrast, the dorsal stream carries out transformations that use moment-to-moment information about objects and their locations in egocentric frames of reference, thus mediating the visual control of skilled actions (such as pointing, reaching, and grasping).





