That perception and action share abstract representations is a key insight into the organization of intelligence. However, organizing behavior requires additional representations and processes which are not `early' sensing or `late' motion: structures for sequencing actions and arbitrating between behavior subsystems. These systems are described as a supplement to the Theory of Event Coding (TEC).
Hommel et al. have made a substantial contribution to our understanding of the interrelation of perception and action. Considering that well-ordered action is the selective advantage of intelligence, it is clear that perception is fundamentally a part of action. Given our understanding of neural learning, it is also not surprising that as regularities in coding become established, either in perception or motor control, they will become linked directly to any other well-correlated systems that happen to be able to monitor their activity. It is generally held that this explains the separation of the sensory and motor cortical maps in all but the most primitive mammals: separation provides greater opportunity for the intervention of between-process action control (Livesey, 1986).
There is, however, more to action representation and control than what the Theory of Event Coding (TEC) currently provides. Also, this supplementary control is often not explicit or intentional, despite being at a higher level than the TEC representations the authors claim are necessarily intentionally accessed. Perception and action are also sometimes fused even more tightly than feature-based coding implies. In this commentary, I will address two forms of action control beyond those specified in TEC: the sequencing of basic actions, and arbitration between possible higher-level behaviors or goals.
That sequences of actions are themselves a privileged representation has been well supported for some time in the neuroscience literature (Lashley, 1951; Houghton and Hartley, 1995). Many expressed action sequences are too quick and intricate for perceptual feedback to control the transitions between them. Further evidence of privileged representations for behavior patterns comes from experiments in which animals with their forebrains removed are still capable of conducting complex species-typical behaviors, although they are unable to apply these behaviors in appropriate contexts (Carlson, 2000). In particular, the periaqueductal grey matter has been implicated in complex species-typical behaviors such as mating rituals and predatory, defensive and maternal maneuvers (Lonstein and Stern, 1997).
Evidence of brain cells directly involved in processing sequential recall and behavior has also been found by Tanji and Shima (1994), who studied trained monkeys performing sequences of actions either from memory or by following cues. They recorded cells firing in the medial frontal cortices in three different functional contexts: 1) during a particular action, regardless of whether it was recalled or cued, 2) between particular action pairs, only when recalled, but regardless of which sequence they occurred in, and 3) between signal and initial action for a particular complete sequence when it must be performed from recall. These cell types made up 12%, 36%, and 26% of the cells recorded, respectively. The latter two types indicate special coding for types of transitions and full sequences of behavior.
Graziano et al. have also recently shown that the premotor and motor cortices contain single cells representing not only complex motion sequences, but also multi-modal triggering of these sequences. Example behaviors generated by these cells include feeding (making a grasp at a particular location in egocentric space and bringing the hand to an opening mouth), scratching, and ducking as if from blows or projectiles (simultaneously raising an arm and averting the head) (Graziano and Gandhi, 2000; Graziano et al., 2001). The multi-modal perceptual input is often keyed to a particular location relative to the relevant organ, e.g. the head for ducking, or a limb to be scratched, and can be triggered by sights, sounds, or tactile sensations. The action-sequencing cells seem to be organized in the same roughly topographic mapping (with respect to the location of the triggering stimuli) common in perceptual cortices from V1 through face orientation cells in the temporal cortex (Perrett et al., 1992).
Even if Graziano's cells reference common feature mapping for sensing and action in the way that TEC implies (which I doubt), it seems unlikely that any intentionality is intervening between stages of this process. By this I do not mean to say that TEC-like representations do not exist, only that perception and action are probably unified in a number of different ways, of which TEC is but one.
Another is the level of arbitrating between behaviors: the problem mentioned earlier of choosing appropriate contexts in which to act, or behavior processes to attend to. Arbitration is necessary in a parallel, distributed model of intelligence because an agent has finite resources (e.g. hands and eyes) which the different possible behaviors (e.g. feeding and mating) must share. Arbitration must take into account both the activation level of the various `input' cortical channels and previous experience in the current or related action-selection contexts.
Recently, the basal ganglia (BG) have been proposed as the system responsible for this aspect of action coordination (Prescott et al., forthcoming; Redgrave et al., 1999; Gurney et al., 1998; Mink, 1996). The BG are a group of functionally related structures in the forebrain, diencephalon and midbrain. Their main `output' centers -- parts of the substantia nigra, ventral tegmental area, and pallidum -- send inhibitory signals to neural centers throughout the brain which either directly or indirectly control movement, as well as other cognitive and sensory systems (Middleton and Strick, 2000), which is fitting for an arbitration system. Their `input' comes through the striatum from subsystems in both the brainstem and the forebrain, giving them access to both automatic and `intentional' cues for redirecting attention.
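The arbitration problem described above can be illustrated with a minimal sketch. It is not a model of the basal ganglia or of any published system. It only shows behaviors competing for shared resources, ranked by a score that combines current activation with a bias standing in for previous experience. The names `Behavior`, `learned_bias` and `arbitrate`, the greedy selection rule, and all numeric values are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    """A candidate behavior competing for shared effectors (hypothetical)."""
    name: str
    resources: frozenset        # finite resources needed, e.g. {"hands", "eyes"}
    activation: float           # current drive from the 'input' channels
    learned_bias: float = 0.0   # assumed summary of past experience in this context


def arbitrate(behaviors):
    """Greedy arbitration: grant resources to the highest-scoring behaviors first.

    A behavior is selected only if none of the resources it needs has already
    been claimed by a higher-scoring behavior.
    """
    granted = []
    in_use = set()
    ranked = sorted(behaviors,
                    key=lambda b: b.activation + b.learned_bias,
                    reverse=True)
    for b in ranked:
        if in_use.isdisjoint(b.resources):
            granted.append(b.name)
            in_use |= b.resources
    return granted


# Illustrative values only: feeding and mating both need the hands.
candidates = [
    Behavior("feeding", frozenset({"hands", "mouth"}), activation=0.6, learned_bias=0.1),
    Behavior("mating", frozenset({"hands", "eyes"}), activation=0.5, learned_bias=0.3),
    Behavior("vigilance", frozenset({"ears"}), activation=0.2),
]
selected = arbitrate(candidates)
print(selected)
```

In this toy setting feeding has the higher raw activation, but the experience term tips the contest for the hands towards mating, while vigilance proceeds in parallel because it shares no resource with the winner.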
Finally, although my commentary has focussed on evidence from neuroscience, my understanding of the roles of sequence, arbitration, and sense/action coupling derives from my own experience with artificial intelligence. For 15 years, behavior-based AI (BBAI) has built intelligence by explicitly combining perception and action into uniform representations, producing systems capable of complex, animal-like behavior. Although the original BBAI systems were fully distributed (Brooks, 1986), difficulty in scaling the complexity of such systems has led the field towards incorporating control structures such as sequence and arbitration for coordinating the actions of the behaviors (Kortenkamp et al., 1998; Bryson and Stein, 2001). I encourage readers to explore this literature for a further understanding of intelligent control.
Thanks to Will Lowe for his comments on an earlier draft.