The dorsal pathway


From the superior temporal sulcus we move up to the beginning of the dorsal pathway at the boundary of the temporal and parietal lobes near the Sylvian fissure. The dorsal pathway maps auditory sensory representations onto articulatory motor representations.

Hickok & Poeppel’s basic argument for the need for the dorsal pathway is that …

Learning to speak is essentially a motor learning task. The primary input to this is sensory, speech in particular. So, there must be a neural mechanism that both codes and maintains instances of speech sounds, and can use these sensory traces to guide the tuning of speech gestures so that the sounds are accurately reproduced.

They point to the fMRI studies of [BHH01] and [HBHM03], both of which were reviewed in [HB03]. In these experiments, subjects were asked to listen to pseudo-words and then repeat them sub-vocally, that is, in their imagination as opposed to out loud. The analysis combined the regions that were active for both the sensory task of listening to the stimuli and the motor task of repeating them sub-vocally into the image below:


Fig. 96 Bilateral activation in the superior temporal sulcus (STS), along with unilateral activation of area Spt and frontal regions. [1]

Will the real dorsal pathway please stand up?

In the dual pathway model of [HP07], the dorsal stream is diagrammed as follows:


Fig. 97 Hickok & Poeppel’s dual pathway model. [2]

I have redrawn the diagram to make it easier to follow the flow, and in particular its input from the ear and its output to the voice, which are themselves connected to the extent that you hear your own voice as you speak:


Fig. 98 Hickok & Poeppel’s dual pathway model, with auditory feedback. [3]

Yet this diagram still does not satisfy me, because the “articulatory network” contains three cortical areas – pIFG, PM, and anterior insula – all of which have important yet distinct roles to play. In the next diagram I have split the articulatory network box into two, on the basis of some data that are discussed in the next section:


Fig. 99 An expanded dorsal stream. Author’s diagram.


Fig. 100 Repetition of the question “What is your name?” in my extended version of Hickok & Poeppel’s dual pathway model. Author’s diagram.

State feedback control

There is a problem with the execution of motor commands that [Hic12] introduces by way of driving a car.

The analogy to driving

Imagine driving a car on a racetrack while only looking in the rear-view mirror. From this perspective, it is possible to determine whether the car is on the track and pointed roughly in the right direction. It is also possible to successfully negotiate the track under one of two conditions: the track is perfectly straight or you drive extremely slowly, inching forward, checking the car’s position, making a correction, and inching forward again. It might be possible to learn to negotiate the track more quickly after considerable practice; that is, by learning to predict when to turn on the basis of landmarks that you can see in the mirror. However, you will never win a race against someone who can look out of the front window, and an unexpected event such as an obstacle in the road ahead could prove catastrophic. The reason for these outcomes is obvious: the rear-view mirror can only provide direct information about where you have been, not where you are or what is coming in the future.

The brain faces a similar problem

[Hic12] continues by pointing out that cerebral motor control presents the nervous system with precisely the same problem:

As we reach for a cup, we receive visual and somatosensory feedback. However, as a result of neural transmission and processing delays, which can be significant, by the time the brain can determine the position of the arm based on sensory feedback, it is no longer at that position. This discrepancy between the actual and directly perceived state of the arm is not much of an issue if the movement is highly practiced and is on target. If a correction to a movement is needed, however, the nervous system has a problem because the required correctional forces are dependent on the position of the limb at the time of the arrival of the correction signal — that is, in the future. Sensory feedback alone cannot support such a correction efficiently. As with the car analogy, one way to get around this problem is to execute only very slow, piecemeal movements.
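To make the delay problem concrete, here is a minimal sketch of a reach guided only by stale sensory feedback. It is not taken from [Hic12]: the one-dimensional "arm", the function name reach, and the parameters gain and delay are all assumptions chosen for illustration. The controller moves the arm by a fraction (gain) of the distance it believes remains, but that belief is based on where the arm was delay steps ago.

```python
def reach(gain, delay, steps=300, target=1.0):
    """Move a 1-D 'arm' toward target using feedback that is `delay` steps old.

    Hypothetical toy model: each step the arm moves by
    gain * (target - position seen `delay` steps ago).
    Returns the list of true arm positions over time.
    """
    positions = [0.0]  # true arm position, starting at rest
    for t in range(steps):
        # The controller only sees where the arm was `delay` steps ago.
        seen = positions[max(0, t - delay)]
        positions.append(positions[-1] + gain * (target - seen))
    return positions


# Slow, piecemeal movement: small corrections tolerate the stale feedback.
slow = reach(gain=0.05, delay=3)
# Fast movement with the same delay: corrections are based on outdated positions.
fast = reach(gain=0.9, delay=3)

print("slow reach overshoots target:", max(slow) > 1.0)
print("fast reach overshoots target:", max(fast) > 1.0)
```

With these assumed settings the slow reach creeps up to the target without passing it, whereas the fast reach shoots past the target because it keeps correcting for a position the arm has already left. This is the rear-view-mirror problem in miniature.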

The solution from robotic and industrial control

The solution rather obviously involves looking out of the ‘front window’. In robotic and industrial control, the ‘front window’ is called a forward model that predicts the current and future states of effectors. The general form of a system that combines feedback monitoring with forward predictions is set forth in Fig. 101.


Fig. 101 State feedback control, color-coded in terms of the dual pathway model. [4]

The idea is that the controller issues a command to an effector, which the effector executes and thus undergoes a change in its state. This change in state is perceived and measured by the sensory system and ultimately arrives at the state estimator. Backing up, at the same time that the controller issued its command to the effector, it copied the command to the forward model, which predicts the outcome of the command and sends the prediction on to the state estimator. The state estimator now has two pieces of information at its disposal: a prediction of what the effector should be doing, and a measurement of what the effector actually is doing. The two can be compared to generate an error signal that is relayed back to the controller. If the two are in agreement, then the error is zero, and the controller can continue with what it is doing. If they are in disagreement, then a non-zero error is reported to the controller, which presumably helps it to change course. If this process works fast enough, the controller can be corrected nearly in real time.
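The loop just described can be sketched in a few lines of Python. This is only a toy under stated assumptions, not Hickok's model: the effector is a one-dimensional position, the forward model simply adds each copied command to its running prediction, the state estimator applies the full error as a correction, and an unexpected "bump" perturbs the effector partway through. The names reach_with_forward_model, bump_at and bump are hypothetical.

```python
from collections import deque


def reach_with_forward_model(gain=0.9, delay=3, steps=60, target=1.0,
                             bump_at=20, bump=0.3):
    """Toy state feedback control loop for a 1-D effector (illustrative only).

    Returns (positions, errors): the true effector positions and the error
    signal computed by the state estimator on each step.
    """
    true_pos = 0.0    # actual effector state
    predicted = 0.0   # forward model's estimate of the current state
    # True states still in transit through the sensory system, and the
    # predictions made for those same moments (oldest first).
    in_transit = deque([0.0] * (delay + 1), maxlen=delay + 1)
    past_predictions = deque([0.0] * (delay + 1), maxlen=delay + 1)
    positions, errors = [true_pos], []

    for t in range(steps):
        # Sensory system: the measurement describes the state `delay` steps ago.
        measured = in_transit[0]
        # State estimator: compare it with the prediction for that same moment.
        error = measured - past_predictions[0]
        errors.append(error)
        # A non-zero error shifts the current and the stored predictions.
        predicted += error
        past_predictions = deque((p + error for p in past_predictions),
                                 maxlen=delay + 1)
        # Controller: steer using the predicted current state, not the stale one.
        command = gain * (target - predicted)
        # Effector executes the command; an unexpected bump may perturb it.
        true_pos += command + (bump if t == bump_at else 0.0)
        # Forward model receives a copy of the command and updates its prediction.
        predicted += command
        in_transit.append(true_pos)
        past_predictions.append(predicted)
        positions.append(true_pos)
    return positions, errors


positions, errors = reach_with_forward_model()
print(f"final position: {positions[-1]:.4f}")
print(f"largest error signal: {max(errors, key=abs):.4f}")
```

As long as the effector does what the forward model predicts, the error stays at zero and the controller can move quickly despite the sensory delay. The bump shows up as a single non-zero error once the delayed measurement arrives, after which the controller steers back to the target.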

Hickok casts this system into the dual pathway model as in Fig. 102:


Fig. 102 State feedback control, color-coded in terms of the dual pathway model. [5]

I see several problems with this revision. The first is that the generator of the error signal has disappeared, apparently subsumed into each of the modules of the internal model. Does this mean that each one generates an error? While not impossible, there is a long-standing hypothesis that only the cerebellum is capable of calculating an error, not ordinary cerebral cortex [REF?].

The second problem arises when trying to match the blocks in this model to the blocks in the larger dual pathway model. I postulate the following mapping:

  1. Auditory phonological system ⇔ STG + STS
  2. Auditory-motor translation ⇔ Spt
  3. Motor phonological system ⇔ PM (premotor cortex), the top part of the articulatory network
  4. Controller ⇔ pIFG (posterior inferior frontal gyrus), the bottom part of the articulatory network

However, the controller/pIFG does not connect directly to the vocal tract; it is mediated by motor cortex. In fact, there is no arrow at all between the pIFG and PM in the left hemisphere of Fig. 97. Thus the motor phonological system must be both inside the internal model and outside of it, a contradiction.

Hickok (2012) on somato feedback control

For the sake of completeness, I include the full model of [Hic12]:


Fig. 103 Hierarchical state feedback control. [6]

Speech production


Phonological working memory



One of the issues investigated by neuroimaging concerns the relationships between phonological and semantic processing in the left frontal lobe. As a matter of fact, lesion studies have not clearly resolved whether the analysis of language sounds and the processing of language meaning are segregated or not in the left F3. Recent investigations on this topic have produced contradictory results. In this line, an intriguing finding of this meta-analysis is the specific involvement of the dorsal part of the pars triangularis of the inferior frontal gyrus (F3td) in phonology, whereas this area was considered, until recently, to be a semantic area (Poldrack et al., 1999). As a matter of fact, in a landmark study, Poldrack et al. (1999) carried out a meta-analysis of activation peaks, comparing tasks that called for either semantic or phonological processes. The results led these authors to propose a segregation of F3 into two functional areas: the posterior and dorsal part (pars opercularis, F3op), involved in phonological processing, and the anterior and ventral part (pars triangularis F3t and orbitaris F3orb), involved in semantic processing.

There is no doubt, however, that the dorsal cluster of the F3t (F3td) mainly contains peaks that have higher activity during phonological processing than during semantic processing; for example, counting the number of syllables in a word versus abstract/concrete categorization (Poldrack et al., 1999), pseudo-word repetition versus verb generation (Warburton et al., 1996), word articulation versus word reading (McGuire et al., 1996), non-word versus word reading (Paulesu et al., 2000), reading consonant strings versus reading words (Jessen et al., 1999), and phonetic discrimination versus word listening (Zatorre et al., 1996). Unlike the Rolandic and precentral clusters, this area does not include activation peaks that are related to tongue or mouth movement. Instead, it exhibits a high proportion of peaks that are related to explicit working-memory tasks (Bunge et al., 2001, Cohen et al., 1997, Hautzel et al., 2002, Jonides et al., 1998 and Rypma et al., 1999) during which subjects are required to keep in mind lists of letters or numbers through a short delay. Such tasks require the subject to mentally rehearse the list during the delay in what was defined by Baddeley as the phonological loop (Baddeley, 1992). Furthermore, the more anterior location of the frontal component of working-memory phonological processing, compared to auditory–motor language sound representation, is coherent with the postero-anterior frontal lobe hierarchical organization from motor to executive functions (Fuster, 1998).

Assigning a role in phonological working memory to F3td would be consistent with reports of its recruitment during tasks that rely heavily on this process, such as counting the syllables of a pseudo-word (Poldrack et al., 1999), repetition of a word (Price et al., 1996c) or pseudo-word (Warburton et al., 1996), or syllable identification in the presence of a low signal-to-noise ratio (Sekiyama et al., 2003). Moreover, five contrasts involving phonological working-memory tasks resulted in co-activation of peaks located in both F3td and supramarginalis gyrus (SMG) (Hautzel et al., 2002, Jonides et al., 1998 and Rypma et al., 1999). Our meta-analysis confirms that the SMG is activated by working-memory tasks but not by rhyming tasks. It might therefore be considered as the phonological store area—part of the phonological loop postulated by Baddeley (1992) and initially demonstrated with functional imaging by Paulesu et al. (1993). Additional support for this model has been provided by Cohen, who showed a load effect (an increase in activity correlated with the amount of material to keep in mind) on F3td and SMG co-activation during working-memory tasks based on letters (Cohen et al., 1997). Both regions, connected by both the arcuate fasciculus (Catani et al., 2005) and short connections (Duffau et al., 2003), constitute the neural basis of a perception–action cycle (Fuster, 1998 and Fuster, 2003) for phonological working memory (Fig. 5).


Double dissociation

[TG12] review the evidence for dividing verbal working memory in humans into two neural systems [7–9]:

  1. A mainly left-hemispheric premotor-parietal system including Broca’s area, the left lateral premotor cortex and intraparietal cortex as well as the right cerebellum is involved in articulatory rehearsal, i.e. ‘inner speech’ [7, 10–12],
  2. whereas a bilateral prefrontoparietal system comprising the cortex along anterior parts of the intermediate frontal sulcus, the inferior parietal lobules and the anterior cingulate cortex has repeatedly been demonstrated to subserve a non-articulatory mechanism for maintaining phonological information, which corresponds to the concept of the ‘inner ear’ [7, 9, 13].

Powerpoint and podcast

  • none

The next topic

The next topic is The sensorimotor interface.

End notes

[1]From [HBHM03].
[2]Fig. 1 from [HP07].
[3]Redrawn from Fig. 1 from [HP07].
[4]Redrawn from Fig. 1 of [Hic12].
[5]Fig. 3 of [Hic12].
[6]Fig. 4 of [Hic12].

Last edited Aug 26, 2019