Stephen J. Cowley



From bodily co-regulation to language and thinking
Stephen J. Cowley, School of Psychology, University of Hertfordshire

[Uncorrected pre-publication draft of paper forthcoming in the journal, Linguistics and the Human Sciences.]

Like a digital code, language can be analysed in terms of forms and functions. In many respects, however, it contrasts with Morse and programming languages. Above all, human language influences the subject's experience. Recent books by Michael Tomasello and Derek Melser suggest that this happens because language derives from co-regulation. In scrutinising how this is possible, I contrast their mentalist and anti-mentalist approaches. While Melser traces language to a history of co-action, Tomasello posits an inner competence based on how children encode/decode intentions.

Melser's view of co-regulation provides an actional model for the rise of language. This is a parsimonious alternative to intention reading. It is, however, too general to illuminate how children master early words or Tomasello's concrete constructions. Children need, above all, to hear utterances as utterances of words. Without this skill, they cannot act as if they saw and read intentions. It is, moreover, needed for the verbal skills of thinking. Were the approaches synthesized, human powers might connect conventional action with affect-based co-regulation. For, Melser argues, thinking is closely bound up with self-education. Given the rise of language, new forms of action serve to shape future experience. Melser's is a novel sketch of how thinking emerges from action.

How can we fill the gaps? Both Melser and Tomasello treat real-time dynamics as an accompaniment to language. One separates action from biomechanics and the other treats language as conventional. Love's (2007) view of first-order language offers a way of uniting the strengths of the models. In dialogue, the dynamics of co-regulated expression serve to construct modes of co-action (including speech). As meanings emerge, babies gain from attending to the sound of second-order patterns. The child's world becomes one that features audible words. Given neural schema, this phenomenological change permits new modes of acting, speaking and understanding. Although based in bodily dynamics and hearing, learning to talk enables the baby to become a person. A history of co-regulated activity allows her to take responsibility for what she says and thinks.


Mentalism today
After the rise of generativism, those interested in the psychology of language focused on representations. Like Chomsky (1965), many thought that mentalism's rival — behaviourism — had been entirely discredited. The linguist's goal ceased to be that of describing languages and, instead, became that of explaining how forms were represented in the mind.(i) For the many who thought that no learner could overcome the poverty of the stimulus, formalist approaches were the only game in town. It became popular to think that innate structures both get language started and permit grammatical complexity. For the generativist, intuitive knowledge of linguistic facts was the basis for developing grammatical models. Since behaviourists were said to deny any reality to 'intuitions', common sense was alleged to sustain mentalism.

To talk was to have acquired (or learned) a grammar. Explanatory theories set out to model formal systems which divide language into codes. Computer programs were thus introduced in modelling both sentence generation and, oddly, a putative process of language acquisition. Debate focused on whether grammars were learned or, as formalists suggested, grew in the mind. More recently, exclusive focus on mental representations has been challenged. Many reject input-output models (Clark, 1997; Hurley, 1998; Wheeler, 2005) and, instead, invoke leaky minds.(ii) During many cognitive tasks, from carrying out long division to flying planes, we rely on how bodies connect neural states with the world beyond the skin. Connectionism, neuroscience and robotics all provide ways of investigating how intelligence leaks as bodies adjust to external events and patterns. As Vogt (2002) demonstrates, body-world co-regulation is necessary to the physical grounding of linguistic signs and symbols. Embodied models trace language to social and historical events. Neither cognition nor communication reduces to digital coding (Love, 2004; 2007). Ontologically, language differs from, say, Morse. Far from resembling machine operators, we use vague meanings and indeterminate forms. Circumstances fill language with connotations (Kravchenko, 2007). Unlike a digital code, language admits no reversibility (Cowley, 2007). Co-regulation can replace an operator because understanding connects experience with real-time. Since generative processes cannot mimic body-world relations, conversations draw on body-based processes. They produce biosemiotic effects that probably draw on organic coding (Barbieri, 2007; Cowley, 2007). As Harnad (2005) stresses, language exploits the feeling of thinking.

Contrasts and parallels
Language is inseparable from bodily co-ordination. How, then, do we come to talk and think? Such questions are addressed in two books that illuminate how language influences child development. Both Michael Tomasello's Constructing a Language (CaL) and Derek Melser's The Act of Thinking (AoT) reject digital codes. Instead of positing that we learn or acquire 'forms', they propose that babies use words-in-interaction to become talking persons. They tap into the power of language by using action and perception. Like Halliday (1983), they deny that language is best viewed as a kind of commodity. Rather, learning to regulate the behaviour of others is 'one of the essential steps in the developmental process' (1983: 3). However, in contrast to the systemic-functional tradition, neither focuses on the transmission of the social order to the child. Rather, semiosis serves the child in gaining increasing control over her doings. Reversing the focus on construing experience(iii), they stress how agents come to mean and think. Here they contrast. While Tomasello is a mentalist who stresses language-relevant knowledge (conventions used to construe particulars), Melser traces language to learning how to behave. Using this contrast, I scrutinize how word-based patterns are integrated with the causal processes that co-constitute the mental. In pursuing how language makes us fully human, I use the books critically and, where possible, constructively.

Both authors trace language to co-regulation. Nonetheless, the books come from different disciplines, draw on different experiences, highlight different topics, and propose contrasting theories. Indeed, while Michael Tomasello is well known in cognitive science, Derek Melser is a philosopher who recently completed his doctorate. While AoT presents a theory about actions that lie beyond the reach of science, CaL relies on experiments on how co-ordination contributes to learning. Conceptually, they differ too: while CaL treats thinking as the output of intention-processing, AoT regards it as voluntary action by a living being. To bring out the contrast, I highlight how they conceptualise the mental. Finding strengths in both views, I emphasise the importance of real-time events. By synthesizing these with the models, it becomes possible to investigate minds without positing linguistic codes.

In the 1960s, Chomsky began arguments with claims like, “Obviously, everyone has acquired a grammar” (1965: 8). Today this seems odd. First, brains are unlikely to represent what grammarians describe. Second, forms arise from grammaticalization (Hopper & Traugott, 1993) and, in parallel, cultural selection (Lyon et al., 2007). Third, sociolinguistics shows the diversity of language. Linguistic formalism is thus on the defensive. Increasing numbers view word-forms as patterns or constraints that, during social life, influence learning. If mind is not symbolic, new questions arise. How is language embodied? Do we need to contrast the natural, mental and cultural? Does minded activity extend beyond the skin? While taking different views, both Tomasello and Melser trace mental events, and language, to the public domain. Heard (and seen) movements prompt construals. In some sense, minds leak. To contrast the theories, I begin with why formal models fail to connect with the world. Given the linking problem (Tomasello, 2005), how can babies construe anything at all? Then, heeding critiques of behaviourism, I compare CaL and AoT in relation to, first, grammatical complexity and, second, silent thinking. Finally, I suggest how attention to real-time events can be used in synthesizing strengths of the models.

Beyond commonsense
While concurring that language is integral to behaviour, Melser and Tomasello differ in their views of human agency. While CaL invokes individual minds, AoT appeals to the subjects of the human world. Contra Tomasello's leaky intentions and conventional signs, Melser treats 'mind' as purely metaphorical. The sounds and movements that people make can be divided, he posits, into natural processes and ones that constitute action. Given this dichotomy, intelligent behaviour is separated from inner causes. Language and gesture, as part of action, depend on human skills. We need no inner intentions. Rather, human actions arise from our empathetic ability or what Melser calls a history of concerting. Since concerting grounds talk, language has a constitutive role in action. By definition, this is “something a person actually does” (3). Even in silent thinking, we rely, not on linguistic forms, but on rehearsing events. We act or carry out what Melser calls tokening. This is learned by discovering, first, how to do things with others and, later, how to act alone. In managing a spoon, for example, we token by opening our mouths and by ceasing to blow just before food enters the oral cavity. Gradually, sequences of behaviour emerge. Events based on concerting come under solo control. Babies learn to bring the spoon to their mouths in ways that are appropriate to its contents. As they learn from circumstances, actions become abbreviated. For example, infants will learn to adjust holding to counterbalance the texture of the food. For Melser, learning to think is no different. In becoming thinkers, concerted action gradually gives way to using circumstances in doing things on our own. Even thinking has no need of inner 'mind'. Silent rehearsal of thoughts is accompanied by signs of thinking. To say that mind leaks is, for Melser, metaphorical.

Mentalist views trace cognitive capacities to an inner rationality. In Constructing a Language, some kind of X-ray perception permits learning to talk. For Tomasello, therefore, minds literally leak. A baby's inherent rationality permits her to 'recognise' what others intend. Later, given role reversal, she will leak her own intentions. Gradually, mind-leaks prompt her to learn the conventional sounds and movements of her culture as the brain constructs networks of functional symbols. Thus, while sustaining words and gestures, concrete constructions depend on brains that generalize about intentional acts. The baby need not identify forms because statistics reveal behavioural conventions. A capacity for intention-reading enables a human child to cross Deacon's (1997) symbolic threshold. Leaky minds reveal cultural usage-patterns which, for Tomasello, are identical to languages.(iv)

The Act of Thinking falls into three parts. In the first (Chapters 1-2), Melser traces behaviourist and other precursors. Playing down Skinner, he links his work to, among others, Hampshire, Vygotsky, de Laguna and Mead. Given my focus on learning to talk, I emphasise the part of the book where (Chapters 3-7), using Vygotsky, he sketches how we come to produce utterances and, crucially, think. Finally (Chapters 8-11), he sketches why 'mind' has mesmerized many and, pushing the point, makes a full frontal attack on the view that physiological and causal models can explain actions. By opposing what he calls action physicalism, Melser opens up space for his thesis: thinking is a form of action based in empathetic concerting.

Aiming at clarity, Melser presents a general picture. Consciousness, language and thinking derive from how babies act together with other persons. Using concerting, they develop experience of co-ordinating with others. Gradually, a dyad develops forms of co-action and, later, the baby learns to act on his or her own. First, we are fed; later, we use instruments in eating. First, we participate in talk; later, language sustains thinking. In both cases, the end result is mastery of action that, by definition, can be fully described. Given this self-awareness, acts of thinking are voluntary, learned (through exhibitions) and available for moral evaluation. While distinct from consciousness, acts of thinking originate in how adults 'display back' awareness of infant doings. The baby relies on behaviours that are “constitutive of actions” such as “adopting specific facial expressions, making eye movements as if inspecting things, tensing specific muscles” (8). By learning these, concerting prompts the baby to act like a person. Later, by coming to refrain from acting, she will act alone, become conscious, talk and, eventually, think. Increasingly, the baby will use verbal, imagistic and other expression covertly. Language arises as we become conscious in a world where cultural fictions are associated with verbal expression. Reality consists in conventions and practices that use cultural objects and events. There is a sense in which “we invented the world”: what we live in action depends, to a large extent, on social construction. Although, in Western cultures, we ascribe intentions to action, this reflects on neither natural rationality nor intention-reading. The brain, Melser thinks, is transformed by culture. AoT thus rejects both internalization and what CaL terms intramental states. Far from relying on mind (or brain), reduced support prompts the infant to new initiatives. Although solo action (e.g. eating with a spoon) arises in concert, caregiver contributions, like infant efforts, become increasingly attenuated. As children manipulate objects, they invent intelligible ways of acting by learning from co-ordinated events. Gradually, children learn to use stylized movements to request food. As they do so, complete actions become abbreviated. We make increasing use of overt gestures, little mimes, facial expressions, and speech dynamics. In development, tokening to self-ready becomes increasingly covert. This is clear in the case of language. While based in overt vocalization, thinking draws on actions (including speech) that are rehearsed without display. Speaking is covert when we (silently) plan, ask and wonder. Even so, acts of thinking leave a public trace. While we can see what Le Penseur is doing, we do not necessarily know what he thinks. Of our own 'minds' we do not know why we come up with questions, or thoughts. To capture this one can adopt Vygotsky's dictum that dialogical skills go underground. Attending to what we think (mental spillage) gives some control over verbal aspects of language. Circumstances and conventions enable us to manage our tokening. We integrate gestures and sound patterns with action and, as a result, influence others. Linguistic (and other) 'facts' enter a child's world through acts of thinking.

Constructing a Language is cautious, strategically written and designed to develop a theory whose findings can be replicated. While denying that brains represent 'grammar', Tomasello posits that innate structures are needed if a child is to develop the intention-reading skills (allegedly) needed to construct a language. Social reality depends, he thinks, on representations or processing intentional output. In CaL, this picture shapes a socio-pragmatic view of language acquisition. Using cognitive linguistics and usage-grammar, Tomasello argues that a single process enables children to learn words, phrases, and the patterns of discourse. In presenting how we master usage-patterns, CaL provides a mentalist alternative to nativism. Tomasello thus both reassesses other theories and gives evidence in support of his model. First, the approach has the merit of showing, clearly, that empirical work can be used to discover how children learn from and exploit visible and audible expression. Second, though not emphasized here, CaL also presents a synthesis of old and new information about how and when, on average, children are likely to exploit linguistic functions (mainly in English).

Placed against Melser's work, Tomasello's view is conservative. De-emphasising the human subject, he invokes a mind that processes intentions. Learning to talk, on this view, depends on a competency for intention-reading that is species specific. As this develops, the functions of linguistic symbols are mapped onto usage-patterns. We link proposition-like abstracta, it is suggested, with 'non-natural meaning'.(v) Given intentions, the brain's schema (see below) perform functional analysis of usage-patterns. Talk is the conventional and rational use of such processes. Given inherent reason, language, consciousness and thinking all emerge at the end of the first year. Babies come to see intentions (perhaps as images) and, soon after, recognize those of others. At 9 months, they become “intentional agents like the self” (2003: 21). Given abilities for simulation, social events prompt them to understand what is expected of persons. Displaying inherent rationality, intention-reading allows the rise of vocal imitation. For Tomasello, primate brains help us learn to talk.

Acts of utterance, Tomasello believes, give intention-readers a grasp of concrete constructions. Using 'construction' with systematic ambiguity, he invokes both entities described by cognitive linguists and what he calls neural schema. Did brains not construct such representations, Tomasello thinks, we could neither re-identify intentions (e.g. requests for silence) nor, presumably, recognize utterance-types. In contrast to formalist models, schema are not isomorphic to verbal patterns. Babies do not literally internalize structures but, rather, construct functional models of sound and gesture. In Tomasello's terms, imitation gives them the neural schema needed to control concrete constructions. As these encode intentions (not grammar), role reversal enables children to map their mental states on to those of others. True imitation thus allows them to grasp and produce talk without mentally representing propositions. As noted, minds literally leak. An inherent capacity for intention-reading sensitizes children to symbols. Brains use patterned patterns (sic) to manage a “flow of information” (244). Children eventually master abstract constructions. They negotiate situations which depend on complex shifts in style and register. In so doing, like babies, they rely on intention-reading. Even rhetorical skills depend on role reversal, imitation and pattern recognition. Anticipating complaints that this is vague, CaL documents statistical change in grammatical and discursive patterns.(vi) An adult's brain is like a baby's in that recurrent exposure hones its linguistic powers. While representations link intentions to imitations and social skills, adults can also invoke intentions to explain what they do.(vii) Finally, Tomasello sketches how discursive patterns arise, reiterates his anti-formalism, and links the theory to culture, evolution and cognition.

The importance of behaviour
Given their views of human agency, the theories are incommensurable. No comparison can be made between a model of how children learn to think and a theory of how brains develop the schema allegedly required for language (that is, what is described by a school of grammarians). To compare the views, therefore, I consider how babies, and interactions, change over time. Specifically, I contrast AoT's action view with CaL's functional-style mentalism by considering how changes in the child's behaviour might be related to the use of the verbal patterns of a culture.(viii)

In development, learning to talk transforms the baby's world. This is important because, as Melser and Tomasello agree, language impacts on agency. It is therefore appropriate to begin with co-regulation. By avoiding appeal to 'subjective' or 'objective' features of events, we can avoid what Davis (2001) calls the dire consequences of conflating nonverbal and verbal aspects of expression. Instead of positing that babies depend on either words or meanings, both concur that local ways of speaking are grounded in nonverbal interaction (including vocalization). They agree that adult reactions prompt a child to discover the aspect of events that we call 'words'.(ix) Since differences concern mechanisms, I turn to how bodily co-regulation can prompt a child to discover utterances in their verbal aspect. This enables me to contrast how Tomasello and Melser view the interactional history that gives an infant power over vocalizations (that permit formal and functional analysis).

Grounding and the linking problem
Where formal patterns are used for explanatory purposes, these must be meaningful to the person or system whose activities they are said to explain. In computing, the difficulty of making symbols meaningful to a device is called the symbol grounding problem (Harnad, 1990; Cangelosi et al., 2002; Taddeo & Floridi, 2005; Belpaeme & Cowley, 2007). Naturally enough, formal linguistic models face the same issue. Until forms are connected with the world perceived, they lack function. It is thus surprising that, in American descriptivism (Matthews, 1993), formal strings define languages.(x) In mentalist theories, symbol processing occurs at what is called a syntax/semantics interface.(xi) On this picture, as Tomasello (2005) shows, formalists fail to show how linguistic structures connect with the world. Indeed, by invoking a linking problem, he commits himself to viewing mind-leaks literally. Learning to talk demands a neural device that 'reads' intentions into leaked public symbols.

Tomasello posits that a brain matches sound patterns to socially exhibited or leaked intentions. These enable a baby to develop a suite of skills based on intention-reading or unconscious inferences. Her brain works out how to follow gaze, point, and make simple utterances. Gradually, strata of functional knowledge accumulate. Given intention-reading, phonetic patterns need only prompt babies to appropriate intention-based expression. Shared attention is, he thinks, sufficient to reveal 'symbols' or sound-patterns. Retaining mentalism, Tomasello replaces formal models with descriptions of language function. Rejecting content-free forms, he posits that, given exposure to symbols, brains encode and decode by using intentions. We rely not on forms but on how conventions spread. As a result, one-year-olds follow points or grasp (some) context-bound words. Given sensitivity to intentional expression, babies discover conventions. Do brains (or neural schema) identify intended aspects of context? Unfortunately, no account is given of how, say, gaze and pointing contribute to intention-recognition. We do not learn how our skills draw on the alleged capacity. Rather than defend his literal view of leaky minds, Tomasello's rhetoric sets up a contrast with activities typical of this age: crawling, attitude-based decision-making, and using instruments. Communication by looks and points is unlike practical activity. Thus cultural learning and role reversal imitation explain only intentional communication.(xii) In ignoring action, CaL gives no attention to real-time or, indeed, infant spontaneity. We are intention-driven creatures who fulfill the expectations of conformist cultures.

Although concurring that language is based in social events, Melser's theory is simpler. Above all, he allows human agency to change in developmental time. Eschewing intrinsic intentions, he holds that babies are changed by experience of co-regulated events. Using an impulse for concerting,(xiii) infants sense what is going on. Control of actions uses human empathy. While brains render feeling possible, there are no inner intentions. Rather, in this tradition, co-action grounds language (Trevarthen, 1979; 1998; Fogel, 1993; Thibault, 2000; Cowley et al., 2004; Spurrett & Cowley, 2004; Cowley, 2007; Gallagher, 2005). Before pursuing this, let us ask why Tomasello posits a linking problem rather than seeking out the physical grounding of neural schema.

When syntax is opposed to semantics (e.g. Chomsky, 1965; Hauser et al., 2002), the biology of language is treated as separable from semantics. Whereas understanding is traced to perception and categorization, core-grammar is posited to draw on human universals. This creates a curious problem. A separate system is needed to map linguistic symbols onto representations of the external world. In other terms, the brain's putative (syntactic) processing must be grounded. Indeed, any theory that posits a rule-based inner language (e.g. Pinker, 1994; Jackendoff, 2002) faces the challenge of explaining how neural representations become meaningful. By eschewing formalism, Tomasello seeks to save mentalism by appeal to leaky minds. Symbol use is said to depend on a device that allows infants to read other people's intentions. Infants thus learn how to leak intentions into the social world.(xiv)

Tomasello emphasises that small babies do not read intentions.(xv) Rather, he posits a (putative) developmental discontinuity. After nine months, he believes, brains start intention-processing. Babies begin to grasp why caregivers act as they do. While more credible than Chomsky's (1965) universal grammar, the logic is similar. A mechanism generates intentional output (not sentences) that fixes action (not speech). Having challenged the 'intention-box' (Cowley, 2004) elsewhere, I shall not dwell on it here. Rather, I note that a 'species specific adaptation' (Tomasello et al., 1993) became a harbinger of a '9 month revolution' (Tomasello, 1999) and now shapes a 'suite of skills.' Instead of grounding symbols, a brain links conventional forms with (putative) intention-processing. While based in co-regulation, language depends on a very specialized neural endowment. Babies need special systems that enable them to identify, generate and use intention-based conventions.

Melser's view is more parsimonious. Both language and thinking arise from a history of co-regulation. No linking problem arises because skills develop as co-action changes what bodies can do. Mind is a metaphor that picks out the subjective results of empathetic experience. Even a new-born infant gets caught up in concerting that drives (public) construal. As co-action develops, she gains control over movements (e.g. looking or using an index finger) that prompt action-based construal (including language and thinking). Rejecting in-built rationality, AoT traces cultural skills and perceptions to caregivers. Rather than invoke developmental discontinuity, Melser emphasises mutual relations between bodies, co-action and circumstances. On this actional view, language, consciousness and thinking all arise from joint sense-making. Eschewing Tomasello's appeal to brain schemas, Melser invokes shared understandings and joint action. In this way, he can map subject-level descriptions onto how we explain cultural ways of moving (and acting). These constitute actions which, in his model, cannot be separated from how we explain them. As such, they are irreducible to natural processes.

While echoing a tradition in grounding communication in co-action, Melser goes further. Concerting shapes joint action, language and, therefore, thinking. A baby learns a sub-class of movements that sustain social action. In keeping things theoretically bare, Melser distinguishes the natural from the cultural. AoT, therefore, leaves aside how a baby moves, develops attentional skills or, indeed, how motivations change. While sketchy, this throws light on how we become human. By rejecting intention-reading, mutual concerting becomes the ontogenetic basis of becoming actors. This links brains to the social world while replacing (vague) talk of neural schema with evidence of actual and rehearsed action. As in CaL, this fits the consensus on child development. Co-action does give infants a sense of persons (see Legerstee, 2005) as attachments use social routines. By twelve months, infants (sometimes) sense what people mean and, later, use social movements to get what they want. By the second year, they follow points, monitor attitudes and, in familiar settings, act as expected.(xvi) Against cognitivist tradition, caregivers prompt babies to engage with language. The issue, then, is whether this depends on brain-based intention-recognition and/or bodily co-regulation. Given the failings of other anti-formalist views, this raises another issue. How can such a capacity explain the rise of (grammatically constrained) thoughts?

Grammatical thresholds?
When we focus on forms in the mind, mastering grammar becomes a mystery. If reinforcement matters little, children need inner resources to cross grammatical thresholds. These serve both to get language started and to adjust neural settings to those of other brains. Despite the work of theorists like Sampson (2005), many think that this vindicates linguistic nativism. An alternative is to challenge formalism. Using this strategy, CaL and AoT trace language to co-regulation. While concurring that engaging with caregivers transforms the child, Tomasello's neural focus contrasts with Melser's actional view. While differing on means, they agree that verbal patterns stabilize in history. Consistent adult expression ensures that, by the second year, children hear 'words' and, at times, vocalize appropriately. Activity becomes analysis amenable (Spurrett & Cowley, 2004) or, in Dennett's (1978) terms, fits ascriptions from an intentional stance. While Melser appeals to overt tokening of action, Tomasello invokes (inner) constructions. For both, utterances are elicited by circumstances. Like thinking, talk is activity based in construing social expression. How, though, do children come to exploit the abstract resources of grammar?

Tomasello focuses on coming to use sound-patterns in utterances. Infants, he claims, use mind-leaks to identify intentions and sound-patterns or symbols. Learning to talk is thus skill-based. Salient intentional behaviour enables children to discover conventions. Without grammars, statistical learning gives one-year-olds symbols and gestures.(xvii) As they grow, knowledge of conventions enables them to speak 'words'. Without phonology or literal meaning, they make utterances that can be analysed into one or more verbal units. Neural 'schema' prompt them to vary concrete constructions between bath, church and supermarket.(xviii) What linguists call (public) 'structure' develops as brains allegedly construct schemas for acts of utterance. Given these, 'gazzer' may prompt a child to grasp its intended reference (e.g. a hammer). Slowly, schemas conform to conventional usage. Use of repeated patterns enables them to index local expectations. Children thus begin to gain control of what linguists call argument structure. Just as mind-leaks prompt speaking, they spur us to master grammar. Because function (allegedly) indexes intention, brains do not represent formal types. Although formalists use structure as evidence for representations, symbol use is based on neurally differentiated schema. Applied to early development, then, Tomasello's mentalism is shallow. The issue is whether or not statistical analysis of verbal patterns is the basis for learning to speak and think. Can a single process explain both early learning and the linguistic practices of children over two?

Having sketched how leaky minds give rise to concrete constructions, the model is re-applied. Mind-leaks enable 3-4 year olds to background and foreground by using “transitives, intransitives, ditransitives, attributives, passives, imperatives, reflexives, locatives, resultatives, causatives” (144) and so on. As children sensitise to “information flow” (244), they learn abstract constructions (i.e. complex structures) or come to master style. Like babies, older children rely on phonetic patterns, role reversal, and skills in joint attention. Only function changes as blind induction gives way to “categorization and analogy based on linguistic schemas” (300). The same leaky minds and statistical learning enable them to grasp stories, experience teasing and tell themselves about the past. Neural schema come to represent “generalizations across many dozen or more item based constructions” (2003: 144).

For Tomasello, brains generalize symbols in line with usage-grammar. Using statistical generalization, constructing a language proceeds by detecting what others 'intend to say'. Generally, we can discount the importance of individual evaluations and judgements. An automatic process of intention-recognition can unmask even the formats of active, passive and interpersonal. We do not construe what we hear: rather, we calculate the informational value of, say, a passive. This is a strange perspective. However, if usage-patterns are separate from action, statistical learning must be the key to language. For Tomasello, its products are used, equitably, in talking with friends, acquaintances and bullies. But can this be right? Surely children make less use of intentions and symbols than of experience, conflict, habits and what 'just happens'?(xix) Tomasello avoids such issues by separating action from (intentional) communication. Following cognitive grammarians, he defines language as conformist use of conventional symbols. For the child, language becomes a convention-based system. Thus, while rejecting formalism, Tomasello nonetheless accepts the code metaphor. Like the generativists, he traces language-use to 'recognition' of conventional patterns. Intention-reading bodies link brains with symbols. Like philosophers, babies categorize intentional contents. Given mental leaks, we escape from distractions like pauses, repetitions and even the play of emotion. Symbols are separated from bodies and (reinforcement) learning. In one sense, CaL is an old-fashioned competence theory.

For Melser the focus is on action. Rather than examine communication, he asks how verbal patterns and thinking enter a baby's repertoire. In early life, of course, language is secondary. Even ostention is based on what we do. A child learns not from leaky minds but from mutual rewards. As “successful concerted activity” (118) ostention depends on integrating hearing with looking. The child develops skills that link gestures with vocal patterns. By recognising likely events, the baby gains new powers. In Melser's terms, she comes to token actions. This can elicit both empathy and an action's circumstances-relevant value. Children learn by covertly tokening other people's actions while refraining from doing them. Skills develop as the child discovers the rewards of attending to, say, speech or the contents of an approaching spoon. Action and language transform how we react to both objects and other people. When role reversal occurs, agency accumulates new complexity. As the child becomes skilled in anticipating what actions are likely to achieve, she masters solo actions. Increasingly, these come to be treated as responsible and voluntary acts. Melser's view is that, as we gradually become conscious, we learn from intellectual and moral decisions.(xx) While novel (and abstract), his account allows vocal patterns to use circumstances (not just intentions). Even if frequency matters, the results correlate not just with neural schema but also with objects, settings and values. Just as talk is more than grammar, social activity is more than conformity. Usage-patterns enable children to exploit constraints on action.

While viewing language as integral to both concerted and solo action, Melser emphasises cases like using a spoon. The child begins with efforts at concerted performance where the caregiver serves as a model. As she reduces demonstrations and token demonstrations, the child takes on the caregiver's role. She learns what can be called solo eating. This is a paradigm for coming to act, speak and think. Like manual dexterity, grammar can simplify action. In using spoons and utterances alike, we begin with concerted action and, later, go solo. On this social constructionist view, we are instruments of culture. While it is a superstition to think that words have meanings (131), these give us the beliefs that shape what we think and do. Nor does Melser baulk at the radical implications of this view. As tokening becomes covert, perception binds reality to cultural history. Before learning silent rehearsal, a child absorbs thoughts, values and feelings. These are manifest in skilled (and abbreviated) action.

While showing how thinking could be grounded in skills, appeal to tokening offers less in explaining how children come to speak appropriately (and grammatically). Since the goal of AoT is to explain thinking, this might not matter. Given Melser's view of brains, the actional view leaves gaps. Little is said about attention and no questions are raised about how actions arise. Equally, no attention is given to real-time motives or the rise of specific abbreviations. On analogy with spoon use, language gives mental dexterity. Since motivation is ignored, nothing is said about why children learn to deny claims, insist that they are right, or ask questions. Given opposition between nature and culture, Melser ignores epigenesis and co-evolution. Further, by identifying language with how we explain events, talk remains separate from action. To my eyes, something is missing. Indeed, CaL has the theoretical edge in giving statistical learning a role in shaping the child's action repertoire. Even if we reject intention-reading (as I do), children under two learn to anticipate the upcoming results of speech and action. Tomasello's problems arise not from emphasising statistical learning but from making this central to how children gain control over language and thinking. As processes depend on living beings, they are irreducible to 'information flow'.

Solo thinking
In CaL, verbal patterns spill over from intentions. Indeed, given intention-based signaling, utterances are complex from the start. As a result, CaL reduces language-saturated action to using constructions. As a mentalist, Tomasello fails to distinguish mindless social speech from its thoughtful counterpart. Thus, a 10-month-old who utters miw in response to “Do you want your milk?” acts as intentionally as a visitor to Italy who rehearses mi darebbe un po' di latte? In an intention-reading model, the deliberate is like the spontaneous. All utterances use the same schema. Verbal patterns stand in for intentional states. To solve the linking problem, mental-processing is treated as defining (sic) subjects and circumstances. This is a form of solipsism that often surfaces in mentalist theory. Indeed, if a single process maps schema onto what a cognitive grammar describes, one language must map onto another (that of intentions). Language consists in forms (with complex functions). Instead of defending this (outdated) view, Tomasello turns to statistics. The origin of constructions is shown, he claims, by Diessel's (2001) correlations of usage, discourse function and lexical context. Of course, the objection is that behaviour gives no evidence of neural schema. It is (trivial) to suggest that we have skills in “managing the flow of information” (2003: 244). Thought is not in thrall to brains; language is not reducible to convention.

Melser seeks to explain thinking. Like using a spoon, he argues, thinking is a learned form of action. Even the Penseur is employing skills that make his brain an instrument of culture. His intellectual capacity derives from concerting, solo tokening, refraining and, eventually, skills in covert solo rehearsal. They serve, at root, in preparing for real (and imagined) performance. Thinkers reconceive scenarios by drawing on the past. Experience of action (and language) shapes these skills. For Melser, we need no “concept of mind at all” (181). Indeed, a history of interactions enables us to manage how we act and, by so doing, to “standardise and intensify the action-readying power of language” (143). Tomasello is wrong to treat a baby's 'repetition' of miw as thinking. In fact, it is a vocalization. By contrast, thinking produces “synthesis of two kinds of actions” (55). In saying miw, there is neither “covert token performance of speech” (147) nor “educative concerting of some [other] activity” (147). Unlike mi darebbe un po' di latte, the baby's utterance lacks any (self) educative function. In contrast to acts of thinking (imagining, remembering, considering etc.), miw is not self-motivated but, rather, echoes what has just been heard.

Melser does not trace intentions to neural representations. Rather, like Skinner and Dennett (1991a), he denies that reports of the 'inner' reveal anything about the brain. On the contrary, silent rehearsal shows how conscious action draws on verbal patterns. Thinking is action that becomes both voluntary and moral. In CaL, by contrast, as neural schema map intentions onto usage-patterns, language leaks. When I truthfully say that I want to go out (for example), I report a mental state.(xxi) Thinking is, for Tomasello (and Descartes) a window on the mind. By contrast, such reports can be seen as mere self-ascriptions. This is Melser's view. However, rather than appeal to 'habits', 'brain' or 'language', he invokes a history of actions. This is novel.(xxii) Boldly, he posits that actions that construe and drive construal give us skills in self-ascription. Without concerting, there could be no learned gestures and no reflection. Co-action grounds skills where action takes on a verbal overlay. For Melser, movements and construals promote thinking. Concerting is the basis of thought. While a baby uses empathy to develop skills, later performance is enriched by verbal patterns. By concerting and monitoring the results, the child learns to carry off social actions. Thinking matters. Indeed, even in day dreaming, we rehearse what we would say. Although stressing “public conventions and practices” (149), these are neither internalized nor the basis for explaining actions. Rather, patterns must be trustworthy: we rely on linguistic experience. While CaL traces thinking to inner mind, AoT traces it to social events that occur beyond the skin.(xxiii)

From a behavioural point of view
Wittgenstein called for an explanation of how not ratiocination but “something like an instinct” might give rise to language (1980, 471). For Melser, this outcome is a result of concerting. In my judgment, the dexterity of thinking is indeed likely to derive from a history of empathetic action. Melser, however, makes utterances by 1-2 year olds little more than overt expression. Here AoT seems to underplay language. To extend the actional view, we need a view on learning to vocalize. First, however, we must clarify how verbal aspects of language shape full-bodied expression. As argued by Harris (1998), Toolan (1996), Davis (2001), Love (2004; 2007) and others, words are not a priori signs. Language enacts particular events:

Linguistic facts are ultimately facts about individuals acting in certain ways on particular occasions (Harris, 1998: 125).

Developmentally, bodies (including brains) learn how to integrate sound and movement. The resulting whole-bodied activity is constrained by historically derived cultural constructs. In our literate tradition, of course, such activity can be transcribed and, indeed, analysed as if it exploited particular modalities. This, however, throws little light on particulars (e.g. Goodwin, 2000; Thibault, 2004). If minded behaviour draws on these, we depend on first-order language, “a temporally situated, ongoing process — the process of making and remaking signs in contextualised episodes of communicative behaviour” (Love, 2007: 705). While based in causal events, its dialogical nature prompts us to develop skills in meaning-making. Brains and bodies act under dual control as, in real-time, each party tries to make sense of the other's doings. Co-regulating acts of utterance use dynamics which are constrained by — but not reducible to — what we hear as word-forms. Given history, these patterns enable us to partition the world. In Love's terms (2007), they are second-order cultural constructs. While words constrain how adults behave, the baby has no need of these second order patterns. Rather, social events depend on complex co-regulation. While historically conditioned, learning to talk cannot be traced to use of words (let alone intentions!)

Recognising the causal basis of language allows us, first, to abolish (inner) intention-recognition. In engaging with first-order language babies use contingencies and affect to construct new powers. Real-time events shape the neural systems that provide intrinsic motivation (Trevarthen, 1979; 1998; Trevarthen and Aitken, 2001; Cowley, in press). By three months they have anticipatory powers based on signs of culture (Cowley et al., 2004). As Skinner suggests, behaviour, or what is reinforced (Ainslie, 1985), enables them to develop situation-relevant norms. Caregiver rewards set up contingencies that enable self-motivating infants to anticipate and (at times) act as others expect. Action-based neural systems (schema) orient them to cultural functions (e.g. having fun, being good, showing respect) that allow value-realizing actions. Expression-guiding schema allow some control over real-time concerting.

Infant motivation systems gear to maternal cues that signal upcoming events. Far from having to see their own intentions (if these exist), babies learn from intention-attributions. Language-saturated caregivers reward movements that are apparently intentional.(xxiv) This sets off self-fulfilling prophecies as children learn to regulate actions around norms and expectations. They benefit from fluent co-action and, at the end of the first year, start to go solo (using operant conditioning). Co-action gives rise to increasingly adaptive vocal powers. In turn, caregiver attribution enhances sense-making as concerting is increasingly geared to social effects. Neural self-organization picks out cues for when actions are to be initiated, abbreviated or inhibited. With experience babies develop both attentional skills and vocal control (giving one and two-word utterances). Skills in getting what they want increasingly supplement imitation, role reversal and statistically-driven learning. By age two, talk is increasingly motivated by what we call the child's perspective. Gradually, speech takes on ideational functions that, for example, permit talk about the past.(xxv) Babies begin to integrate second-order cultural constructs with first-order (biomechanical) events.

When milk is visible (and salient), a one-year-old may utter a miw. Rewarded and repeated, this may become part of feeding. One day, in another setting, a child may say (or a caregiver hear) miw. Because no milk is present, this may set off rewarding reactions (it may strike a caregiver as intentional). To get milk, therefore, the child uses adult hearings and construals. This milestone depends on how others connect an utterance-type to a possible 'want'. Reward depends on recognizable utterance-acts that fit adult criteria of context, accuracy and timing. This challenges the child. Occasional rewards motivate her to author speech, develop vocal skills, and, indeed, learn when to inhibit (e.g. at church). By the age of three (or so), she may be able to say miwk (sic) 'when she wants'. To do so, however, she needs a brain that induces the (sound) image to 'leak' into action. It is only if a child can induce this self-reminding that she can choose whether (and how) to speak. However this is done, the trick has surprising outcomes. What began as part of a feeding ritual becomes a vocal resource over which a subject exerts (some) control. In short, the actional view can gain from tracing skills to intention attribution, reward, and statistical learning. Nor is that all: AoT also overlooks how hearing changes the child. Given experience of (what we call) words, she alters how she acts and, by extension, reorganizes neural function.

Both Melser and Tomasello focus on action. At its broadest the AoT view of concerting and abbreviated action fits CaL's brain-centred approach. The neural schema used in social circumstances can be attributed to — not intention-reading — but a history of co-regulated activity. Among other things, children learn to anticipate attribution-based rewards. Agency develops in routines that use real-time expression. Since the models overlook first-order language, they ignore how caregivers encourage attention to sound-patterns and objects by using, say, exaggerated prosody and talk about utterances (as words with meanings). Concerted action gives babies skills in actively perceiving what caregivers want and feel. This is enhanced through cultural activities like reading nursery rhymes. While Melser recognizes that a sensitive teacher may help a child play an instrument with flair, he sees no way of applying the analogy to language. Having simplified by distinguishing action from the world of nature, he cannot address the power of how we act (and speak). In fact, like a musical instrument, the voice can be used to realize values. Anticipatory and habit-forming brains integrate reinforcement learning with attention-guided action. As with a musical instrument, a baby comes to hear utterances as utterances of something (e.g. miw). Perhaps, rather as babies-under-one use intention-attributions to construct coherent actions (see, Cowley et al., 2004; Cowley, 2007), similar attributions help older children to identify verbal patterns. Indeed, once adults (think that they) hear words, they will seek to encourage, inhibit and modify infant vocalization. Concerting and rewards thus challenge the baby's hearing. Given statistical learning, brains need represent neither language form nor function. Rather, their role is to motivate attempts at vocalization.(xxvi) Babies hear the resulting sounds and, at times, these are rewarded. 
By anticipating effects they move towards saying things. In going solo, much depends on hearing. Not only can adult response be used to construe talk about words but this reveals (metaphorical) mind-leaks. Rather as we picture objects (in the 'mind'), we imagine words. Habits prompt expectations of vocal events as we begin, metaphorically, to entertain thoughts. Hearing verbal patterns can restructure first-person experience. By concerting while attending to verbal patterns, children discover more deliberate action. Indeed, once children experience mind-leaks, they can find new ways of packing verbal images. Gradually, their utterances will contribute to formulations that shape concerted action and, in later years, activities based on solo thinking.

From birth, babies use (analogue) expression that is irreducible to movements. They act in ways that are perceived as meaningful and, almost certainly, pick up on a band of affective responses. Concerting ensures that bodies learn to respond against a well-defined cultural backdrop. As there is no simple input/output process, there is no impoverished stimulus. By six months, cultural practices (e.g. giving games, feeding routines, bathing) enable children to use word-marked attitudes. They follow points and vocalizations and, later, act to manipulate adults. Expression uses what can be reinforced as children come to realize values. By twelve months, they orient to second-order constructs such as 'milk'. While using adult construals, they are thought (prophetically) to 'know' the word. Later, activity like pretend play will nurture sound-patterns that, metaphorically, leak into mind. This is elementary thinking. Further, once hearing is imbued with a verbal aspect, vocalizations become audible signs (Kravchenko, 2007). As children begin to grasp talk about talk, functional transformations will follow. Hearing a verbal aspect can open up questions about wordings and trust in meanings. Given metaphorically leaky minds, children find new ways of acting. Indeed, the capacity to hear words is surely central in planning and formatting thought. Gradually, subjects become skilled in managing what Menary (2007) calls cognitive integration. Language and thinking co-develop as we transform our agency by linking what happens with utterances, gestures, texts etc.

Learning from leaky bodies
Formalist linguistics was yesterday's news. Brains do not represent what grammarians describe. Representations arise in the service of action (Anderson, 2003). Far from reflecting 'language' they emerge from how actions contribute to co-regulated activity. Skills arise in co-ordinating expression, first, with a caregiver's intentions and, later, by integrating deliberate speech with other modes of expression. Leaky bodies do not learn (or acquire) grammars. Rather, using other people, they co-ordinate in ways that are constrained by grammatical (and other social) expectations. Verbal patterns are intrinsic to our modes of action. Thus far, Melser and Tomasello think roughly in parallel. So what of their differences? Is mind a metaphor? The weakness of CaL, I have argued, is that Tomasello is bound to emphasise conventional linguistic strings. Far from reflecting on developmental facts, this depends on the rhetoric of intention-recognising brains. By contrast, the actional view breaks with the code metaphor. Melser traces thinking to a history of co-action. Given a concept of first-order language, the model can be readily extended. It can incorporate both real-time intention attribution and how statistical learning contributes to the rise of vocal skills.

As regards the start of language, not only is intention-processing implausible but the actional view is confirmed by investigation of both interactional co-ordination (Trevarthen, 1998) and how expression uses the self-organizing brain (Trevarthen & Aitken, 2001; Cowley, in press). Philosophical logic also rejects inherent rationality.(xxvii) Melser's view is parsimonious: thinking, consciousness and language derive from a single source. We do things together, abbreviate actions and, later, go solo. Since thinking is grounded in co-regulation, we see evidence in Le Penseur's face and, of course, experience our own thinking. Language is irreducible to formal input/output because, above all, acting and thinking are spurred by (metaphorically) leaky minds. An instinct for concerted activity prompts the self-reminding that allows actions to be integrated with thoughts (that use second-order constructs). The importance of AoT lies in tracing thinking to bodily co-regulation (constrained by brains and language). Given that concerting drives neural change, response to other people and the world can shape both first and second-order language. The dexterity needed for acts of thinking arises from a history of engaging in social practices.

Although Chapter 6 of AoT describes perception, Melser overlooks real-time. By separating action from activity, he blinds himself to the importance of how we act. No space is left for, say, attentional monitoring or coming to hear words. Yet, if language alters subjective experience, brains are more than instruments of culture. Given that we hear, see and imagine words, we can connect other-oriented perception (and action) with the events that drive reinforcement learning, motivation, attention, and affect. On such a view, Tomasello's intrinsic rationality can be rethought as the product of a history of interpersonal events. Learning to talk arises as changes in concerting shape intention-attribution. Action-based schema allow infant actions (including vocalizations) to realise values held by adults (not a child's brain). Expressive functions are thus insinuated into neural control systems (see Cowley, 2002). Babies use intention-attribution to develop ways of acting to become living subjects. One functional transition occurs, I suggest, when adults come to hear their utterances as words. Another occurs when the child begins to hear speech around recurrent sames.(xxviii)

Spontaneous concerting becomes solo thinking as language gives infant motives that co-develop with expressive powers. Without concerting, there is no engagement with first-order language; without dialogue — and hearing 'words' — there can be no deliberate action. To become a person, therefore, we need both intrinsic motivations and metaphorical mind-leaks. In Melser's terms, thinking depends on (self) educative planning. Perhaps, this will help us understand how children benefit from using abstract constructions. This stage of learning to talk may link reinforcement learning with hearing second-order constructs (especially, words) that prompt consideration of what is (and is not) possible. While consistent with Melser's bold claim, there is a need to study the child's changing motivations. We need to ask how collective events enable babies (and children) to invent strategies for social and solo action. If connotations matter to second-order language, actions and reactions must be reinforced by both caregiver construal and self-evaluation. Given Clark's (1997) use of leaky mind, culture, language and neural powers meet at a plastic frontier.

While minds do not literally leak, subjects integrate what philosophers call 'content' with their own speech (and action). Thinking, I suggest, depends on hearing utterances in a verbal aspect. Since first-order language is bodily co-regulation, children live a history of concerting. Once they hear verbal patterns, second-order constructs can be integrated with expression to prompt new kinds of construal. Given anticipatory learning, the results will reveal new ways of realising values. Later, given skills and attentional abilities (based in hearing wordings and doing things with thoughts) they will become living subjects who exploit both social and solo action.

We do not need linguistic representations. While seeing this, Tomasello nonetheless relegates thinking to an 'inner' domain. Echoing Descartes, his single process view identifies the subject with output from intention-processing. As a result, he fails to escape from code models. By contrast, Melser abandons mentalism without quite returning to behaviourism. If concerting underpins conversation, the capacity for thought arises between people. Eventually, social events lead us to experience thinking during speech, in covert rehearsal, and as we create written signs. While persuasive, the view needs to be developed. Thus I have argued that it overlooks intrinsic motive formation and leaves obscure why we (sometimes) trust thoughts. It does not show how, given reinforcement, history and culturally regulated concerting, we become creative users of vague patterns. It overlooks how engaging with first-order language opens up a world of audible (and, perhaps, visible) signs. This transforms the infant's world. Above all, talk about talk comes to make sense. As Vygotsky saw, action can be guided by imagining signs. Eventually, given words, minds and meanings, we begin to think. Co-action and conversations depend on fictions. Beliefs motivate (self) educative actions. What is reinforced is insufficient to overthrow formalism because language is also co-regulated and culturally constrained. Verbal habits shape full-bodied expression as we repackage feelings, thoughts and values. Thinking results from (neurally-controlled) engagement with first-order language. As co-expression develops, babies gain skills in acting, hearing and reflecting. Real-time concerting enables them to become fully-fledged persons. With skills in first-order language, we begin to take responsibility for what we do and say. We integrate social events with second-order constructs by imagining (and seeking) future opportunities.


I thank Phil Carr, Peter Jones, Derek Melser and my anonymous reviewers for their thoughtful comments. Together with questions from my Hertfordshire students, these crystallized my thoughts. Where I ignore good sense, my scepticism reaches its limits. Even if language is grounded in neither neural schema nor individuals, I am certain that I am getting old. Action and language happen in time.
i If the process is causal, internal symbols must be grounded in the world. For the mentalist who aims to explain “cognitive capacities in terms of cognitive structures in the mind/brain” (Gross, 2005), this is a problem. As Dennett (1978) complains, mentalists view putative structures as intrinsically rational.
ii Clark (1997) uses the phrase to challenge the view that there are clear-cut distinctions between 'levels' of cognition. Body depends on mind rather as mind depends on world.
iii In Halliday and Matthiessen's terms (1999), Melser and Tomasello focus on how language changes experience as an 'instantial product' as agents exploit 'particulars of the world'. In the systemic-functional tradition, by contrast, the focus falls on learning a 'multiple coding system consisting of content, form and expression: a system of meaning relations, together with their realization as configurations of words and structures and realization of these in turn as phonological forms' (Halliday, 1983: 3). Neither Melser nor Tomasello thinks that people or brains learn meaning relations or realize forms.
iv As Tomasello fails to distinguish kinds of conventions, it seems that intention-reading applies to events as different as making conversation and playing football.
v “The notion of a communicative intention and function are correlative. Someone uses a piece of language with a certain communicative intention, and we may say that the piece of language has a certain function” (Tomasello, 2003: 3). Although they may be 'correlative', this fails to explain why functional analysis is to be identified with 'intentions'. Philosophical analyses of non-natural meaning (Grice, 1957) did not set out to identify the input (or output) of neural processes.
vi CaL also acknowledges regularities found in conversation analysis (2003: pp. 266-270).
vii For Tomasello, both uses of 'intention' identify brain-states. A more subtle view (see Hacking, 1999) would distinguish the intentions that are intrinsic to (joint) behaviour from those which we report. The former would be intentions as objects and the latter intentions as ideas (in our matrix).
viii Behaviour applies to 'actions and reactions by whole organisms' (Martin and Bateson, 1986) which are modified (on many dimensions) as circumstances prompt kinds of action/reaction (at varying frequencies).
ix For Davis, words are “at once an illusion, an invention of grammarians, an artifact of traditional wisdom, of literacy and pedagogy” (2001: 191). This heterogeneity is lost when they are identified with verbal patterns. These may be abstracta or, perhaps, real patterns (see Dennett, 1991b).
x In the 1950s, this ran generative grammar. Although meaning is usually ignored, Fodor (1975) posits an interface where a language of thought maps semantic representations onto formal counterparts.
xi It reappears in what Dennett (1988) calls viewing 'intentionality' as original. This view is prominent in the work of Searle and his followers.
xii Tomasello treats verbal communication as 'intentional' from the start. In contrast, Thibault (2000) treats interpersonal acts as indexicals and Hodges (2007) appeals to realizing (conventional) values.
xiii Melser traces tokening to imitation. Elsewhere, using Trevarthen, Cowley (2003) traces co-action to how rewarding events interact with intrinsic motive formation (Trevarthen & Aitken, 2001).
xiv Tomasello fails to explain how putative inner processes are grounded in physical events.
xv Cowley et al. (2004) argue that caregivers shape infant behaviour by ascribing intentions to infants.
xvi Landmarks include pointing (Vygotsky, 1986), secondary intersubjectivity (Trevarthen & Hubley, 1978) and social referencing (Campos and Sternberg, 1981). For a review, see Legerstee (2005).
xvii In contrast to most theories of categorization, these are neither action-guided nor action-oriented. While perception-based, he appeals not to physical events (see Barsalou, 1999) but to intention detection.
xviii Tomasello does not explain. While Brooks (1999) and others deny that intelligence needs representations or schema, most think these track aspects of an environment (Sterelny, 2003). Anderson (2003) argues that animals depend on schema for 'action-guidance'.
xix Tomasello forgets that brains regulate what an organism does. Intention-recognition cannot explain how (putative inner) constructions control social action.
xx For MacDorman (2007), sensorimotor representations are transformed by the interplay of the conscious and the unconscious or what he calls contextual broadening.
xxi One can emphasise first-person phenomenology while denying that it illuminates neural events (e.g. Thompson, 2007). Accounts of wordings ("I want to go out") seem post hoc. Often, I think by creating verbal or imagistic fragments. 'Intuitions' may give the experiential value of phrasal units and mental images without this reflecting on the brain. Even if socially based, they may draw on what Selfridge (1959; cited in Dennett, 1991a) termed pandemonium architecture.
xxii Environmentalism makes Skinner invoke habits; if not eliminativists, cognitivists posit that language is internalized (e.g. Vygotsky, 1986) or installed (Dennett, 1991a).
xxiii While treating some action as voluntary, Dennett (2003) stresses that neural processes can arise from culturally-located interactions. Instead of treating the world-perceived as a social product, he would restrict this to aspects of experience. While wetness would be largely non-social (more directly embodied), the taste of wine can be influenced by conversational experience.
xxiii Martinelli (2007) argues that, to understand non-human intelligence, we should consider how animals draw on our tendency to ascribe intelligence to them. While not using the term, he stresses that Clever Hans had the operational equivalent of a 'theory of mind'.
xxiv In Hallidayan terms, utterances have an ideational metafunction when a child construes a previously un-noticed action potential. For Melser, this arises from mastering verbal patterns and would permit, for example, the rise of autobiographical memory (see Nelson, 1996).
xxiv Children learn to hear in locally appropriate ways. While the bio-mechanisms are unclear, they manifest the perceptual magnet effect (see Kuhl, 1998).
xxiv The social constructionist perspective leads to the same conclusion as Dennett's (1988) attempt to naturalize language and cognition.
xxiv These are not yet 'forms' because the child lacks a linguist's perspective. On the view given here, language-relevant neural structures do not correspond to formal descriptions. This is because while brains control action and perception, word-forms are (and remain) second-order cultural constructs.

Ainslie, G. (1985) Behavior is what can be reinforced. Behavioral and Brain Sciences, 8 (1): 53-54.
Anderson, M. L. (2003) Embodied cognition. Artificial Intelligence, 149(1): 91-130.
Barbieri, M. (2007). Is the cell a semiotic system? In M. Barbieri (Ed.) Introduction to Biosemiotics, 179--208. Springer, Dordrecht.
Barsalou, L. (1999) Perceptual symbol systems. Behavioral and Brain Sciences, 22: 577-609.
Belpaeme, T. and Cowley, S.J. (2007) Extending symbol grounding. Interaction Studies, 8/1: 2-6.
Brooks, R. (1999) Cambrian Intelligence: The Early History of the New AI. Cambridge MA: MIT Press.
Campos, J. J. and Sternberg, C. (1981) Perception, appraisal, and emotion: The onset of social referencing. In M. E. Lamb and L. R. Sherrod (eds.), Infant Social Cognition: Empirical and Theoretical Considerations 273--314. Hillsdale NJ: Lawrence Erlbaum.
Cangelosi, A., Greco, A. and Harnad, S. (2002) Symbol grounding and the symbolic theft hypothesis. In A. Cangelosi and D. Parisi (eds.), Simulating the Evolution of Language 191--210. London: Springer.
Clark, A. (1997) Being There: Putting Brain, Body and World Together Again. Cambridge MA: MIT Press.
Chomsky, N. (1965) Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Cowley, S.J. (2002) Why brains matter: an integrational perspective on "The Symbolic Species". Language Sciences, 24: 73-95.
Cowley, S.J. (2003) Distributed cognition at three months: mother-infant dyads in kwaZulu Natal. Alternation, 10/2: 229-257.
Cowley, S.J. (2004) Simulating others: the basis of human cognition? Language Sciences, 26/3: 273-299.
Cowley, S.J., Moodley, S. & Fiori-Cowley, A. (2004) Grounding signs of culture: primary intersubjectivity in social semiosis. Mind, Culture and Activity, 11/2: 109-132.
Cowley, S.J. (2007) The cradle of language: making sense of bodily connexions. In D. Moyal-Sharrock (ed.) Perspicuous Presentations: Essays on Wittgenstein's Philosophy of Psychology 278--298. London: Palgrave MacMillan.
Cowley, S.J. (2007). The codes of language: Turtles all the way up? In M. Barbieri (Ed.) The Codes of Life, 319--345. Springer, Dordrecht.
Davis, H. (2001) Words: An Integrational Approach. Richmond: Curzon.
Deacon, T. (1997). The Symbolic Species: Co-evolution of Language and the Brain. London: Norton.
Dennett, D. (1978) Brainstorms. Montgomery, VT: Bradford Books.
Dennett, D. (1988) Evolution, error and intentionality. In Y. Wilks and D. Partridge (eds.), Sourcebook on the Foundations of Artificial Intelligence 190--212. Cambridge: Cambridge University Press.
Dennett, D. (1991a) Consciousness Explained. Boston: Little Brown.
Dennett, D. (1991b) Real patterns. Journal of Philosophy, 88/1: 27-51.
Dennett, D. (2003) Freedom Evolves. Harmondsworth: Penguin.
Diessel, H. (2001) The ordering distribution of main and adverbial clauses: a typological study. Language 77: 345-365.
Fodor, J. (1975) The Language of Thought. New York: Crowell.
Fogel, A. (1993) Developing through Relationships: Origins of Communication, Self and Culture. Chicago: University of Chicago Press.
Gallagher, S. (2005) How the Body Shapes the Mind. Oxford: Oxford University Press.
Goodwin, C. (2000) Action and embodiment within situated human interaction. Journal of Pragmatics 32: 1489-522.
Grice, P. (1957) Meaning. The Philosophical Review 66: 377-88.
Gross, S. (2005) The nature of semantics: on Jackendoff's arguments. The Linguistic Review, 22: 249-270.
Halliday, M.A.K. (1983). Learning How to Mean. Edward Arnold: London.
Halliday, M.A.K. & Matthiessen, C. (1999). Construing Experience Through Meaning: A Language Based Approach to Cognition. Cassell: London.
Hacking, I. (1999). The Social Construction of What? Cambridge MA: Harvard University Press.
Harnad, S. (1990) The symbol grounding problem. Physica D, 42: 335-346.
Harnad, S. (2005). Distributed processes, distributed cognizers and collaborative cognition. Pragmatics and Cognition, 13(3): 501-514.
Harris, R. (1998) Introduction to Integrational Linguistics. Oxford: Pergamon.
Hauser, M., Chomsky, N. and Fitch, T. (2002) The faculty of language: What is it, who has it, and how did it evolve? Science, 298: 1569-1579.
Hodges, B. (2007) Good prospects: Ecological and social perspectives on conforming, creating, and caring in conversation. Language Sciences, 29/5: 584-604.
Hopper, P.J., and Traugott, E.C. (1993) Grammaticalization. Cambridge: Cambridge University Press.
Hurley, S. (1998) Consciousness in Action. Cambridge MA: Harvard University Press.
Jackendoff, R. (2002) Foundations of Language: How Language Connects to the Brain, the World, Evolution, and Thinking. Oxford: Oxford University Press.
Kravchenko, A. (2007) Essential properties of language: why language is not a digital code. Language Sciences, 29/5: 650-671.
Kuhl, P. (1998) Language, culture and intersubjectivity: the creation of shared perception. In S. Braten (ed.), Intersubjective Communication in Early Ontogeny 297--315. Cambridge: Cambridge University Press.
Legerstee, M. (2005) Infants' Sense of People: Precursors to a Theory of Mind. Cambridge: Cambridge University Press.
Love, N. (2004) Cognition and the language myth. Language Sciences, 26: 525-544.
Love, N. (2007) Language and the digital code. Language Sciences, 29/5: 690-709.
Lyon, C., Nehaniv, C.L. and Cangelosi, A. (eds.) (2007) Emergence of Communication and Language. London: Springer.
Martin, P. and Bateson, P. (1986) Measuring Behaviour. An Introductory Guide. Cambridge: Cambridge University Press.
Martinelli, D. (2007) Language and interspecific communication experiments. In M. Barbieri (ed.) Introduction to Biosemiotics: The New Biological Synthesis, 473--518. Dordrecht: Springer.
Matthews, P. (1993) Grammatical Theory in the United States from Bloomfield to Chomsky. Cambridge: Cambridge University Press.
MacDorman, K. (2007) Life after the symbol system metaphor. Interaction Studies, 8/1: 143-158.
Melser, D. (2004) The Act of Thinking. Cambridge MA: MIT Press.
Menary, R. (2007) Writing as thinking. Language Sciences, 29/5: 621-632.
Nelson, K. (1996) Language in Cognitive Development. Cambridge: Cambridge University Press.
Pinker, S. (1994) The Language Instinct: the New Science of Language and Mind. London: Penguin.
Sampson, G. (2005) The 'Language Instinct' Debate. London: Continuum.
Selfridge, O. (1959) Pandemonium: A paradigm for learning. In Symposium on the Mechanization of Thought Processes. London: HM Stationery Office.
Spurrett, D. and Cowley, S.J. (2004) How to do things without words. Language Sciences, 26/5: 443-466.
Sterelny, K. (2003) Thought in a Hostile World: The Evolution of Human Cognition. Oxford: Blackwell.
Taddeo, M. and Floridi, L. (2005) Solving the symbol grounding problem: A critical review of fifteen years of research. Journal of Experimental and Theoretical Artificial Intelligence, 17(4): 419-445.
Thibault, P. (2000) The dialogical integration of the brain in social semiosis: Edelman and the case for downward causation. Mind, Culture and Activity, 7 (4): 291-311.
Thibault, P.J. (2004) Brain, Mind and the Signifying Body: An Ecosocial and Semiotic Theory. London: Continuum.
Thompson, E. (2007) Look again: phenomenology and mental imagery. Phenomenology and Cognitive Science, 6: 137-170.
Tomasello, M. (1999) The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press.
Tomasello, M. (2003) Constructing a Language: A Usage-based Theory of Language Acquisition. Cambridge MA: Harvard University Press.
Tomasello, M. (2005) Beyond formalities: the case of language acquisition. The Linguistic Review, 22: 193-197.
Tomasello, M., Kruger, A.C. and Ratner, H.H. (1993) Cultural learning. Behavioral and Brain Sciences, 16: 495-552.
Toolan, M. (1996) Total Speech: An Integrational Approach to Language. Durham, NC: Duke University Press.
Trevarthen, C. (1979) Communication and co-operation in early infancy: A description of primary intersubjectivity. In M. Bullowa (ed.), Before Speech 321--347. Cambridge: Cambridge University Press.
Trevarthen, C. (1998) The concept and foundations of infant intersubjectivity. In S. Braten (ed.), Intersubjective Communication in Early Ontogeny 15--46. Cambridge: Cambridge University Press.
Trevarthen, C. and Hubley, P. (1978) Secondary intersubjectivity: Confidence, confiding and acts of meaning in the first year. In A. Lock (ed.), Action, Gesture, and Symbol 183--229. Academic Press, New York.
Trevarthen, C. and Aitken, K. J. (2001) Infant intersubjectivity: Research, theory and clinical applications. Journal of Child Psychology and Psychiatry, 42(1): 3-48.
Wheeler, M. (2005) Reconstructing the Cognitive World. Cambridge MA: MIT Press.
Vogt, P. (2002) The physical symbol grounding problem. Cognitive Systems Research, 3(3): 429-457.
Vygotsky, L. (1986) Thought and Language. Cambridge, MA: MIT Press.
Wittgenstein, L.W. (1980) On Certainty. Oxford: Blackwell.