Verbal Communication: from pedagogy to make-believe
Pre-publication draft of paper in Language Sciences 31, 5, Sept 2009.


Abstract: This paper brings the concept of ‘acting in concert’ to the aid of those wanting to understand the nature of verbal communication. Verbal communication is introduced as a form of concerted activity which has a management function vis-à-vis other concerted (and cooperative) activity. In the body of the paper, verbal communication is likened to other basic management practices: the simplest pedagogic techniques, the soliciting of concerted action by means of mime, collective and solo rehearsing of activity, shared make-believe, the teaching and subsequent exploiting of perceptual abilities, empathy, and the use of objects and graphics for communicative purposes. Various concluding observations are offered, concerning: the great variety of speech’s managerial roles, the danger of relying on colloquial figurative ways of characterising verbal communication, the advantages of the acting-in-concert analysis, and the possibility of a future truly scientific account of verbal communication.


1. What sort of phenomenon is verbal communication?
1.1 It is activity
1.2 It is activity we are aware of engaging in
1.3 It is social activity
1.4 It has a managerial function
2. Prototype acting in concert
3. Teaching by demonstrating
4. Miming to solicit participation
5. Rehearsing an activity
6. Shared make-believe
7. Showing people things
8. Empathy
9. Object-displaying as a means of communicating
10. The many uses of verbal communication
11. Naive figurative conceptions of verbal communication
12. Could there be a science of language?


1. What sort of phenomenon is verbal communication?

By verbal communication (and sometimes speech) I mean speech used to communicate, speech understood by someone. I don’t mean to exclude writing – I bring it in later on – but I am concentrating on speech.

1.1 It is activity

I start with the assumption that verbal communication is a kind of action or activity, something people do. This may seem too obvious to mention, but some theories of verbal communication manage to convey the impression that verbal communication is actually an automatic and impersonal process – a biological or informational transaction of some kind, between people’s brains, perhaps – in which the role of the people involved is hard to pin down. In my view, the people’s role is clear. They do the communicating. Communicating is not an impersonal process but something we ourselves do. And it is an action or activity in the full sense: conscious, voluntary and purposeful. We do it, and are aware of doing it, deliberately and for a purpose, or various purposes (albeit it is sometimes difficult to specify what the purpose is).

1.2 It is activity we are aware of engaging in

What of the ‘conscious’? Surely, if verbal communication is something we voluntarily and consciously do – and we do it (and have since infancy done it) all the time – then giving a comprehensive and accurate description of it should be easy as pie. However, although we do do it very often, and are well aware when we are doing it, and often what for, we are otherwise remarkably ignorant about what it is we are doing, what is going on, when we are communicating verbally. There are several reasons for this. The familiarity itself is one of the difficulties. We have been doing it so long and so unthinkingly that it is difficult to hold our speaking and listening at arm’s length, to get a good view of it. It is difficult to scrutinise just because it is second nature. Our ignorance is also partly due to logical difficulties associated with the reflexivity involved in speaking about speaking, describing describing, understanding understanding, and so on.

But the main barrier to our getting a clear view of what verbal communication consists of is the numerous figures of speech we employ for talking about it at the everyday level. In everyday situations communication difficulties or breakdowns sometimes arise and, when they do, it is necessary for us to speedily highlight the salient feature of the communication scenario in order to locate and clear up the difficulty. We have accumulated a stock of verb nominalisations, metaphors and synecdoches for this kind of purpose. Because these fanciful (but efficient and memorable) ways of bringing certain aspects of communication to our attention are so habitual to us, and because there is such a variety of them, they acquire something of the effect of subliminal propaganda. This propaganda-like effect should not be underestimated. The metaphors, etc., are designed for efficient singlings-out of particular aspects and they don’t care, and it doesn’t matter for everyday purposes, how, by what caricatures or logical mischief, they achieve this, as long as they do. The metaphors were certainly not devised for facilitating the kind of coherent and synoptic purview of verbal communication that philosophers and linguists are interested in. Our addiction to the colloquial metaphors is such that, when we attempt a coherent purview, the metaphors automatically intrude and impose their fanciful construals. We end up peering at shadows on the cave wall. I return to this problem with our awareness of verbal communication later, and I cite some of the offending metaphors then.

1.3 It is social activity

Verbal communication is activity that two or more people engage in together. It is social activity. Most of our social (some prefer cultural) activity falls into one or other of three categories. There is ‘concerted’ activity, in which the participants deliberately conform their actions and act in unison, in concert, and are as it were side-by-side in the activity. Then there is ‘cooperative’ activity, in which people do different things to the same end. This is acting in concert but with division of labour, doing the same thing by doing different things. Thirdly, there is what might be called ‘objective’ activity. This covers calculating and manipulative behaviour vis-à-vis others – including exploitative, coercive and defensive/aggressive interactions. Objective behaviour is characterised by one or more of the participants abandoning concert and cooperation and treating the other(s) in an ‘objective’ and impersonal, perhaps sub-personal, way. Where social activity is not clearly one or other of these three types it is usually an amalgam of two or all three of them. Solo actions (performed by an individual alone) are still social if, and in so far as, they contribute to cooperative or objective activity involving others – as most in fact do.

This paper is not concerned with objective social activity, except for brief mention of it just below. My main concern is to float the idea that verbal communicating is social activity of the acting-in-concert type, albeit with a certain amount of cooperative role-differentiation mixed in. This cooperative aspect of verbal communication has been described by others – in, for example, Grice 1989, Clark 1996 and Levinson 2000.

There is an interesting and important background issue concerning the logical and developmental priorities pertaining between concerted, cooperative and objective activity. It is standardly assumed in the philosophical literature that concerted and cooperative activity (usually rubricked together as ‘joint’ activity) is a product of rational (my type three, objective) calculations by individuals. See, for example, Gilbert 1989, Searle 1995, Tuomela 1995, 2000, Meijers 2003. A related evolutionary psychology literature on reciprocal altruism contributes the supplementary assumption that these calculations are aimed at maximising benefits to the individuals themselves. It is assumed that the rational and self-interested individual is the basic social agent, and that concerted and cooperative activity is primarily activity of the objective kind.

My own view is that concerted activity, not individual agency, is developmentally primary. I argue elsewhere that imitation, a biologically innate urge to join in or ‘entrain’ with others’ activity, represents the beginning of both the social (Dijksterhuis 2005, Kinsbourne 2005) and the rational (Melser 2004). In this perspective, cooperation, solo action and objective activity are all developmentally derivative of acting in concert. Cooperation arises out of concerted action by a relatively simple process of role differentiation. And solo agency, the ability to plan and act by oneself, depends on prior experience of cooperating – that is, acting on shared understandings and on others’ instructions. Thus, solo agency is also developmentally derivative of acting in concert. By arguing in this paper that verbal communication is a form of concerted activity – albeit with cooperative, individually performed, aspects – I am arguing that verbal communication is in essence, like any concerted activity, not so much an interaction of individuals as a reversion to a more primitive side-by-side mode of activity, in which individuals are not yet differentiated. An approach to verbal communication in line with the philosophical orthodoxy would start with individuals and look around for some suitable connection or exchange between them. My strategy is to start with a paradigm of togetherness, a oneness of action and accord – namely, acting in concert – and go on to explain communication as a derivative of that.

1.4 It has a managerial function

The various concerted, cooperative and objective practices that make up a culture include not only innumerable practical activities but also many forms of ancillary, meta-level, activity – via which the practical activities are managed. The management tasks I have in mind are such as: the teaching and training of beginners; acquisition and dissemination of empirical knowledge and/or perceptual skills; instigating and orchestrating of concerted and cooperative activity; ritual and recreational activity to sustain group morale; planning and other practical thinking, and so on. Like the ground-level practical activities they minister to, these meta-level management practices are themselves joint undertakings – they also involve people acting in concert and/or cooperatively. They too are learned adaptations of prototype ‘doing in concert’.

Verbal communication is a particularly versatile and efficient way of expediting the usual management tasks – teaching practical skills and factual knowledge, implementing and orchestrating activities, raising morale, planning, etc. It is the management tool par excellence. Although the epithet ‘managerial’ is not used elsewhere, this idea of speech having an activity-management function is supported by a long and substantial tradition in linguistic theory. This tradition includes the work of Malinowski (see Malinowski 1923) and later ethnomethodologists, de Laguna’s ‘coordination of group activities’ theory (de Laguna 1927/63), the Vygotskyan concept of speech as a tool for regulating social activity (Vygotsky 1978, 1986), Skinner’s theory of verbal behaviour, particularly his discussion of ‘mands’ (Skinner 1957), Wittgenstein’s language-game approach (Wittgenstein 1963), Bakhtin’s dialogic theory (Morson and Emerson 1990), Austin’s and Searle’s speech act theories (Austin 1961, 1962, Searle 1969), and integrationism (Wolf and Love 1997, Harris 1998). And perhaps one should include the abovementioned work of Grice, Clark and Levinson here too.

My main reservation about this tradition is that it does not go far enough. Generally speaking, the above theorists don’t offer explanations of what happens during verbal communication – how it is, for example, that, by speaking, one can get other people to do things. It is implied that, in order to understand verbal communication, it is enough just to appreciate speech’s multifarious and complex regulating or integrating roles in our social lives. At one or two points in Philosophical Investigations, for example, in the context of his elaboration of the language-game concept, Wittgenstein implies that, once one has sufficiently steeped oneself in the cultural and practical circumstances of verbal communication, its nature, which is ineffable, will simply appear to one. He speaks of ‘forms of life’ – and he means forms of concerted and cooperative activity expedited by speech – as “what has to be accepted, the given”. One grasps by a kind of empathy that ‘this game is played’ and that’s all there is to it. Speech works by custom. Well, I think one does come to a Wittgensteinian halt, but it is at a point further on. Before we get there, there is more to be said about the nature of verbal communication. We can attempt explanations of how the managerial tasks are accomplished.

In this paper I try to expose some of the ‘mechanics’ or ‘actional technology’ involved in verbal communication. My strategy is to liken verbal communication to several other familiar kinds of joint activity, with similar managerial roles, but whose practical efficacy is, compared with that of verbal communication, relatively transparent. You can see how they work. By highlighting affinities and continuities between verbal communication and these other practices, I aim to get some of the transparency to rub off on verbal communication. To show that X has feature F, it often helps to liken X to a Y in which feature F is more obvious. Comparisons can also help clarify what features X does not have. The practices I bring on for comparison with verbal communication are: teaching an action by demonstrating it, the use of mime to solicit another’s participation in an activity, joint and solo rehearsing of activity, showing people things, children’s make-believe, empathy and the use of signs and other representations.

2. Prototype acting in concert

Acting in concert is ‘two or more individuals doing the same thing, in synchrony, together, and being aware of their doing so’. The mutual awareness factor is manifested in the success display which results when parity and synchrony are achieved. The success display includes meeting of gaze (itself a basic form of acting in concert), manifestations of pleasure such as reciprocal smiles and vocals, and more enthusiastic performance of the primary activity. Four kinds, or functions, of concerted activity can be distinguished: practical (lifting a heavy object together, terrifying the enemy with a haka, going on strike); educative (showing someone how to do something by getting them to do it with you); recreational (joining together in song, screaming abuse from the grandstand, sitting down together for a nice cup of tea); and ritual (shaking hands or waving goodbye, attending a church service).

Performing an action in concert with someone else is something that normal human infants begin doing (or participating in) at about two months. Infants are born with impressive imitative abilities, based on the mirror neuron systems in pre-frontal cortex (Meltzoff and Moore 1983, Meltzoff 1996, Rizzolatti and Arbib 1998, Arbib 2002, Bråten 2007). However, neonatal imitation extends only to simple body movements such as arm waving and facial expressions, and has a slightly robotic appearance. It takes six to eight weeks, and persistent loving encouragement by the caregiver, for the infant to demonstrate awareness, in imitation sessions, that he and the caregiver are performing the action in question ‘together’. That is, it takes this amount of time for the infant to be able to reliably join in the success display in addition to imitating the action (Stern 1985, pp.37, 100-102; Trevarthen 1979, p. 347). The mutual success display is important. Before he can properly be said to be doing X with the caregiver, the infant needs to know that he’s got it right. The achievement of this more subtle and extensive togetherness (including both the action and the success display) generates tremendous pleasure and excitement in infants. Caregivers get considerable pleasure from it too. Successful imitation is satisfying to the infant from the beginning but mutually acknowledged successful imitation is much more so. Human brains have evolved to be expectant of, and to thrive on, the particular excitement that acting in concert brings. It is the motivation for culture, and for learning, and teaching. Researchers attest to the enthusiasm with which infants embrace acting in concert once they master it. Caregiver and infant soon develop their own culture of recreational concertings.

Presumably, acting in concert was originally a fortuitous product of innate imitative ability in combination with the propinquity inevitable in small-group nomadic life. Presumably also, it was ensconced in the hominid repertoire well before, perhaps three or four million years before, the advent of modern humans 200,000 years ago. Although, strictly, it is just the imitative ability that is biologically given (and the mutuality, the togetherness, is a cultural addition) we can speak, with Dijksterhuis and Kinsbourne, of a basic urge in human beings towards doing things together. Perhaps it is the defining human urge.

3. Teaching by demonstrating

The learning strategy our mirror neurons equip us for is unilateral imitation. But in a group in which doing things in concert is all the rage, imitation soon gets a cultural makeover. The model wants to help the pupil join in and will facilitate the imitation process by actively demonstrating the activity. Demonstrating involves making a performance ostentatious and hence easier to copy. Imitation is thus transformed into a joint activity, a cultural practice. My claim is that this cultural practice, this ‘teaching by example’ or ‘teacher-assisted performing-in-concert’, is the prototype pedagogical-cum-communicative strategy. Other methods of teaching and communicating, and behaviour-management generally, are basically all modifications and/or streamlinings of this procedure.

The pupil’s motivation for getting it right is provided by the innate urge to imitate and by the enhanced version of this ‘imitation pleasure’ associated with concerted activity – that is, imitation mutually recognised as such. Presumably, the learning of an activity depends on the establishment in cerebral cortex and elsewhere in the body of a reliable neural firing program corresponding to that activity. Establishment (and, subsequently, maintenance) of neural firing programs in cortex is achieved by synaptic enhancement, which in turn is a function of the amount of contiguous neural excitation occurring in association with the program in question. There is evidence that the distinctive pleasure and excitement of concerted performance gives rise to just the kind of intense ancillary neural excitation necessary to etch new behaviours in cortex (Iacoboni 2005; see also Bråten 1998).

As demonstrator, the caregiver/teacher’s job is to concentrate, by various means, the pupil’s attention and effort on particular phases, junctures, and modules within the activity, and thus assist him to perform these (perhaps more difficult) parts of the activity in concert with her. Among the attention-focussing tactics are: performing parts of the activity in an ostentatious or slowed-down way, or accompanying an action with gestures such as pointing or exaggerated facial expressions. Another invaluable teaching aid is the strategic use of vocal sounds. Vocal sounds are convenient and efficient in this attention-directing role because they reliably attract the pupil’s attention while leaving the demonstrator’s hands (and the rest of the body) free for demonstrating the action. Pointing and other manual gestures, and tactics like slowing down or exaggerating the focus action, are also effective but they are more likely to disrupt the performance. Vocal marking allows it to proceed naturally. Thus, in normal circumstances, vocals – albeit often still augmented by facial expressions and gestures – become the preferred attention-directing tactic.

The vocals used in teaching are at first mostly repetitions of stock Look! How interesting! noises. However, much mother-infant vocalising in early educative contexts is also onomatopoeic and ‘fits’ their joint activity to that extent (Stern 1985, pp.140-141). There is also a tendency for different activities, and different parts of the same activity, to each acquire their own distinctive vocal marker. Early on in mother-infant sessions, for example, the beginnings and ends of activities are differentially marked (Bruner 1975, p.13). Later, the marking regime isolates separately learnable modules within the activity (Savage-Rumbaugh et al. 1993, Wallman 1992, p.51). Distinct vocal markings may also help differentiate activity components otherwise difficult, at first, for the pupil to distinguish.

As a child masters an activity and becomes successful at concerting his performance with that of the teacher, the role of the vocal changes. From being a means of encouragement and attention-directing, the vocal becomes part of the success display – marking the successful performing together of a particular activity (or phase, juncture, module). As a key part of the success display, it attracts much of the associated excitement. The vocal success marker becomes an integral part of the activity, and its performance in concert with the caregiver comes to be the culmination, and a highlight, of it. As infants master more and more specific actions and details of actions, and become more experienced and proficient in the learning and joint performing of actions generally, the pleasure and excitement associated with vocal marking will ensure that teachers’ demonstrations are more and more comprehensively vocally marked.

The child acquires most of his early repertoire via assisted performings-in-concert with caregivers. Once a new skill and its vocal markers are mastered, subsequent repetitions ensure both skill and markers stay, and stay together, in the repertoire. And once members of a group (such as caregiver and infant) get used to vocally marking their doings-in-concert, and marking them the same each time, the activities themselves will become standardised. Over more and more repetitions of an activity-plus-vocal combination, and/or across an increasing diversity of participants, a standard version will tend to average out.

However, we should for a moment set aside the vocalisings instrumental in demonstrating. We should imagine a purely silent – albeit still ostentatious, theatrical – prototype of ‘teaching by demonstrating’. This is what I am likening to, proposing as a first approximation of, verbal communication. I am suggesting that the basic format of verbal communication is that of demonstrating an action. Imagine: the person speaking to you is attempting to demonstrate something, some action for you to copy.

4. Miming to solicit participation

Even after pupils have become au fait with a given joint activity and ceased being pupils, they still need to be able to efficiently initiate sessions of the activity when required. This means getting the other participant(s) interested. The initiating of sessions of collective activity is a social practice in its own right and, like other useful practices, it tends to standardise. The default gambit is for an individual to begin demonstrating the activity, in an ostentatious, inviting way – in the hope of attracting the others’ attention and participation (Bruner 1975, p.13; Vygotsky 1978, p.60). This ‘invitatory demonstrating’ is a simple adaptation of what a teacher does: demonstrating an action in an ostentatious way, to solicit the pupil’s joining in.

In situations in which the audience is very familiar with an activity, invitatory demonstrating can be streamlined. A token performance is enough to identify the activity. A brief mime will often do. As for the audience’s motivation to participate, such is the power of people’s (certainly children’s) naïve desire to ‘entrain’ with others, as Kinsbourne puts it, that a public display of any distinctive fragment of a familiar activity tends to precipitate collective in-concert performance of it. Other things being equal – if there is no reason not to – people will join in. The most perfunctory of imbibing mimes usually works, for example, to solicit others’ participation in ‘having a cup of tea’. At any rate, instead of commencing a laborious demonstration of the desired shared activity, one can often make do with a greatly abbreviated and edited demonstration – a mime consisting of the performance of distinctive and/or representative fragments of it, including gestures, facial expressions, etc. The would-be initiator requires merely to ‘betoken’ the activity.

One kind of ‘distinctive fragment’ convenient for soliciting a session of a familiar shared activity is the vocal sounds that marked our formative experiences of that activity. The same vocals used for action-marking and success-registering during the teaching/learning of an activity can later be employed – perhaps with supporting mime, facial expression, etc. – to initiate subsequent ‘for real’ sessions of it.

To mark this second kind of use for vocalising, we can move from the term vocal, to the terms verbal, word and speech. Speech is just as effective in this activity-soliciting role as an ostentatious commencing of the activity, or a mime involving illustrative gestures and/or facial expressions, or other dumb show. The same principle operates. Speech is dumb show. Rehearsing the speech associated with an activity is just another way of abbreviating a demonstration of it. One is still betokening the activity. One is still displaying, in an invitatory manner, certain distinctive fragments of the activity – in this case, the success markers used in the original demonstrating-and-imitating of it. And speech is very convenient for this purpose. Vocal/verbal sounds, different ‘words’, are easy to discriminate, remember and reproduce (thanks, presumably, to abilities that have evolved into us) and they broadcast and attract attention well.

If numerous activities have been comprehensively marked out in past learning sessions, then numerous actions and action-modules are subsequently amenable to being ‘called’ in the above way by verbal sounds. More or less any item in the repertoire becomes potentially verbally biddable. Presumably, this second usefulness of vocal markers, for future verbal activity-soliciting purposes, would add to teachers’ motivation to ensure that beginners master all the verbal markers associated with an activity. Not only is the soliciting and initiating of sessions of the activity thereby facilitated, appropriate specific markers can be used, once the activity is in progress, to cue particular phases and junctures, and to generally orchestrate, integrate and expedite proceedings. The calling of suits and bids in card games illustrates this finer level of managerial intervention. But “Let’s have a couple of hands of bridge”, at a coarser level, is a call in just the same sense.

Because of the increased precision, efficiency and convenience it offers, speech thus gradually encroaches into, and usurps aspects of, the soliciting-mime. As Wittgenstein puts it, “The word is taught as a substitute for a facial expression or a gesture” (Wittgenstein 1966, p.2).

Again, as when comparing ‘teaching by demonstrating’ to verbal communication, for an uncontaminated comparison we should look at pre-verbal forms of ‘activity-soliciting’ mime – forms such as mimed tea-drinking, the ‘come here’ gesture, or pointing. This clarifies verbal communication’s relation to soliciting mime. Speech is miming of a sophisticated, factitious kind.

5. Rehearsing an activity

The concept of rehearsal also helps us to understand verbal communication and why we do it. I mean ‘rehearsal’ as in rehearsing a play, ceremony, song or speech, concert, and so on – but with flexibility enough to include other kinds of practice run, anticipatory goings-through of motions, experimental performings, etc. We can start with group rehearsings, ones in which the rehearsing, like the impending for-real performances, involves two or more people acting together. Given this concept, we can think of the prototype pedagogic procedure – teaching by demonstrating, teacher-assisted performing-in-concert – as one extreme of ‘rehearsal’, a rehearsal of the most thorough and laborious kind. Yet its aim is that of any rehearsal: to ready someone for for-real performance.

The use of mime and/or speech to cue performance of already-familiar activities can also plausibly be construed as a preliminary rehearsal. One cost of having a large repertoire of learned skills is that, even after they are securely mastered, many skills still need to be ‘primed’, in an at least minimal way, before they can be put into action. This reflects both the logistics of organising people for action and the storage and arousal requirements of neural firing programs in cortex. In any event, it is usually necessary, even when the proposed activity is very familiar, for the soliciting party to ready the other prospective participants in advance.

When an initiator uses speech to solicit a familiar activity, his verbal ‘miming’ does look a bit like an abbreviated rehearsal, a kind of verbal recapitulation, of the activity. But is it a rehearsal for the audience too? What part do they play in this putative ‘joint rehearsal’? Certainly, they shortly embark with the initiator on a for-real performance of the activity, whatever it is. But what are they doing while the initiator is doing his verbal thing – what is their role in the preliminary ‘verbal rehearsal of the activity’ that he is essaying?

I am supposing that the exigencies of our behavioural repertoires are such that the singling-out for performance of any stored behaviour requires that that behaviour be readied, warmed-up, however briefly, beforehand. Presumably also, when an activity is very familiar, the necessary readying process, the rehearsal that the would-be agent or agents must undertake, can be very brief. Whatever minimum arousal suffices to ‘switch on’ or ‘enable’ the activity will do. And in fact, usually, when impending activity is very familiar, its preliminary rehearsal is such a token affair, and involves somatic adjustments so subtle and rapidly accomplished, that, to an external observer, no active rehearsing is visible at all. The would-be agents merely ‘minimally’ and ‘privately’ or ‘covertly’ rehearse doing X – they ‘imagine’ doing X – prior to doing it. By contrast, the kind of rehearsal that speech (or other mime or gesture) constitutes is, though much abbreviated compared with a full actual demonstration, still ‘public’ or ‘overt’. I argue elsewhere that we learn, and learn from others, how to abbreviate our rehearsings down to the minimal, unobservable-to-the-naked-eye and hence ‘private’, level (Melser 2004, pp.81-94).

Thus, while the speaker is overtly betokening – with words, facial expressions, gestures – an activity he wants us to perform, the audience is simultaneously rehearsing that activity covertly, in concert with these overt cues. While the speaker ‘describes’ and/or ‘exhorts’ a given action, the audience obediently ‘imagines’ doing it. The audience ‘follows what he is saying’. So, although they may appear to be immobile and inactive, the members of the audience are nevertheless doing, and collectively doing, something. They are minimally rehearsing the activity, and doing this minimal rehearsing in concert with the speaker’s more overt rehearsal. What we have here is still a case of speaker and audience rehearsing an activity together.

The concerted rehearsing – the verbal miming by the speaker and minimal enacting by the audience in response – makes everyone ‘aware’ what is being asked of them. The activity is primed, and for-real performance can ensue.

6. Shared make-believe

Children’s pretending games exhibit several of the abilities I have been talking about: abilities to act in an ostentatious way, to mime, to use speech to supplement or replace mime, and the ability to minimally rehearse or ‘imagine’ performing an action. As well as fastidiously sipping non-existent tea from non-existent cups, the participants in make-believe tea-parties will, whilst passing non-existent milk or sugar, make enthusiastic comments about the brew, and about other things. This overt show is underpinned by simultaneous corresponding and supplementary minimal rehearsing. And the sippers are doing this minimal rehearsing together, in concert.

The concept that make-believe does not seem to illustrate so well is ‘rehearsal in preparation for real performance’. The children are not practising for a real tea party. Certainly, if it happens to be space aliens they are playing, they are not rehearsing in the preparing sense. The ‘readying’ and ‘enabling’ of items in the repertoire – tea-party or space alien behaviours – is being done just for its own sake, with no for-real sequel. However, pretending games are still a serious business for children. In them they are practising both their public and their private rehearsal skills: they are rehearsing rehearsing.

One of the most common forms of verbal communication adults engage in, conversation, has clear similarities to children’s make-believe. The main difference is that, although the speech is just as varied and vigorous, the other mimings are much more subdued – with smiles, nodding (sober or delighted), head-shaking and tsk-tsking, and facial expressions the only tokenings showing. However, the concerted minimal rehearsing is much like the children’s.

Certainly, unlike childhood make-believe, much adult conversation has a clear practical purpose. Often, the parties are make-believe X-ing now – imagining being somewhere or looking at something or doing something – in order to ready themselves for actual X-ing in the future. However, just as often, conversation is, like make-believe, recreational, engaged in to conjure actions that will never be performed and are not expected to be, perhaps could never be. There is rehearsal but no for-real sequel. Often, people chat just to build togetherness. On the other hand, who is to say when and what imaginings, even unabashed fantasies, are idle? The readying effects of minimal rehearsals are often unforeseeable, and/or subtle, recondite, and slow to accrue.

The main point of similarity between teaching by demonstrating, soliciting participation in an activity by means of mime, publicly rehearsing an activity, children’s make-believe and verbal communication is the core element of doing-in-concert. In the case of verbal communication – whether used to solicit actions or just pass the time of day – what is being done in concert by speaker and hearer is the minimal rehearsing of the activity in question.

What is perhaps most apparent is verbal communication’s cooperative aspect. With the speaker overtly tokening the activity and the hearer attentive but more or less silent and immobile, and with speaking and listening roles alternating in an orderly manner, the division of labour is clear. But the cooperation is merely instrumental to the ulterior concerted performance – which is, as it were, the main event, where the action is. Although it looks as if speaker and hearer are doing quite different things, au fond they are doing the same thing. Both are, in concert, minimally rehearsing the same activity. Certainly the speaker leads, with the verbal cues, and the hearer follows. But much doing in concert is of this led type. Much dancing is, for example. Children get practice in the kind of slavish obedience to a leader and the leader/follower role alternations required in verbal communication in imitation games such as Simon Says. In fact, Simon Says is another interesting approximation to what goes on in verbal communication sessions. Although the cooperative – role-taking and turn-taking – aspect of verbal communication is real and important, at a more fundamental level verbal communication is an undertaking in which speaker and hearer are not so much playing different parts – and face-to-face, as famously depicted by Saussure – as side-by-side, doing the same thing.

There is even a tendency, albeit only a tendency, for the speaker’s speaking and the hearer’s listening to merge. It sometimes happens that a listener, if particularly gripped by what he is hearing, repeats the speaker’s words, sotto voce. And it nearly always happens that listeners follow speakers with some form of token confirmatory reciprocation of the same speech – if not the same words mouthed inaudibly, then nods or smiles (of ‘comprehension’) as a success display. Listeners also tend to adopt the facial expression and bodily attitude of the speaker. The in-unison ulterior rehearsing generates similar microbehaviour (voluntary and involuntary) in both parties. Condon speculates about a biological basis for our imitative response to speech (Condon 1971, Condon and Sander 1974). Of course, hearers are more or less immobile and impassive while the speaker speaks. If the main event in verbal communication is what I say it is – covert minimal rehearsing done in concert by speaker and hearer – then apparent immobility and impassivity on the hearer’s part is to be expected. As I stressed earlier, minimal rehearsing is usually a very subtle and hard-to-observe kind of performance. It presents the appearance, almost, of going on ‘inside’ the person. One might even fancy it going on inside the head.

In summary, the view of verbal communication suggested by the pretending-game analogy is as follows. The speaker’s sequential telling of parts of the salient activity, by verbally miming them, is like what teachers demonstrating actions do, and what mimes and children playing make-believe do. Although the hearers do not exhibit much overt action-rehearsing in response, they are nevertheless fully engaged, in concert with the speaker, in their more subtle and sophisticated ‘minimal’ rehearsing. The speaker is directly demonstrating, albeit with an extremely abbreviated and stylised demonstration (he is doing it ‘in words’), an activity for the hearers to minimally rehearse with him. The hearers perceive the speaker’s verbal and bodily display as a token performance, a ‘pretend version’, of activity X – and they join in accordingly.

7. Showing people things

I have cast verbal communication as a member of a family of social practices the purpose of which is to organise and implement concerted and cooperative activity – or, more simply, to get people to do things. These management practices work by exploiting people’s native desire to join in with what others are doing. One person demonstrates an activity – performing it in a certain ostentatious and inviting, albeit often very abbreviated, way. The audience naturally falls in and imitates, again often in a very abbreviated way, the activity being demonstrated. Such joint, concerted, rehearsing of an activity may or may not lead – depending on what kind of activity it is and what present circumstances are – to an actual performance of it. At the very least the activity in question is ‘primed for action’ in the brains, and elsewhere in the bodies, of the participants.

This may be a reasonable explanation of how hortative or imperative forms of verbal communication work, and it may thus help clarify how speech can serve its multifarious managerial roles vis-à-vis people’s behaviour. But it doesn’t seem to cast any light on that other main purpose of verbal communication – the acquainting of people with matters of fact: what Austin called ‘constative’ speech – describing, explaining, reporting, and so on. The elementary and ubiquitous (in verbal communication) act of ‘referring’ even – ‘referring to things in the world’ – seems to have been left unexplained by my analogies. But this appearance is temporary. In fact, the ‘soliciting of participation’ format applies just as firmly to descriptive as to hortative speech. If we see perceiving as a learned skill and an action people voluntarily perform, referring and describing can be seen to be not contrasted with hortative speech but a variety of it.

The young child acquires most of his repertoire of abilities in sessions of teacher-assisted doing-in-concert with caregivers. And this includes his perceptual abilities. Much of the child’s early education consists of shared perceptual explorations, with the caregiver, of local objects, people, body-parts and other things. Clap hands, finger (this little piggy) and Daddy are things we rehearse together. So are slowly, blue, in the garden, sunshine and all clean! In sessions of looking, listening, approaching, palpating, manipulating, and otherwise investigating, the caregiver is showing the child ‘things’ and how to perceive and talk about them. She is teaching him investigative and perceptual strategies, tactics and recipes, and permutations and combinations of these. And this teaching, cueing, orchestrating, expediting and shared imagining of perceptual behaviour proceeds in much the same way as the teaching, cueing, orchestrating, expediting and shared imagining of any other behaviour.

Like other joint doings, joint perceivings are taught with the aid of attention-directing tactics such as exaggerated performance. For example, the caregiver might hold an object up and ostentatiously scrutinise it from top to bottom, then pass it over for a repeat performance. And interested noises, exaggerated facial expressions, eye-widenings, ostentatious gazing, and other gestures play as important a role in teaching perceiving as in teaching other activity (Savage-Rumbaugh et al. 1993, pp.122-123). Pointing is one pedagogic tactic unique to perceiving lessons. And, as in the teaching of other doings-together, these ostensive ploys are first supplemented, then largely replaced, by verbal markers. Some of these will be general attention-directors like this, look and here, perhaps done alongside non-verbal vocals expressive of heed. However, each of the different perception-recipes the child learns is eventually allocated its own distinctive verbal marker. Successful concerted performance of a perception recipe can then be ritually consummated with that recipe’s ‘word’. The child ‘learns the names of things’. Such verbal brandings discipline, standardise and conform our perceivings just as they discipline, standardise and conform our efforts in other areas. Inclusion of the appropriate words empowers our perceptual interrogatings and interpretings of the world with correctness and togetherness.

Soliciting sessions of shared perceiving, new or familiar, is what we call ‘showing people things’. In showing someone something, one is getting them to concert their perceiving with one’s own. And, as when initiating joint sessions of any other activity, one requisitions verbal markers from the teaching context for use as mimes. Here too, verbal cueing has come to be the preferred means of instigating and enabling shared performance.

Supplementing the specific verbal cues are other, vaguer, attention-directing ploys also borrowed from the teaching context – handing over an object, pointing, conspicuously staring, making non-verbal vocal sounds expressive of interest. Sometimes, a subtle swivelling of one’s gaze and raising of the eyebrows is all that is necessary to direct the other’s attention. Vygotsky describes pointing as an abbreviation of the act of ‘handing over an object’ or, more precisely, an abbreviation of the act of ‘reaching out and grasping something in order to investigate it and/or pass it to someone’ (Vygotsky 1978, p.56). But usually the speech, just ‘saying the thing’s name’, does most of the work. You say, with or without pointing, Look, there’s a walrus, albeit you’d better be looking in the right direction. Generally, pointings, facial expressions, tones of voice, and other ancillary mimings play a supporting role. But the format is the same as with getting other people to do anything else with you. You do an ostentatious but abbreviated demonstration, a distinctive fragment or mime, of what you want of them. In this case it is the attending-to, and perceiving and investigating of, something.

As far as everyday adult perceptual activity is concerned, speech is used far more for soliciting the sharing of imagined or make-believe perceivings than for soliciting the sharing of actual perceivings. Typically, and almost always in conversation, the thing to which the (speaker, by means of the) speech is directing attention is absent, and speaker and hearer can but minimally rehearse the relevant perceptual activity together. However, in either case, whether the thing in question is absent (and the perceiving is perforce merely imagined) or whether it is there to be perceived here and now, we call this use of speech referring. Referring is the soliciting of joint perceivings, or joint imagining of perceivings, using erstwhile verbal markers for ‘distinctive fragments’ to mime with. Again, reprising the verbal marker (‘saying the thing’s name’) amounts to a kind of drastically abbreviated demonstrating of an activity – in this case the activity of perceiving such-and-such.

One might call describing ‘serial referring’: the speaker moots a certain sustained and complex course of perceivings involving multiple referrings to things, aspects and qualities of things, interrelations between things, and so on. Often a certain kind of heuristic vehicle is assumed as a context for the description. That is, the participants in a describing session imagine themselves performing some sort of investigative procedure – walking round an object and looking at all its aspects is one very simple ‘heuristic vehicle’ – in the course of which all the various perceptions of the subject matter are performed. Imagining the action vehicle links all the imagined perceivings together in a systematic and disciplined way, making the description both more plausible (than a bare list, say, of things to imagine) and more memorable. Some descriptions are so closely tied to an action or activity (in the course of which one does, and ticks off, the relevant perceivings) that the speaker might just as well have employed the hortative, ‘we should do this’ format. When giving street directions, for example, one might describe the lie of the streets – what building is to the north of what other, and so on – in a purely descriptive, objective way, as if reporting what the map says. Or one could tell the enquirer what to do, which way to go and what to see before what.

Although they are often used recreationally, the primary use of referring and describing is to prepare for perceivings that will occur, or are likely to occur, in the course of, and as integral parts of, activities speaker and/or hearer will engage in in the future. Referring and describing are primarily means of rehearsing – and thus readying, or ‘priming for action’ – the perceptual components of future activity.

8. Empathy

Comparison of verbal communication with empathy helps illuminate the motives we have for communicating. Empathy is hardly a cultural practice, but neither is it simply a biological phenomenon. It is an example, rather, of the cultural management of a biological urge, in this case the urge to imitate. Philosopher Susan Hurley is talking about empathy when she writes:

While normal adults are usually able to inhibit overt imitation selectively (and it is adaptive to do so), overt imitation can be regarded as just the disinhibited tip of the iceberg of continual covert, inhibited imitation. Such covert imitation may reflect a basic motivation of human beings, adults as well as children, to interact synchronously or entrain with one another, which is a mechanism of affiliation as well as of social perception and learning (Hurley 2006, p.211).

Empathy is our primary recourse when, as spectators, we are trying to understand what another person is doing – including what he is thinking or feeling (that is, minimally doing). We attempt to imagine, ourselves minimally rehearse, what it is the other person is thinking, feeling or otherwise doing. To succeed in this is to ‘empathise’ and in this sense understand. Perhaps the person is aware of being watched and – by adopting a certain posture, making certain gestures, or moving so that we can get a better look at an overt action he is performing – helps us to empathise. And his eyes may turn to meet ours, making the empathising mutual.

Or we can simply ask him what he is doing. If he chooses to tell us, we don’t need to empathise, we might say. But it is rather that, when he tells us – as when someone is aware of being watched and contributes the slightly exaggerated posture or the expressive gesture, ‘body language’, or deliberately exposes his action – we are being helped to empathise. Verbal communication is assisted empathy too.

The functions Hurley allocates to empathy are verbal communication’s functions: verbal communication is an instrument of learning and of affiliation (when we converse recreationally to confirm our togetherness), and it is a means of ‘social perception’. The listener’s role in verbal communication is essentially that of trying to understand, by trying to empathise, what another person is (overtly and/or minimally) doing. The format is much the same in each case. The main difference is that, in verbal communication, the person whose behaviour is being interrogated or ‘read’, the speaker, is doing his best, employing the most efficient means he has available (judicious verbal miming), to help the empathy along. This is reminiscent of the scenario at the very beginning, in which the caregiver actively helps the infant imitate. At any rate, it may be that to ‘understand what someone is saying’ is just to empathise, or to be able to empathise, or, perhaps, to be able to confidently and accurately empathise.

Lastly, once empathy becomes mutual it may or may not, as with successful spoken communication, result in overt concerted and/or cooperative activity. Empathy often precipitates conversation, say, or physical helping – or some act of solicitude.

9. Object-displaying as a means of communicating

I have argued, by analogies, that verbal communication involves the speaker in performing a kind of mime, using distinctive fragments of a salient activity – specifically, the verbals grafted on to that activity by teachers early on – in order to, albeit in a very stylised and streamlined way, demonstrate or re-present, and perhaps thereby solicit a session of, the activity. As shorthand for this, we can say that, by his speaking (in that way), the speaker ‘evokes’ the activity in question. There is a range of practices that seem to have a very similar function to that of speech, which all involve the displaying of objects of some kind. For example: marking a trail in the forest by tacking blazes on trees; putting up road signs and signs over shop windows; signalling using railway signals, smoke signals, semaphore or Morse code; displaying a symbol, icon or logo (the cross, swastika or the Nike tick) on one’s person or on a building; writing; arranging minims, quavers and crotchets on a musical staff; moving a matchbox round on a table to demonstrate the last movements of the Admiral Graf Spee; making and showing pictures and diagrams. How like are these various communicative or, at least, communication-related practices to the one I have been calling ‘verbal communication’ and ‘speech’? Do these various object-displayings ‘evoke’ activities and activity scenarios in the same way speech does?

Assuming that it is some kind of activity-evoking that is going on, we can ask first, what is doing the activity-evoking? Is it the object – the tin disc being used as a blaze, the road sign, the marks on paper, the matchbox – that is re-presenting something? We tend to speak and think as if it is. We say that the blaze shows the way to go, the red light means we stop, the cross signifies Christianity, the matchbox represents the Graf Spee. However, I would say this way of speaking is figurative (part-for-whole synecdoche, in fact) and that it is not the object that is doing the activity-evoking. Objects cannot literally do things. That’s the thing about objects. So, what is doing the business? My answer would be, ‘the person’. Whoever is displaying the object – and this might in the case of road signs, railway signals and commercial logos be as vague an agent as ‘the appropriate authorities’ – is doing the activity-evoking. The person is doing it by displaying the object. Speech involves attracting someone’s attention and, while you’ve got it, uttering carefully chosen sounds accompanied by gestures and facial expressions. Hardware-displaying involves acts such as attaching discs or ribbons to trees, erecting a sign in a certain place, raising and lowering a blanket over a smoky fire, wearing a necklace or arm band, making marks on paper…

So, what is the role of the object or graphic? The very presence of a communication-related object or graphic is an indication that somebody wants to display something for activity-evoking purposes. Somebody must have put the thing there with that aim. The object’s being there for us to see either still constitutes the act of displaying or it is an obvious relic of it. The genre of the object-displaying is also evident. It is track-marking, or traffic-controlling, long-distance signalling, letter-writing, etc. The nature of the displayed object (a brightly coloured plastic disc, a Stop sign, a particular pattern of smoke puffs, particular written words) serves to further-specify the displaying action – to help, or to enable, potential viewers to identify just what kind of object-displaying action is being performed. The action is ‘putting the disc on that tree rather than this’ or ‘displaying a Stop sign, not a Give Way sign’, ‘liberating three quick smoke-puffs, not two’, ‘writing a thank-you note’. And the game is that, within each management-genre, the various different object-displaying actions (as differentiated by what object or graphic is displayed) will each tend to evoke a different action or activity.

Of course, object-displaying is done for its practical advantages too. It is also part of the object’s role, if it is an object or sign or graphic, to be durable and thus be visible in that place, to whomsoever passes, for a long time. And smoke-puffs are chosen as display-objects because, although they are not durable, they are visible from afar.

Although we have yet to establish both what kind of ‘activity-evoking’ object-displaying accomplishes, if any, and what kind of thing is evoked, we have established that object-displaying is similar to speaking in that what does the activity-evoking is a person – by speaking in the one case and by displaying an object in the other. I mention this because not only do we often speak and think as if it is (in the case of object-displaying) the object displayed that does the activity-evoking, we tend to liken the two ways of communicating, speech and object-displaying, too closely. We compound the mistake about object-displaying by inferring, by analogy, that, in the case of speaking also, it is (if not objects displayed then) words uttered, produced, bruited, that have the evocative effect. I have suggested that, re. object-displaying, the mistake is that of taking a colloquial synecdoche too seriously. In connection with speech the mistake is twofold. First, a similar synecdoche is taken literally. The second mistake is thinking of a word as a kind of object. Certainly, the written word, as a graphic, is a kind of object, but I included writing as a kind of object-displaying and we are talking here about speaking. A word is, like the ‘go away’ gesture, say, a kind of action – actually, a mime-like kind of action. Although writing does, speaking does not involve any production or use of hardware, even in the broadest sense.

Before we get on to deciding whether it is ‘activity-evoking’ that object-displaying accomplishes, and if so what kind and how, we can take a guess at what general kind of activity it might be that object-displaying evokes (if it does evoke). This is a matter, like many addressed in this paper, that really requires book-length treatment. I’m going to make a suggestion anyway. What communicative acts of the object-displaying kind evoke is acts of speaking. Thus, the authorities’ displaying of the Stop sign there does not evoke in the motorist, at least not immediately, a minimal rehearsal of stopping there, behind the white line. Stopping is a primary, or ground-level-practical activity, such as speech would evoke. Rather, the displaying of the Stop sign solicits from the motorist a minimal rehearsal of ‘seeing an appropriate authority figure there telling him to stop’. Or perhaps, it evokes simply ‘his being told to stop’. The cross worn by a person or a building is proxy for a whole litany of descriptions and exhortations. Written words are similarly evocative of in-context acts of saying those words or listening to them being spoken. The crotchet perched by the composer on the staff does not re-present the note itself (the sound) nor even the playing of that note: what it re-presents, what it is, is the composer’s verbal instruction to the reader to play that note.

The speech that object-displaying evokes may well be imagined by the viewer as being performed in the context of someone’s showing him something. The hikers seeing the blaze over there rehearse being not only told but shown the way to go. The blaze is imagined as pointing in a certain direction. A drawing or diagram evokes a different kind of pedagogic scenario – a kind of guided tour of some scene, region, operation, object, or whatever, with accompanying verbals.

Naturally, there are dubious cases and exceptions. One of them concerns Morse code and semaphore. These seem to be at not one but two removes from any primary activity. Use of semaphore or Morse is meant to evoke written words. The written words then in turn evoke speech, which finally evokes whatever the primary activity is – sending reinforcements, perhaps.

My contention is that communicative object-displaying evokes acts of speaking (sometimes embedded in other forms of demonstrating) which in turn evoke activities to imagine engaging in. If this is reasonable, we can go on to ask what sort of ‘evoking’ is involved here. Is it much the same as with speech, where the evoking is essentially a form of miming, of re-enacting distinctive fragments of an activity in such a way that one’s audience is encouraged to imagine engaging in the activity with you? I suspect it is much the same. And if it is, if object-displaying is essentially the miming of speech, then we have to ask how the necessary preliminary ‘marking-out’ could have been achieved. How could it be that object-displaying is ‘added in’ or ‘grafted on’ to acts of speaking, when those acts of speaking were being learned – in the same way speech is added in or grafted on, by caregivers or teachers, to primary activities the child is learning how to perform? How could object-displaying become a ‘distinctive fragment’ of speaking, such as could be later used for miming it?

In some cases, it is not so difficult to see how. When one is ‘learning Christianity’, for example – learning to talk like a Christian, to say the things Christians say – the cross is constantly there, in the Church and outside it, on the periphery. Seeing the cross does get ‘grafted on’ to one’s experience of Christianity, and subsequent displaying of it does ‘mime’ the whole Christian argument. And consider learning to read. In the Middle Ages, groups of monks taught their fellows to read by having them memorise long chants and then presenting the corresponding written text to them, line by line, while everyone recited the chants together. Variations on the monks’ method work well with modern children. Here the graphic is being ‘added in’ to the speech.

I am proposing that our mastering any object-displaying-type communicative genre presupposes prior repeated associating of specific object-displayings and specific verbal evokings – to the point where the object-displaying becomes part and parcel with performing those speech acts. After such training, witnessing object-displayings will reliably elicit from the witness, as other kinds of mime would, some minimal or not-so-minimal rehearsal of the corresponding speech. Imagining the speech in turn evokes the activity the speech itself mimes.

10. The many uses of verbal communication

The account I have sketched implies that speech’s only function is the readying of groups for concerted activity – albeit the mooted activity is often postponed and sometimes indefinitely postponed. But clearly, speech has a multitude of other functions. It can be employed to ready individuals for solo actions, to ready several individuals for contributing variously to complex cooperative undertakings, to ready hearers for activity on the speaker’s part (as in promising, declining, threatening), to ready hearers for refraining from acting in a certain way (warning or dissuading), to ready hearers for performing certain perceptual behaviour immediately, to ready them for attempting certain perceptual behaviour in the future, to ready them for never performing it, and so on, near enough ad infinitum.

What is there of the ‘mime-like evocation of shared activity’, that I am saying verbal communication is based on, in, say, saying goodbye? Threatening someone or welcoming them, yes. And thanking. But saying goodbye, and waving? The truth is that speech’s use to solicit and implement doings in concert is not its only but only its primary use. Helping us act together is what speech cut its teeth on. This is the original, powerful paradigm. But this paradigm format, which I have spent this paper likening to other things, gives rise to multiple variations and/or adaptations for particular purposes in particular contexts. In the beginning – in hominid cultures (presumably) and in the nursery (evidently) – there is just one kind of speech act. But this one kind is later forged by circumstance and invention into many. Elsewhere I have speculated, on the basis of developmental psychologists’ findings, how it is that perhaps the most important adaptation, the ability to exhort solo rather than just collective action, was accomplished (Melser 2004, Chapter Four). It is not for this paper to speculate about even the two or three main branches of the developmental tree that has issued the innumerable act-kinds speech now performs. But I assume there is a job for a Darwin or at least a Linnaeus of pragmatics out there.

In the nursery speech has, and possibly out on the savannah it had, originally a pedagogic role. Teachers use it in a ‘marking’ capacity in the course of demonstration-and-imitation sessions, to help direct their pupils’ attention to important parts of activities being learned. Once the activities have been mastered, erstwhile vocal markers can be re-used, in a kind of miming, for the purpose of triggering performances of those activities or parts of them. A little further down the developmental track, speech is used to merely ready or ‘prime’ activities that are to be performed later, or in the indefinite future, or not at all. In what amounts to a kind of make-believe game, speech cues and guides speaker and hearer’s in-concert minimal rehearsing of an activity. And in mature verbal communication, this token ‘readying for action’ is supplemented by specific imperatives as to how individual parties are to act vis-à-vis the prospective activity.

Verbal communication can be said to establish the ‘shared understandings’ on which our concerted and cooperative activity, our culture, is ‘based’. Communication is our readyings of one another and ourselves to act together, to act ‘with one accord’. If we wanted to embrace the functions of speech in one formula we would have to say something like: the function of speech is to assist with the teaching, readying, initiating, expediting and general concerting and coordinating of cultural activity. This way of putting it is consistent at least in spirit with the conclusions of the ‘speech as cultural management’ theorists I listed near the beginning. Verbal communication is the cultural ancillary, or meta-cultural cultural activity, par excellence. It is our main managerial recourse vis-à-vis the things we do together – the concerted and cooperative undertakings culture consists of – and it is itself something that we do together, ‘in concert’.

Verbal communication is a technique for synchronising and integrating people’s actions – including their perceivings and imaginings. It is often used for conforming the actions of a follower to those of a leader, but anyone can play leader. And verbal communication is not the only means of synchronising actions. The leader’s simply making what he or she is doing conspicuous may be enough to get the audience to follow suit. Or other forms of mime or body language less factitious than speech might do. Nor is it always necessary that any special means be employed for the synchronising of actions. Hurley, Dijksterhuis and Kinsbourne claim that people instinctively act in concert with one another. See also Condon 1971, Condon and Sander 1974. People together in a situation tend to perceive and imagine the same things – often greatly reducing the need for speech. Another qualification: verbal communication is often convened not for the simple synchronising or conforming of actions but for concerting them in other ways – allocating roles in cooperations, for example. However, the use of speech to invite the hearer to join forces, act in concert, with the speaker is prototypical and other uses of speech are derivative of it.

11. Naïve figurative conceptions of verbal communication

We quite often need in everyday situations to talk about talk and, as I mentioned near the beginning, talking about talk can be difficult. As when we are talking about thinking at the everyday level, and for similar reasons, we tend to use figures of speech. And there are plenty available.

Verbal communication is, strictly and literally, an interpersonal activity and, in the case of the paradigm form, speech, an activity involving no use of hardware of any kind. When we come to talk about any class of actions or activities in a general ‘describing a phenomenon’ way, we tend to nominalise the relevant action verbs. Actions seem ephemeral and somehow ‘subjective’, difficult to hold at arm’s length, compared to objects. The nominalisations allow us the impression that the activity under scrutiny, in this case verbal communication, is not so much an action that we and/or others perform as an impersonal phenomenon, something independent of us and the things we do, something we can pin down and look at in a detached, objective way. Thus we come to think of verbal communication in terms of objects or quasi-objects out there in the world called ‘words’, ‘meanings’, ‘expressions’, ‘sentences’, ‘language’ and ‘languages’. Language – originally, perhaps, ‘activity involving use of the tongue’ – becomes a phenomenon in the world.

Metaphors of production and/or use, and the transitive verb say, then make a fait accompli of the reifications mooted by these nouns of convenience. The metaphors further-specify the putative linguistic objects as objects ‘produced’ and/or ‘used’. Words are ‘said’, ‘employed’, ‘set down on paper’, expressions are ‘coined’. Our use of words and so on for communication is conceived via metaphors of words’ ‘possessing’, ‘containing’, ‘carrying’ or ‘expressing’ ‘meanings’. Communication is pictured as the conveying of words, along with their contents (meaning, ideas, thoughts and feelings, information) through the air, as it were, from one person to another. Interpenetrating this metaphorical idiom for talking about verbal communication is the equally figurative everyday ‘mind’ idiom. Mind idiom reifies the personal activity, thinking (a sophisticated variety of my ‘minimal rehearsing’), and related actions and activities, into a host of impersonal ‘mental’ entities, states, events and processes (Melser 2004). Thus words are pictured as conveying thoughts from one mind to another, evoking concepts and images in people’s minds. Thoughts and feelings are put into words, conveyed in sentences, and so on. These essentially fanciful conceptions seem to have found their way to, and taken up residence in, ivory towers. Nominalisations, metaphors and synecdoches from the vernacular now sleep in the inner recesses of academic theory.

Metaphors and other figures of speech are a tremendous boon. The colloquial ‘language’ and ‘mind’ metaphors are a big help in construing thought and communication in the most convenient way, and highlighting the aspects of thinking and communicating that are most salient for everyday interpersonal purposes. These are mainly purposes of communication breakdown avoidance and/or repair. But the metaphors are no good at what they weren’t designed for. They were never designed to provide what the linguistic theorist needs – a perspicuous theoretical overview of verbal communication. If read as theory, the colloquial metaphors and the grammar scaffolding them can only mystify.

One of the problems with the metaphor-based view of verbal communication is that it inevitably raises the question: what is the nature of the representing relation between a word and its communicable content, its meaning? How do sounds in the air or marks on paper mean things – whether things out there in the world or things inside people’s heads? These questions are often assumed to be what formal linguistics and the philosophy of language are all about. At least, an account of how words mean things is thought to be essential for understanding verbal communication. However, because ‘words’ are essentially convenient fictions, and because meaning and representing are things that, literally speaking, only people can do, the questions are, if they are comprehensible at all, unanswerable.

On the concerted-activity-based approach I have outlined, these particular imponderables do not arise. On this approach, the essence of verbal communication is speaker and hearer’s imagining something in concert. Speaker and hearer are engaged in a joint project – a sort of pretending game – involving a special kind of miming on the speaker’s part and the rehearsing, by the two of them, together, in a drastically abbreviated and ‘hypothetical’ way, of an activity that they could in principle, if circumstances were different, be rehearsing together in full and in fact. ‘Communication’ and ‘understanding’ are achieved when it is the same activity that is being (minimally and at lightning speed) rehearsed in concert by speaker and hearer – and this event is signalled, by some conventional display, as having occurred. Proof that the parties understand each other, and that both are entertaining the same activity, is always in principle available. They can check whether the activity is the same for both by carrying out a full actual rehearsal of the activity they are presently rehearsing only in its barest essentials. What counts as ‘the same’ here is what the parties, and any external referees the parties care to rope in, say is the same. In the final analysis, consensus (agreement, concert) rules. And here is the Wittgensteinian and ethnomethodological terminus.

On this account of verbal communication the only candidate for a ‘meaning’ or ‘representing’ relation would have to be the relation between the speaking, or verbal miming, that the speaker is doing and the activity that is being evoked (‘called forth’) thereby. But there is, essentially, no such relation. Apart from the fact that the one is a much abbreviated version of the other, there is no ‘semantic link’ between speech and the activity evoked by it. Verbal communication is a pretending game in which the speech is activity X. Speech and other overt betokenings (facial expression, tone, and so on) are just convenient ways to present or re-present, and thus solicit or moot, activities. Far from aurally receiving a series of little semantic parcels (on their way into his brain for unwrapping and decoding), the hearer is simply imagining doing something, following the speaker’s lead and together with him.

12. Could there be a science of language?

Verbal communication is something like charades: the speaker is putting on a theatrical-type performance, making as if he is doing something, and the audience has to guess what it is, what it is he is ‘doing’, and imagine doing it too, along with him. Of course, the fact that the speaker is providing a whole lot of verbal clues – in addition to the facial expressions, tones of voice, postures and gestures, and other demonstrative gambits – makes it easy for the audience. There is usually another thing that makes it easy too, the physical and social context. Although sometimes the speaker will attract your attention and start doing his thing pretty much out of the blue, this kind of charades is normally tied to whatever is the prevailing social activity at the time. You can infer what the speaker is imagining, incipiently doing, because the present context is one you are familiar with and in this kind of context, the next move on the agenda is usually such-and-such – and it is probably this the speaker is entertaining doing. Thirdly, it is easy because we have all been playing this game since late infancy and we are very good at it. At any rate, the ‘guessing’ is usually no trouble at all. You get it right right away. And the speaker smiles and nods to let you know you’ve got it right.

The fact that it is so effortless so much of the time means we have no need to attend to what it is we are doing, to understand what kind of skill is involved, in the ‘guessing’. When communication breakdowns do happen, the colloquial vocabulary is immediately on the scene and, while the metaphors do their business, we are prevented from getting a clear view. I have suggested in this paper that the main skill required for verbal communication is empathy – a sophisticated, inhibited application of what is probably the basic and distinctive human urge, to participate with others in concerted activity. Scientific study of imitation and empathy and their relevance to verbal communication is already under way (Dautenhahn and Nehaniv 2002, Hurley and Chater 2005, Bråten 2007). However, it looks as if there might be logical and methodological problems inherent in such study.

I have elsewhere argued the general claim that people’s actions are not suitable for scientific scrutiny on the grounds that empathy is required for the observation and identification of actions and this empathising is incompatible with scientific objectivity (Melser 2004, Chapter 11). Even if empathy does not itself qualify as an action – it could be argued that it is a biological function not significantly different from neonatal imitation – the concept of action is required to define empathy. Empathising is imagining performing, whilst observing, an action being performed by someone else. The restriction on particular actions applies also to the concept of action in general. In order to understand what an action is, what it is to ‘do’ something, one needs to empathise, to activate one’s status as a personal agent – and this is something the objective scientist is not allowed to do. His job is to observe things and events in an entirely detached and impersonal way. Being unable to objectively define one’s subject matter should be a significant setback for the would-be scientist of empathy.

However, even if the proscription on scientists’ empathising is ignored (as it seems to be in the social sciences generally) there is a more acute problem, or a more acute version of the same problem, inherent in the study of verbal communication. At least, there is an acute problem if my ‘shared pretending’ analysis has anything going for it. A pretence exists only to the extent that the parties to it maintain it. Consider two standard means of getting someone on the other side of the paddock to come and stand beside you: making the ‘come here’ gesture, or bawling Come here! On my theory, both the arm movement and the speech are forms of mime. The pretence, to which both speaker and hearer are parties, is that the arm movement or the speech is the action, ‘the hearer’s going to stand beside the speaker.’ Within this pretending game, what the speaker is therefore doing is ‘demonstrating’ the hearer’s approaching him, as a means of soliciting it. Hopefully, the hearer takes the hint and approaches.

For the would-be scientific observer of the transaction there is a problem though. If he stays outside the game and tries to observe it objectively he sees the arm movement and the speech qua somatic event and/or sound, but the speaker’s ‘demonstration of the hearer’s approach’ will not appear to view. If one is not party to a pretence, it ceases to ‘exist’. And the hearer’s subsequent approach will be inexplicable. On the other hand, if he throws objectivity to the four winds and participates in the pretence, the arm movement and the speech qua physical events will effectively disappear off his radar. They will be replaced by the speaker’s pretended ‘demonstrating the approach of the hearer’. Either way, only one aspect of the transaction will be observable at any one time.

It is only when we alternate in and out of the game – which is, incidentally, something we are easily able to do, and often do do (it is often necessary) in everyday life – that we seem to see the speech or gestures and the activities they conjure juxtaposed. Our everyday concepts of ‘word’ and ‘language’ presuppose just this sort of rapid alternation – seeing speech as a physical phenomenon, then responding to it as one would in an actual communication session. At the everyday level we grant ‘words’ both a physical status and a pretence-conjuring efficacy: we can easily accommodate such stance-alternation. But who is going to allow ‘rapidly alternating between objectively observing something and playing a pretending-game with respect to it’ as a scientific procedure?

 — Derek Melser

July 2008, Masterton


