Vygotsky on Thinking



[Chapter 6 of PhD thesis Incipient Action (Derek Melser, Massey University, NZ, 2001) revised November 2005.]

The Russian psychologist Vygotsky’s theory of thinking is, like Piaget’s, an ‘ontogenetic’ or developmental one. According to developmental approaches in education and child psychology, such subject matters as perception, speech, art, cooperation, emotional behaviour and thinking are best viewed as actions or activities of the person, not natural functions of the human organism. Abilities to perform or engage in the activities in question are ‘acquired’ abilities. That is, they are not innately given, but learned post-natally, as a result of teaching by others and/or trial-and-error and practice. The underlying explanatory model is not of an innate biological process but rather of someone learning how to do something. The developmental psychologist’s job is to identify the chief skill components in a given complex ability and to specify in what stages, and via what training and experience, these component skills are mastered.

The nature/nurture controversy is still alive, and there are ongoing disputes about certain items in the human repertoire, as to whether a biological or a developmental approach is more appropriate for studying them. The most disputed human abilities include the ones we are interested in, namely, speaking and thinking. The biological approach, represented by cognitive science, assumes that abilities to speak and think are largely predetermined by biologically-evolved mechanisms. Chomsky’s and Fodor’s suggestions that ‘modules’ for Language and Cognition are hard-wired into the human brain constitute the background assumption for many theorists.

The main opponents of the ‘innate linguistic and cognitive mechanisms’ school are those, like Vygotsky and the social-constructivists, who stress the socially-taught and culturally-determined nature of speaking and thinking skills. However, it is probably true that all theorists, whether they align primarily with the biological or with the developmental camp, concede that both kinds of influence are important. For example,

...the more basic abilities, the linguistic ones for instance, must be, to an important extent, genetically determined; the construction of abilities from within, or the internalisation of culturally constituted abilities, can only take place on some well-developed innate foundation (Sperber 1986, p.1308).

The ‘sociogenesis’ of thinking

Vygotsky believes that a plausible developmental account of thinking can be given – an account which identifies thinking’s component skills and shows how they might be, or are normally, taught or otherwise acquired. Such a developmental account, if it had no large holes in it, would be good evidence that thinking is in fact a species of learned action and not an impersonal process. Conversely, if thinking is a learned action, then some developmental story about thinking must be true.

Vygotsky’s idea is that the component skills in thinking include both practical abilities and communicative abilities. The infant is born with both elementary practical abilities (motor and sensory-motor coordinations such as hand-eye coordinations, ability to visually track moving objects, balancing skills, etc.) and elementary communicative abilities (gaze meeting and following, imitative skills, vocal and other expressive gestures directed at the other, etc.). Whereas Piaget concentrated on the developing solo practical skills as the basis for thinking, Vygotsky emphasised the communicative, social-interactive side – and he emphasised practical and communicative activity in combination:

...the most significant moment in the course of intellectual development, which gives birth to the purely human forms of practical and abstract intelligence, occurs when speech and practical activity, two previously completely independent lines of development, converge. Although children’s use of tools [and practical activity generally] during their preverbal period is comparable to that of apes, as soon as speech and the use of signs are incorporated into any activity, the activity becomes transformed and organised on entirely new lines (Vygotsky 1978, p.24).

According to Vygotsky, the combining of the budding practical and communicative abilities produces a uniquely human type of activity, namely, ‘semiotically-mediated’, or ‘speech-regulated’ practical activity. Although both semiotically-mediated and speech-regulated are standard (and synonymous) terms in the current Vygotsky literature, and although semiotically has the benefit of including communicative means such as signs and gestures as well as speech, I shall use the expression speech-mediated in this connection.

What these various terms all relate to is the fact that speech is a means by which actions can be socially controlled. This enables coordinated and cooperative activity. Actions of one party may be incited, cued, controlled while in progress, or terminated, by the speech of another. Vygotsky thinks speech-mediated cooperative activity is distinctive of human beings. As well, actions which a person normally performs while alone – and which are consequently not social or cooperative in a direct way – are nevertheless typically learned in a social, teaching context. And this educative context is cooperative and speech-mediated in the full sense. Speech is inextricable both from practical, cooperative activity and from any education the child is getting. According to Vygotsky, by learning to participate in cooperative activity, the infant is mastering the new, distinctively human, form of life. He or she leaves the animal kingdom, as it were, to join the human race.

The new form of behaviour, ‘speech-mediated practical activity’, can be seen to enter the infant’s repertoire in either of two ways. One way is by attaching speech to some pre-existing, pre-social, practical ability of the infant’s. Speech then becomes a kind of handle by means of which the practical ability can be socially controlled. For example, an infant might already be adept at feeding itself, and subsequently learn how to ask for and to receive food items, how to respond to such requests from others, and to eat or desist from eating in response to verbal cues, and so on. Another way in which ‘speech-mediated practical actions’ can enter the repertoire is by the infant’s or child’s being taught them ‘whole’. Here the new skill is not being built on any identifiable pre-existing basis. As examples of these more purely culturally-determined activities, we can look at games in which speech acts play an integral role. There are plenty of these. Or the child learns how to buy and sell things, or make and keep promises, etc.

For Vygotsky, the ability to think is the ability to rehearse speech-mediated practical activity ‘internally’. As Bakhurst puts it, Vygotsky “...represents consciousness as a developmental achievement precipitated by the internalisation of communicative practices, broadly understood” (Bakhurst 1996, p.212). In Vygotsky’s terms, the ability to think is the culmination of the socialisation process. The socialisation process begins with the infant being inducted by various means into speech-mediated practical activity, and is completed when what is learned is ‘internalised’. Internalisation is a two-stage process. First, the child becomes able to perform an action (that is, some contribution to a speech-mediated practical undertaking) while solitary, and independently of any immediate social influence. Second, he becomes able to perform the action entirely internally, or ‘in his mind’. I describe this second phase of internalisation in terms of the ‘in the mind’ metaphor because this is how, for the most part, Vygotsky describes it. Vygotsky’s view of society’s role in the development of the individual’s ability to think is admirably summarised by his colleague, Leontyev:

...the higher specifically human psychological processes can only emerge in the interaction between men, that is, they can only be interpsychological, and only later are they performed by the individual independently, some of them losing their initial external form, becoming intrapsychological processes... Consciousness is not given initially and neither is it generated by nature: it is generated by society, it is produced (Leontyev, quoted in Lektorsky 1984, p.145).

The first combining of sociogenesis and internalisation as an explanation of mental phenomena can be traced back to the French psychologist Janet in the early 1920s (see Van der Veer & Valsiner 1988). However, Vygotsky’s idea of thinking as internally-performed communicating-for-practical-purposes is much more specific than Janet’s. At the time Vygotsky was writing, and even slightly earlier, ideas similar to his were being mooted in American social psychology. The most sophisticated and detailed American account is that of de Laguna 1963 (first published 1927), for whom the primary function of speech is the coordination of cooperative activity. She defines thinking in very much the same terms as Vygotsky:

The higher mental activities — conception and purpose, memory and imagination, belief and thought — so far as these are distinctively human, are found to be closely dependent on speech. They are fundamentally social in origin, being due indirectly to the development of conversation, which, it is argued, has the primitive function of preparing for concerted group action, much as distance-perception prepares the immediate response of the individual. Conversation is shown to have a characteristic structure, adapted to its function, and it is this structure which makes possible the organised activity of thought, in which it is reflected (de Laguna 1963, pp.xi-xii).

The form of conversation from which thought springs is the discussion, which has for its end agreement among the participants regarding some specific conditions of common action... Thinking is the internalisation of this form of conversation and its independent practising by the individual. This is originally and primarily a rehearsal in direct preparation for his active participation in the social enterprise of discussion. It serves also... as a preparation for his own individual primary action (ibid. pp.352-3).

Mead also insists on the social, communicative, origins of ‘mind’: “The internalisation in our experience of the external conversation of gestures which we carry on with other individuals in the social process is the essence of thinking...” (Mead 1962, p.47) (first published 1934). Mead

...accounts for the existence of minds in terms of communication and social experience; and by regarding minds as phenomena which have arisen and developed out of the process of communication and of social experience generally — phenomena which therefore presuppose that process, rather than being presupposed by it... (ibid. p.50).

Vygotsky is perhaps the best exemplar of the sociogenetic approach to thinking. His work was marginally earlier than that of his American contemporaries, and somewhat superior to it in range and detail. It is also currently better known and more written about.

Some loose ends tidied

Vygotsky died (in 1934) at the age of thirty-eight, leaving several unresolved ambiguities and contradictions in his account of thinking. Before I look more closely at his account – at what is involved in the mastering and internalising of speech-mediated social action – I want to point out some problem areas I will be ignoring for lack of space.

(1) Vygotsky says that human action differs from animal behaviour by virtue of its being speech-mediated. If ‘human action’ is the end product of the ‘speechifying’ process, then we cannot strictly speaking use the term action for whatever it is that the infant ‘does’ before it starts producing proper human actions. Nor can we properly speak of the ‘actions’ of animals. The problem with distinguishing the two ‘levels’ of action is that the pre-social kind is not really a kind of action at all. Vygotsky intends any ‘action’ in the everyday sense to be an action of the second, speech-mediated kind. It is an action with speech already built in. And, it may well be inherent in the concept of an action that an action can in principle be specified in words, exhorted, commented on, etc. As one philosopher has observed, “Actions are what descriptions of actions describe, and different descriptions describe different actions. There is no other way to sort out actions” (Cody 1967, p.179). If there are no words for it, and the animal or presocial infant has no words for it, then it is not an ‘action’. The posited ‘pre-social actions’ (which Vygotsky calls ‘lower behaviours’) are impossible to specify. The moment we attempt to describe one, we are speaking of it as if it were a social action, which ex hypothesi it is not. I am shelving this problem.

(2) According to the ‘internalised speech-mediated social action’ account of thinking, the infant is born with, or develops on his or her own, certain pre-social and rudimentary practical skills, which sooner or later get ‘socialised’ into speech-mediated forms. Totally new speech-mediated actions, with no pre-social precursors, are taught too. Both internalising and the speech-mediated and social nature of what is internalised are essential ingredients in the ‘thinking’ which only we humans can do. However, Vygotsky sometimes also speaks as if there is a more primitive form of internalisation, which animals and pre-social infants are capable of, and which operates on pre-social behaviours. That is, animals and pre-social infants are said to exhibit ‘lower mental functions’. Vygotsky cites memory, perception and basic concept formation. Like other pre-social abilities, these lower mental functions can be transformed into ‘higher mental functions’ by the speech-mediating process.

The process of ‘interiorisation’ of cultural forms of behaviour... is related to radical changes in the activity of the most important psychological functions, to the reconstruction of psychological activity on the basis of sign operations. On the one hand, natural psychological processes as we see them in animals actually cease to exist as such, being incorporated in this system of behaviour, now reconstructed on a social-psychological basis so as to form a new entity. This new entity must by definition include those former elementary functions which, however, continue to exist in subordinate forms acting now according to new laws... (Vygotsky 1994, p.155).

The idea of the infant having ‘animal-level mental functions’ which are subsequently subjected to control by speech is interesting. But it puts a spanner in the works as far as the main ‘internalisation of speech-mediated social activity’ theory of thinking is concerned. Are these lower mental functions completely replaced (‘ceasing to exist’ and ‘being reconstructed’ anew, as Vygotsky says above) or do they remain ingredient in thinking – ‘continuing to exist in subordinate forms’ as he also says? Numerous other questions arise here too, but space forbids addressing them.

(3) Vygotsky is equivocal about the ontological status of speech. He sometimes writes as if speech were simply a kind of action – specifically, a species of meta-action which somehow controls or ‘regulates’ other actions. Thus he asks: “What is it that really distinguishes the actions of the speaking child from the actions of an ape when solving problems?” (Vygotsky 1978, p.26). On the next page speech is referred to as “a method of behaviour”. And speech is described as “This new form of activity, aimed at controlling another person’s behaviour” (Vygotsky 1994, p.117).

At other times he writes of speech as if it were literally a tool, a kind of hardware, like hand-tools but of different material and with different functions. And he employs the familiar synecdoche whereby it is the uttered words themselves, rather than speakers’ in-context acts of speaking, that have the action-regulating powers. Thus: “Now speech guides, determines and dominates the course of action; the planning function of speech comes into being in addition to the already existing function of language to reflect the external world” (ibid., p.28). And “The sign acts as an instrument of psychological activity in a manner analogous to the role of a tool in labour” (ibid., p.52), and “...the basic analogy between sign and tool rests on the mediating function that characterises each of them” (ibid., p.54).

Thirdly, in an apparent compromise between the actional and the hardware views of speech, Vygotsky sometimes writes as if speech is the use of a tool. Thus: “...children’s capacity to use language as a problem-solving tool...” (ibid., p.27) and “...the essence of sign use consists in man’s affecting behaviour [of others] through signs” (ibid., p.54). This third way of speaking implies that the semantic and regulatory effects of speech are due partly to the properties of the tool itself (the words used) and partly to how (with what expression, in what context, etc.) the speaker uses the tool. Like the hardware view, this compromise view implies that words, language, etc., are definable independently of people’s acts of speaking.

Interestingly, Vygotsky is not alone in his equivocation, and perhaps confusion, about the ontological status of speech. At least three other twentieth-century theorists who stressed that speech is conceptually located within cooperative activity and that it cannot be understood outside this context – Malinowski, de Laguna and Wittgenstein – also speak of it alternately as a pure action, as an object of use (a sign, a piece of semiotic hardware), or as the using of that use-object. It can’t be all three.

I think the best course is to commit Vygotsky to the first, purely actional construal of speech. The ‘tool’ and ‘use of tool’ idioms are figurative. Speech can plausibly be regarded as a form of technology, but it does not literally involve the production or use of anything, any more than does walking or swimming. No hardware of any kind is involved in speech. Besides, Vygotsky is going to define thinking as ‘internalised speech-mediated action’, or ‘internal speech’. That is, speech gets to be internalised in thinking. Now, although I have not yet said what Vygotsky thinks ‘internalisation’ or ‘internal speech’ are, I have been assuming that what is to be internalised is some form of action or activity. If Vygotsky retains either a ‘tool’ or a ‘tool-use’ view of thinking, we are going to have to contemplate not only internalised actions but also internalised linguistic hardware. Understanding how an action can be ‘internalised’ is difficult enough, but understanding how a physical entity such as a tool or word could be ‘internalised’ seems impossible. One would also have to explain how, in its ‘internal’ application, the tool/word could retain its technical properties. The situation is a lot simpler if we stick with the purely actional view of speech. If we do, then ‘inner’ speech is just a special case of ‘inner’ actions generally. There is no point in taking on what looks like the far more difficult job of explaining the internalisation of hardware, for the sake of what looks like a figure of speech.

There is awareness of this difficulty in the Vygotsky literature. Bronckart asks ruefully, “What is actually ‘interiorised’: is it language as such (words) or general properties of communicative interaction, or even properties of ‘action mediated through signs’?” (Bronckart 1996, p.92). Some answers are as vague as Vygotsky is ambiguous.

With internalisation, what was originally in the interpersonal (or inter-mental) domain becomes intra-personal (intra-mental) in the course of development. However, this general concept of internalisation is not sufficient for elaborated theoretical use, nor is it helpful in deriving empirical research methodologies. To go beyond generalities it is necessary to specify what ‘materials’ are imported from society into the intra-personal world of any individual, and in what ways this process operates. The first question can be answered here in generic terms. In human internalisation, the materials involved are of a semiotic nature (Lawrence & Valsiner 1993, pp.151-2).

We need to know if we are talking hardware or software. Lawrence & Valsiner’s ‘materials of a semiotic nature’ does not help us decide. And surely we cannot know ‘in what ways’ internalisation operates without knowing what it operates on. To avoid such imponderables, I commit Vygotsky to an actional view of speech.

(4) There is another ambiguity about what is internalised. I have said that it is speech-mediated action which is internalised. And this means ‘action with speech attached’ as it were. For Vygotsky, the term internal speech is a synonym for thinking and shorthand for internalised speech-mediated action. However, there are many points in Vygotsky’s writing where it sounds as if he wants it to be just the speech component which is internalised, with no internalised action to go with it. To resolve this uncertainty, I will proceed on the assumption that Vygotsky thinks it is some mix of speech and practical action that is internalised. For Vygotsky, the core function of speech is the controlling or ‘regulating’ of people’s actions – both other people’s and the speaker’s own. If there were no action to be regulated, there would be no speech. As Brockmeier says, this is the whole “...Vygotskian point: namely, to see semiotic action as inextricably linked to other forms of actions” (Brockmeier 1996, p.137).

Even when the speech is internalised, the link with action is maintained. Vygotsky wants internal speech to retain at least some of the action-regulatory properties of ordinary speech. In the case of ‘internal’ speech, the actions being regulated are those of the person P who is doing the ‘internal speaking’. In one kind of case, the actions are actually being performed, as when P is thinking what she is doing whilst doing it. In another kind of case, the actions are being performed ‘internally’ along with the speech – as when P is just thinking about doing X. In a third kind of case, ‘thinking out loud’, the speech is overt and ‘external’, but the action which the speech relates to is merely ‘internal’. In all three of these cases – just as when speech and action are both overtly performed – the raison d’être of the speech is regulating the action and, if there is no action for the speech to regulate, there is no speech.

Speech-mediation: from demonstration to gesture to speech

According to Vygotsky, the distinctive feature of human action is that it is mediated – that is, controlled or regulated – by speech. Mediation boils down to the fact that one person, by speaking, can incite, cue, modify, halt, etc., the actions of someone else. We are essentially talking here about the hortative function of speech. How does hortation work? What sort of transaction is occurring, and how is it learned? And what are Vygotsky’s thoughts on this? In the material in English, Vygotsky is nowhere very explicit. However, the outline of an explanation can be extracted from various places in the material. We can at least sketch the kind of answer Vygotsky intends. And Vygotsky’s account of hortation, inchoate though some of it is, remains the best that has been offered before or since.

The main idea seems to be that social control is effected first via imitation. Imitation is inborn in normal infants and is, almost from birth, a powerful and versatile disposition (Meltzoff 1996). It is this disposition that the adult exploits. By demonstrating an action, the adult can – sometimes, at least – get the infant to perform the action too. Their doing of whatever it is in concert constitutes a reward for both parties: concerted activity is naturally pleasurable.

As a means of social control, this demonstration-and-imitation technique is unwieldy, a blunt instrument; it works relatively infrequently, and is suitable for only a limited range of simple actions. If parents did not enjoy the proceedings as much as infants, it might not be persisted with. However, it has a great future. Later on, much-abbreviated demonstrations of actions – mere ‘token gestures’ at performing the actions in question – are able to elicit the same kind of imitative response from the child. With practice, the abbreviated action (the gesture) becomes an effective cue for the child to perform the unabbreviated action. Finally, the use of gestures in the behaviour-regulating role is replaced by speech. Over the course of this section of my essay, I will present quotations from Vygotsky which show at least one plausible itinerary, without too many large gaps, for the development from demonstration-and-imitation to speech.

Vygotsky acknowledges the adult’s initial reliance on imitation for teaching the child how to do things when he cites, approvingly, the findings of Shapiro and Gerke concerning social influences on the acquisition of motor skills:

In their view, social experience exerts its effect through imitation; when the child imitates the way adults use tools and objects, she masters the very principle involved in a particular activity. ...The child, as she becomes more experienced, acquires a greater number of models that she understands. These models represent, as it were, a refined cumulative design of all similar actions; at the same time, they are also a rough blueprint for possible types of action in the future (Vygotsky 1978, p.22).

Children are capable of learning by imitation not only specific perceptual and motor skills but also general heuristic, investigative and problem-solving skills. Vygotsky believes human infants are much better at imitating than apes are. For example, regarding their respective abilities to learn totally new behaviours by imitation, there is a significant difference between apes and children.

[Kohler’s experiments] ...reveal that primates can use imitation to solve only those problems that are of the same degree of difficulty as those they can solve alone. [But]...Children can imitate a variety of actions that go well beyond the limits of their own capabilities. Using imitation, children are capable of doing much more in collective activity or under the guidance of adults (Vygotsky 1978, p.88).

The cleverest animal is incapable of intellectual development through imitation. It can be drilled to perform specific acts, but the new habits do not result in new general abilities. In this sense it can be said that animals are unteachable. In the child’s development, on the contrary, imitation and instruction play a major role. They ...lead the child to new developmental levels. In learning to speak, as in learning school subjects, imitation is indispensable. What the child can do in cooperation today he can do alone tomorrow. Therefore the only good kind of instruction is that which marches ahead of development and leads it; it must be aimed not so much at the ripe as at the ripening functions (Vygotsky 1962, p.104; 1986, p.188).

By habitually imitating the infant, in an ‘ostentatious’ way and for fun, the adult accustoms the infant to the demonstrator role in the imitation game as well as the imitator role. The first important modification of the imitation game is when the person demonstrating the action – and it may be either the child or the caregiver – gives only a very abbreviated or ‘token’ demonstration of the action to be imitated. In Vygotsky’s account, a mime or gesture is essentially the performance of the beginning only of the action being incited. After practice in this new streamlined version of the imitation game, these token performances or ‘gestures’, which are a lot less laborious for the demonstrator, come to have the same imitation-inducing effect as a full performance. Of course, since the responder is no longer doing what the demonstrator is doing, the term imitation is now not strictly appropriate. The demonstrator is merely making a gesture, but the responder is performing the whole action. This basic streamlining of the original demonstration-and-imitation procedure seems also to have been achieved by apes:

Kohler describes highly diversified forms of ‘linguistic communication’ among chimpanzees. First in line is their vast repertory of affective expressions: facial play, gestures, vocalisation; next come the movements expressing social emotions: gestures of greeting, etc. The apes are capable both of ‘understanding’ one another’s gestures and of ‘expressing’, through gestures, desires involving other animals. Usually a chimpanzee will begin a movement or an action he wants another animal to perform or to share — e.g., will push him and execute the initial movements of walking when ‘inviting’ the other to follow him, or grab at the air when he wants the other to give him a banana. All these are gestures directly related to the action itself. Kohler mentions that the experimenter comes to use essentially similar elementary ways of communication to convey to the apes what is expected of them (Vygotsky 1962, pp.34-5; 1986, pp.71-72).

Vygotsky speculates about how the ‘pointing’ gesture could derive by this abbreviation process. He describes how, if an infant wants an object to be handed to him or her, then he or she may make a deliberately abortive reaching and grasping movement in the direction of the desired object. Then,

When the mother comes to the child’s aid and realises his movement indicates something, the situation changes fundamentally. Pointing becomes a gesture for others. The child’s unsuccessful attempt engenders a reaction not from the object he seeks but from another person... The grasping movement changes to the act of pointing. As a result of this change, the movement itself is then physically simplified, and what results is the form of pointing that we may call a true gesture (Vygotsky 1978, p.56).

As Luria (who collaborated closely with Vygotsky) reports, speech makes its first appearance in association with demonstrations of actions, in situations where the adult both demonstrates (or gestures) an action and verbally incites it:

For example, the experimenter may say to the child, Give me the fish, and then lift the fish, shake it, tap on it, or point at it with his finger. If the adult speech is reinforced with some action connected with the object, then ...the child can carry out the task (Luria 1981, p.94).

From then on, there is a gradual transition from demonstration-and-imitation to speech as the preferred means of instruction and/or hortation. The transition is slow:

...the adult’s speech, which focuses the child’s attention or regulates his/her action does not immediately attain these powers. Rather, the formation of this directive function of adult speech goes through a rather long and dramatic development (Luria 1981, p.93).

The shift from demonstration-and-imitation, to gestures, to speech as the preferred action-regulating technique seems to be paralleled in the increasing sophistication, with age, of children’s make-believe games. Vygotsky remarks that “...play contains all developmental tendencies in a condensed form and is itself a major source of development” (Vygotsky 1978, p.102) and he reports experimental results which show speech gradually predominating over mime and gesture:

Whereas some children depicted everything by using movements and mimicry, not employing speech as a symbolic recourse at all, for other children actions were accompanied by speech: the child both spoke and acted. For a third group, purely verbal expression not supported by any activity began to predominate. Finally, a fourth group of children did not play at all, and speech became the sole mode of representation, with mimicry and gestures receding into the background. The percentage of purely play actions decreased with age, while speech gradually predominated. The most important conclusion drawn from this developmental investigation ...is that the difference in play activity between three-year-olds and six-year-olds is ...in the mode in which various forms of representation are used. In our opinion, this is a highly important conclusion; it indicates that symbolic representation in play is essentially a particular form of speech at an earlier stage... (Vygotsky 1978, pp.110-111).

The key to the nature of the transition between gesture and speech as the preferred action-inciting technique probably lies in the first of the two Luria quotes above. The child has got used to responding (with the appropriate action) to either full demonstrations or very abbreviated demonstrations (gestures). Typically, the adult will habitually accompany his or her demonstrations or gestures with speech distinctive of that action. Presumably, as in the fish example above, the speech contributes an added ostentatious, attention-directing factor. And if it is repeated often enough, the speech, which will be distinctive of that action, will be regarded as part of the action. It will be regarded as a noise one makes when (the rest of) that action is being performed. Subsequently, the adult may produce a particular action-specific piece of speech with no demonstrating at all, or with just a minimally affected gesture, facial expression, or tone of voice. Speech is now occurring, effectively, on its own. But it retains the action-inducing effect of the demonstrations and gestures which it is a subtraction from. And the reason is that – because, in the early stages, the speech came to be regarded as an integral part of the activity – the speech-by-itself now counts as just another way of abbreviating the original demonstration, for convenience. For the same reason that a gesture is able to elicit the desired behaviour as effectively as a full demonstration, so speech by itself can now replace gesture in eliciting the behaviour. Because certain speech acts have been a feature of past demonstratings and imitatings of that behaviour, these speech acts are seen as a ‘part’ of the behaviour, or at least as part of the demonstration of it. Speech has obvious advantages over demonstration, mime and gesture. It is easier to perform than any demonstration or gesture, however abbreviated. As well, the hearer of speech does not have to be looking but only listening – “..gestures require that communicative exchanges take place in face-to-face situations, while vocalisations are not spatially restrictive” (John-Steiner & Tatter 1983, p.90).

Presumably, the point of ‘grafting speech onto’ a learner’s experience of a given activity is to make the learner biddable in the context of that activity. As de Laguna says, the primary function of speech is to expedite cooperation. Really, speech-mediated activity is cooperative activity.

The mother communicates with the child and gives him/her instructions with the help of speech. For example, she draws his/her attention to objects in the environment (e.g., Take the ball, Lift your arm, Where is the doll? etc.), and the child carries out these spoken instructions. What is the mother doing when she gives the child these verbal instructions? As we have already said, she is drawing his/her attention to something, she is singling out one thing from among many. With her speech she organises the child’s motor acts. Thus the child’s motor act often begins with the mother’s speech and is completed with his/her own movement. Vygotsky pointed out that initially the voluntary act is shared by two people. It begins with the verbal command of the mother and ends with the child’s act (Luria 1981, p.89).

According to Vygotsky, the first stage of socialisation consists in inducing the child to participate, albeit in a guided and reactive way, in shared cooperative activities. The child’s future ability to perform actions solo – and thus embark on the second, ‘internalisation’ phase of the socialisation process – depends on his prior experience of performing them in concert with others and in response to others’ speech.

Vygotsky was a Marxist and looked for ‘cultural-historical’ influences on the individual. Kozulin describes the socialisation or ‘acculturation’ process as follows:

Symbolism and the conventionality of signs were perceived by Vygotsky as important characteristics of human activity that are imposed on an individual’s behaviour, shaping it and reconstructing it along the lines of the sociocultural matrix. The concept of activity thus was perceived as an actualisation of culture in individual behaviour, embodied in the symbolic function of gesture, play and speech systems (Kozulin 1996, p.106).

The ‘speech-mediating’ of the child’s activities is the means by which caregivers ensure the child ‘takes on board’ the culture he or she is born into. That culture determines the speech forms and language-games which will regulate the child’s behaviour. Via speech, the child appropriates the culture and the culture appropriates the child.

Internalisation, phase one: going solo

Once the child has mastered the responder role in cooperative activity, the stage is set for ‘internalisation’. The important thing to be clear about is that, according to Vygotsky, what is internalised is ‘cooperative activity’ or ‘speech-mediated social action’ – it is action which has both a practical component and a communicative, verbal component:

...the process of internalisation consists of a series of transformations: An operation that initially represents an external activity is reconstructed and begins to occur internally. Of particular importance to the development of higher mental processes is the transformation of sign-using activity, the history and characteristics of which are illustrated by the development of practical intelligence... An interpersonal process is transformed into an intrapersonal one. Every function in the child’s cultural development appears twice: first, on the social level, and later, on the individual level; first, between people (interpsychological), and then inside the child (intrapsychological)... All the higher functions originate as actual relations between human individuals... The internalisation of socially rooted and historically developed activities is the distinctive feature of human psychology... As yet the barest outline of this process is known (Vygotsky 1978, pp.56-7).

As I say, there are two phases in the ‘internalisation’ process. First, the child becomes able to act in the distinctive speech-mediated social way while he or she is alone – although, because the child is now acting alone, social can at best mean ‘quasi-social’. Then, secondly, the child becomes able to perform speech-mediated actions ‘internally’ or ‘on the intrapsychological plane’.

Before the child can go solo with his or her new social-actional skills, he or she must be experienced not just in the responder/hearer role in speech transactions, but also in the demonstrator/speaker role. He or she must become able to use speech to regulate others’ actions, solicit cooperation, etc., just as others are able to use speech to regulate his or hers. We can look, for an example, at how the child develops the ability to direct other people’s attention to things. Here is the passage about pointing that I quoted earlier, but with a little more at the beginning. The typical early context will involve an object, out of reach, which an infant sees and wants:

We call the internal reconstruction of an external operation internalisation. A good example of this process may be found in the development of pointing. Initially this gesture is nothing more than an unsuccessful attempt to grasp something, a movement aimed at a certain object which designates forthcoming activity. The child attempts to grasp an object placed beyond his reach; his hands, stretched towards the object, remain poised in the air. His fingers make grasping movements... When the mother comes to the child’s aid and realises his movement indicates something, the situation changes fundamentally. Pointing becomes a gesture for others. The child’s unsuccessful attempt engenders a reaction not from the object he seeks but from another person... The grasping movement changes to the act of pointing. As a result of this change, the movement itself is then physically simplified, and what results is the form of pointing that we may call a true gesture (Vygotsky 1978, p.56).

In the previous section I described how the child, as responder, learns to react to an abbreviated or aborted demonstration as if it were a full demonstration. The child comes to respond to mere gestures on the part of the adult. In the quotation above, we see the child’s first – almost accidental – proactive employment of this abbreviation strategy.

Simultaneously with the child’s production of demonstrations, mimes and gestures, he or she will produce speech. At this early stage the speech is still an indissoluble part of the prevailing activity – and the child’s attempts to demonstrate this activity. Vygotsky thinks that attention-directing is the first and most important ‘action-regulating’ purpose speech serves. In the infant’s earliest experience, speech is still not disentangled from the demonstrating and mimetic gesturing. Speech plays an ‘adverbial’ role. It is part of the ostentatious manner in which the demonstration is performed. The speech has not yet separated out and taken over the main burden of the action-regulating. It is still part of the crude ‘ostentatious performance done to invite imitation’ mode of action-regulating. Thus,

When we observe the child in action... it becomes obvious that it is not only the word mama which means, say, Mama put me in the chair, but the child’s whole behaviour at that moment (his reaching out toward the chair, trying to hold on to it, etc.). Here the ‘affective-conative’ directedness toward an object... is as yet inseparable from the ‘intentional tendency’ of speech. The two are still a homogenous whole and the only correct translation of mama, or any other early words, is the pointing gesture. The word, at first, is a conventional substitute for the gesture... (Vygotsky 1962, p.30; 1986, p.65).

And, “...the child embellishes his first words with very expressive gestures, which compensate for his difficulty in communicating meaningfully through language” (Vygotsky 1978, p.32). Speech won’t yet work on its own.

We have now reached a point where, not only is the child amenable to having his or her actions regulated by the speech of others (thus having his or her actions ‘speech-mediated’), he or she is also able to take the active role and regulate others’ behaviour. Once these two abilities are established, the child can then attempt to verbally direct his or her own behaviour, while he or she is alone. This is the child’s first attempt to ‘go solo’ with this new kind of ‘speech-guided’ action. The first manifestation of this new achievement is what we call ‘thinking out loud’ – and what Piaget and Vygotsky called ‘egocentric speech’. Vygotsky thinks that “..children’s egocentric speech should be regarded as the transitional form between external and internal speech” (Vygotsky 1978, p.27).

Egocentric speech is inner speech in its functions; it is speech on its way inward, intimately tied up with the ordering of the child’s behaviour, already partly incomprehensible to others, yet still overt in form and showing no tendency to change into whispering or any other sort of half-soundless speech (Vygotsky 1962, p.46; 1986, p.86).

The development is gradual: “..egocentric speech is linked to the child’s social speech by a thousand stages” (Vygotsky 1994, p.119). He discusses (ibid, pp.109-20) experiments in which children are set problem-solving tasks with an adult in the room but not assisting. In this context, the children exhibit a “strange alloy of speech and action” (ibid, p.118). They “..solve practical tasks with the help of their speech, as well as their eyes and hands” (Vygotsky 1978, p.26):

...the child not only acts endeavouring to achieve its goal, but at the same time also speaks. This speech as a rule arises spontaneously in the child and continues almost without interruption throughout the experiment. It increases and is of a more persistent character every time the situation becomes more difficult and the goal more difficult to attain. Attempts to block it... are either futile or lead to the termination of all action, ‘freezing’ as it were the child’s behaviour... (Vygotsky 1994, p.109).

The behaviour of a small child in the situation just described presents, consequently, a complex skein; it consists of a mixture: direct attempts to attain the goal, the use of tools, speech either directed at the person conducting the experiment or simply accompanying the action, as if strengthening the child’s efforts, and, finally, a paradoxical-sounding direct appeal to the object of attention (ibid, p.118).

An important change in the role of egocentric speech now occurs. From being a willy-nilly accompaniment to action – “as if strengthening the child’s efforts” – the egocentric speech becomes much more judicious and deliberate. The speech begins to foreshadow actions. The child is laying the foundations for his or her future ability to ‘think ahead’.

The change consists in the fact that the child’s speech, which previously accompanied its activity and reflected its chief vicissitudes in a disrupted and chaotic form, moves more and more to the turning and starting points of the process, beginning thus to precede action and throw light on the conceived of but as yet unrealised action.

...The child’s speech — due to the fact that it is first a verbal mould of operation or its parts — reflects action and strengthens its results, starts at a later stage to move towards the action’s beginning, to predict and direct the action, forming it according to the mould of former operation, that was previously fixed by speech. ...As speech becomes an intra-psychological function, it begins to prepare a preliminary verbal solution to a problem which, in the course of further experiments, perfects itself and, from a speech-mould recapitulating past experience, becomes the preliminary verbal planning of future action (Vygotsky 1994, pp.120-121).

For Vygotsky, the child’s egocentric speech is a way of exploiting the attention-focussing, readying and action-regulating functions of ordinary (social) speech, without the presence of an interlocutor being required. As Wertsch & Stone put it,

The result of mastering and differentiating the self-regulative function of speech as opposed to its social, communicative function is to recognise that the former does not require an overt form of a communicative context (Wertsch & Stone 1985, p.172).

Putting it more simply: Vygotsky is effectively saying that, if the child is already experienced at having his or her actions directed by (others’) speech, and is already experienced at directing (others’) actions by speech, then there is no need for any further explanation of how it is that the child is able to direct his or her own behaviour with his or her own speech. By contrast, in the following recent explanation of thinking out loud (by Dennett), there is an attempt to fill in a bit of (surely superfluous) physiological detail as well:

...the practice of asking oneself questions could arise as a natural side-effect of asking questions of others, and its utility would be similar: it would be a behaviour that could be recognised to enhance one’s prospect by promoting better-informed action-guidance. All that has to be the case for this practice to have this utility is for the preexisting access-relations within the brain of an individual to be less than optimal. Suppose, in other words, that although the right information for some purpose is already in the brain, it is in the hands of the wrong specialist; the subsystem in the brain that needs the information cannot obtain it directly from the specialist — because evolution has simply not got around to providing such a ‘wire’. Provoking the specialist to ‘broadcast’ the information into the environment, however, and then relying on an existing pair of ears (and an auditory system) to pick it up, would be a way of building a ‘virtual wire’ between the relevant subsystems. ...Such an act of autostimulation could blaze a valuable new trail between one’s internal components. Crudely put, pushing some information through one’s ears and auditory system may well happen to stimulate just the sorts of connections one is seeking, may trip just the right associative mechanisms, tease just the right mental morsel to the tip of one’s tongue. One can then say it, hear oneself say it, and thus get the answer one was hoping for (Dennett 1991, pp.195-6).

Vygotsky and Dennett both use the term autostimulation in this connection. And they also agree that speech is not the only means of this kind of autostimulation. Dennett says “Talking out loud is only one possibility. Drawing pictures to yourself is another readily appreciated act of self-manipulation” (Dennett 1991, p.197). And Vygotsky speaks of speech and other sign media, which “..serve the child first and foremost, as a means of social contacts with the surrounding people, and are also applied as a means of self-influence, a means of auto-stimulation, creating thus a new and superior form of activity in the child” (Vygotsky 1994, p.111).

The important thing that egocentric speech has in common with normal interpersonal speech is the actual saying and/or hearing of the words. Given this feature of egocentric speech, Vygotsky can plausibly regard egocentric speech as an abbreviated form of normal speech. In the section before last, when I was recapitulating Vygotsky’s account of how the child’s actions get to be ‘speech-mediated’, I tied Vygotsky to a ‘progressive abbreviation’ theory of speech mediation – whereby speech is the residue (which still has action-inciting powers) when demonstrations, mimings and gesturings of actions are progressively abbreviated out of a primary speech/action matrix. Now, if egocentric speech can plausibly be regarded as an ‘abbreviation’ of ordinary interpersonal speech, then this ‘progressive abbreviation’ theory can be extended. Vygotsky can extend the abbreviation process past speech to include egocentric speech. So, the whole account so far would go as follows.

(i) The demonstrating of action A has a natural A-performance-inducing effect on the child audience, given his natural tendency to imitate;

(ii) after suitable training, an abbreviated demonstration (e.g., a mime or gesture) of A comes to have a similar A-inducing effect on the child – so the mime or gesture now functions more as a simple cue than as a demonstration to be imitated;

(iii) demonstratings or mimings or gesturings of action A by others are invariably in the child’s experience accompanied by speech that is specific to A;

(iv) after suitable training, speech that is done by erstwhile demonstrators without demonstration, mime or gesture will count as an abbreviated form of demonstration (or mime or gesture), and will thus have action-inducing effects similar to those of a full demonstration or mime, etc. – that is, speech by itself will function as a cue for the audience;

(v) egocentric speech retains sufficient features of ordinary speech to count as an abbreviation of it and would thus tend to have, along the lines of the other kinds of abbreviation above, a residual action-inducing effect on the hearer.

The important thing for Vygotsky is that egocentric speech is necessarily preceded by, and has its form and function determined by, social speech. And egocentric speech provides the same kind of assistance in practical action as is provided by social speech.

The greatest change in children’s capacity to use language as a problem-solving tool takes place... when socialised speech (which has previously been used to address an adult) is turned inward. Instead of appealing to the adult, children appeal to themselves; language thus takes on an intrapersonal function in addition to its interpersonal use. When children develop a method of behaviour for guiding themselves that had previously been used in relation to another person, when they organise their own activities according to a social form of behaviour, they succeed in applying a social attitude to themselves. The history of the process of the internalisation of social speech is also the history of the socialisation of children’s practical intellect (Vygotsky 1978, p.27).

As he becomes older, the child comes to rely less and less on out-loud egocentric speech – during play, problem-solving, etc. Then, finally, “..when egocentric speech disappears from view it does not simply atrophy but ‘goes underground’, i.e., turns into inner speech” (Vygotsky 1962, p.18; 1986, pp.32-33).

Internalisation, phase two: into the mental

The final step in the development of thinking is the disappearance of egocentric speech ‘inwards’. This completes the development begun with the infant’s first participation in cooperative activity. Vygotsky’s main point is that the direction of the development of thinking is from the social to the inner (and ‘mental’):

...our schema of development – first social, then egocentric, then inner speech – contrasts both with the traditional behaviourist’s schema – vocal speech, whisper, inner speech – and with Piaget’s sequence – from nonverbal autistic thought through egocentric thought and speech to socialised speech and logical thinking. In our conception, the true direction of the development of thinking is not from the individual to the socialised, but from the social to the individual (Vygotsky 1962, pp.19-20; 1986, pp.35-36).

The idea being explicitly rejected here is that the infant or child first develops thoughts and then gradually acquires the means to express these publicly. For Vygotsky, thinking arises only after the social, and via internalisation of it. In this quotation, Vygotsky also distinguishes his ‘from speaking to thinking’ continuum from that of the behaviourists. For Vygotsky, the transition from social to egocentric to inner is a developmental one, a matter of skill acquisition, to do with the child learning to adapt the speech technique for use in new (e.g., absent-adult) situations. Although progressive abbreviation of the speech performance is involved in the transition, it is not the physical abbreviation per se that is important for Vygotsky, but the child’s increasing sophistication. For the behaviourist, the transition is entirely a physical matter. Vygotsky was thinking of Watson 1919 but, as late as 1957, Skinner was saying:

The range of verbal behaviour is roughly suggested, in descending order of energy, by shouting, loud talking, quiet talking, whispering, muttering ‘under one’s breath’, subaudible speech with detectable muscular action, subaudible speech of unclear dimensions, and perhaps even the ‘unconscious thinking’ sometimes inferred in instances of problem solving (Skinner 1957, p.438).

Thus, although Vygotsky’s is an abbreviation story, it is not a simple physical abbreviation story like that of the behaviourists. For Vygotsky, the abbreviation is to do with ‘making do with the minimum necessary’, and this is a skill and sophistication factor, rather than a physical abbreviation per se.

Vygotsky appeals explicitly to an abbreviation story when he is explaining how egocentric speech develops into inner speech. He cites Tolstoy’s Kitty and Levin’s (in Anna Karenina) being able to communicate complex information without words. The two know each other so well, that they can communicate using written initial letters of words only, or by exchanging glances. Because each already ‘knows’ what the other is going to say, it need not be uttered out loud. Kitty and Levin’s abilities are relevant to the situation of the child graduating from egocentric speech to inner speech in that, presumably, the child knows himself even better than Kitty and Levin know each other. The solitary child ‘knows what he would say’ were there someone else present.

With the Kitty and Levin analogy, among other informal characterisations, we are given some indication of what factors would tend to abbreviate speech in the relevant ways. Kitty and Levin’s speech is the minimum required for regulating each other’s action. With ‘internal speech’, what is ‘spoken’ is the minimum required for auto-regulation of the agent’s own potential action. But we still have little idea what ‘internal speech’ amounts to. The ‘progressive abbreviation’ theory cannot be further-extended to cover internal speech. An abbreviation story is plausible only while there is something (and preferably something recognisable) of the original action remaining. This is apparently not the case with ‘inner speech’. So ‘inner speech’ cannot be a yet-more-abbreviated form of egocentric speech.

Vygotsky is lax in his use of the term internalisation – and in his associated use of the terms intrapsychological, psychological plane, inner, etc. Sometimes he uses these terms in connection with the initial ‘going solo’ phase, with its overt and audible egocentric speech. This is the stage at which the child has mastered or ‘appropriated’ a given social-practical skill for himself – and can perform some adapted or abbreviated solo version of it whilst alone – although he is not yet able to rehearse it ‘internally’. At other times Vygotsky uses internalisation, intrapsychological, etc., in relation to the second phase, with its totally ‘internal’ and ‘silent’ speech. This second phase of internalisation is clearly distinct from, and more sophisticated than, the first. At the second stage, we are talking about something that happens after solo mastery, and involves the mastered activities being transformed into ‘internal’ activity. This is not just individual mastery of a social skill but, as Lektorsky puts it,

...the idea that internal psychical processes are the result of ‘interiorisation’, that is, ‘growing in’ or transposition onto the inner plane of those actions of the subject which are originally performed externally and directed at external objects (Lektorsky 1984, p.145).

It is the stage where egocentric speech ‘goes underground’, down to an ‘inner plane’, to become ‘inner speech’. In yet a third usage, Vygotsky employs the same terms, internalisation, intrapsychological, inner plane, etc., to cover both these developmental phases – both the going solo and the “transposition onto the inner plane” – considered together, as a single event. That is, Vygotsky sometimes uses these terms to refer to phase one, sometimes to phase two, and sometimes to both phases together. Although this terminological laxity could be partly due to difficulties of translation, it gives an impression of prestidigitation. The ‘going solo’ phase is unproblematic, and Vygotsky explains it very plausibly, in terms of abbreviation, etc. However, because internalisation is also used to refer to the second phase, and to the two phases together, the impression is given that the whole two-phase sequence has been explained. And this is a false impression.

I shall assume that Vygotsky intends the terms intrapsychological, psychological plane, inner, etc., to refer primarily to what I have been calling the second phase of internalisation. Although I will continue to use the term internalisation to cover both phases collectively, I will use the remaining terms, intrapsychological, internal, etc., solely in connection with the second phase. The terms intrapsychological, etc., have an obvious connotation of ‘mind’ in the ordinary Cartesian sense. It could be argued that Vygotsky needs this connotation, initially at least, so that he can point to what it is he is trying to re-describe. He needs to first indicate the Cartesian concept, so he can then proceed to explain it away. The question is, does he explain it away?

The effect of Vygotsky’s combining his abbreviation story about ‘inner speech’ with the use of terms like higher mental functions and intrapsychological is that we get the impression that speech gradually fragments and abbreviates, right down to the point where no audible (or otherwise perceptible) speaking is being done at all. At that point, the speech slips around the corner as it were, into the mental. At the vanishing point, the mental takes over. One infers that the speech is still going on, only now ‘internally’.

As I said earlier, the abbreviation story cannot explain ‘internal speech’ because, if ‘inner speech’ were abbreviated egocentric speech, say, then there would be some residue of overt speech preserved in the ‘inner speech’. But Vygotsky gives the impression that there is no residue. The idea of doing things on an ‘inner’, ‘mental’ plane fails too, because it is obviously metaphorical. At least, the notion of a person performing an action inside himself is untenable until it is clarified what sense of internal is being used here. And Vygotsky fails to specify the relevant sense of internal. Without any literal spelling out, we are left with only a metaphor – a metaphor for what?

Vygotsky seems unclear about the inadequacy of the two ways of characterising ‘inner speech’ – as abbreviated speech and as internal. He sometimes combines the two, perhaps hoping that, together, they will disguise each other’s failings.

One frequently quoted passage in the recent literature is from Vygotsky’s other close collaborator, Leontyev. The passage is said to show that Vygotsky’s notion of the ‘intrapsychological’ does not presuppose any Cartesian-type mental domain – and that, for Vygotsky, ‘the mental’ is a purely actional and developmental concept. Here is the passage, plus a bit at the beginning not usually quoted:

The older psychology considered consciousness as some kind of metapsychological plane of movement of psychical processes. But consciousness is not granted initially and it is not originated by nature. Consciousness is originated by society; it is produced. For this reason consciousness is not a postulate and is not a condition of psychology but its problem, a subject for concrete scientific psychological investigation. Thus the process of interiorisation is not external action transferred onto a pre-existing internal ‘plane of consciousness’; it is the process in which this internal plane is formed (Leontyev 1978, p.60).

The much-quoted part is the last sentence, which is usually read as implying that the ‘internal psychological plane’ is nothing over and above the internalisation process – as this process is applied to ‘external’ social activities. Or the psychological is in some sense the product of that process. [Hampshire, whose concept of ‘inhibition’ roughly conflates Vygotsky’s ‘abbreviation’ and ‘internal doing’ concepts, makes what looks like the same claim as Leontyev. He says that the child’s “..full inner life begins with, and is constituted by, this power of intentional inhibition” (Hampshire 1971, p.163, my italics).] The Leontyev passage is usually taken as showing some anti-Cartesian sophistication on Vygotsky’s and/or Leontyev’s part(s). However, all we are in fact told in the passage is that a psychological world, or ‘plane’ – of a very Cartesian-looking kind – does not precede but rather follows the (as yet undefined) internalisation process. The idea of an ‘internal plane’ is still needed for the internalisation concept – whether the internal plane is thought of as the prerequisite for, or as the product of, the internalisation process.

Again, there is an impression of sleight of hand. Phase two of the internalisation process seems to be explained in terms of relocation on to the internal, psychological plane – which by implication has always been there. But then, when the status of the internal plane is queried, Vygotsky – or at least, Leontyev – makes as if to define the psychological plane solely as a function or product of ‘phase two’ internalisation. Certainly, Vygotsky does often write as if internalisation is the stashing away of ‘external’, ‘social’ accomplishments in an already existing ‘internal’, ‘private’ repertoire. However, even if the internal repertoire is constituted solely by the material delivered to it, or somehow brought into being by the delivery process itself, we still need to be told what kind of thing the inner repertoire or plane is. And we are not. We are not even told, really, where it is. In people’s heads? Whatever its provenance, Vygotsky’s ‘intrapsychological’ never doffs its Cartesian mask.

Vygotsky’s inability to define either phase two of the internalisation process, or the nature of the ‘intrapsychological plane’, without falling back on the conventional notion of the mental, is disappointing. However, recourse to ‘the mental’ is made inevitable by a prior mistake – that of regarding ‘inner speech’ as a kind of speech. Once one believes that there is speaking (of some kind) going on, one is committed to the inference that it must be going on somewhere. Because it is silent, the speech cannot be going on in the ordinary social world. Therefore, it has to be going on in some other world, or on some other ‘plane’. It is clear Vygotsky thinks that ‘inner speech’ is a kind of speech. He says:

Inner speech is speech for oneself; external speech is for others. It would indeed be surprising if such a basic difference in function did not affect the structure of the two kinds of speech. Absence of vocalisation per se is only a consequence of the specific nature of inner speech, which is neither an antecedent of external speech nor its reproduction in memory but is, in a sense, the opposite of external speech. The latter is the turning of thought into words, its materialisation and objectification. With inner speech, the process is reversed, going from outside to inside. Speech sublimates into thoughts. Consequently the structures of these two kinds of speech must differ (Vygotsky 1986, pp.225-226; 1962, p.131).

And he talks about the habit of egocentric speech making way for “the birth of a new speech form” (ibid p.135), i.e., inner speech. Vygotsky is envisaging a silent, unvocalised kind of speech. It is bad enough calling egocentric speech a kind of speech. If there is no audience, no communication, no implicit ‘demonstration’ for another person, then, especially on Vygotsky’s own account of speech, there seems to be little of ‘speech’ left – or little, at least, of the ‘semiotic’ aspect essential to speech. The preferred modern term for ‘egocentric speech’ is private speech (see John-Steiner 1992, pp.285-6). The concept of private speech, like the concept of a ‘private language’, borrows prima facie respectability from its relatives: ‘a private talk’, ‘private chat’, ‘private communication’. But these latter concepts all relate to a privacy exclusive to two or more people. ‘Private speech’ is ‘speech’ that is not addressed to anyone. The question of who this speech is private ‘to’, in the usual sense, does not arise. At any rate, whatever the situation is in regard to egocentric speech, we are now asked to consider, as also a kind of speech, a performance which not only has no audience but is completely undetectable. The point comes out better when we realise that this imperceptible performance is being identified as a kind of speaking. To call the performance – assuming inner speech is a performance – a piece of ‘non-speaking’ or a ‘refraining from speaking’ or an ‘imagined speaking’ seems more appropriate.

The implausibility of describing ‘inner speech’ as a kind of speech is easy enough to miss, or to ignore or gloss over. My opinion is that the implausibility is concealed from us by the nature of the rhetoric of, and our overfamiliarity with, ‘internal doing’ metaphors generally. We tend to mistake dead metaphors for literal referring expressions. We are familiar with what the dead metaphor refers to, so we tend to think of it as a literal name. So, we unconsciously infer, if inner speech is a literal name, then what it refers to must literally be a kind of speech.

In his account of thinking, Ryle (1949, 1979) says it does not matter whether Le Penseur says things out loud, or sotto voce, or ‘in his head’ (see Hampshire 1970, p.35; Melser 2004, pp.32-33). This is the same confusion, or deliberate assimilation, of the literal and the metaphorical that Vygotsky is guilty of. It is catachresis. Unlike Vygotsky, Ryle may be writing tongue-in-cheek, but catachresis is still a mistake. When the figurative nature of inner speech is unnoticed, this particular mistake – identifying inner speech as a kind of speech – is almost inevitable. Skinner falls into the same trap. In the passage quoted earlier he talks about ‘the range of verbal behaviour’ extending to include ‘subaudible speech’. In another passage, he says,

...the reinforcing effects of covert behaviour must arise from self-stimulation. But self-stimulation is possible, and indeed more effective, at the overt level. When a man talks to himself, aloud or silently, he is an excellent listener....and is optimally prepared to ‘understand’ what he has said. Very little time is lost in transmission and the behaviour may acquire subtle dimensions. It is unsurprising, then, that verbal self-stimulation has been regarded as possessing special properties and has even been identified with thinking (Skinner 1957, pp. 438-9).

The expressions talking silently and verbal self-stimulation show Skinner is thinking of his ‘covert speech’ as a kind of speech. Dennett repeats the gaffe. He speculates that once “crude habits of [out-loud] vocal autostimulation” become established in the repertoire, “we should expect them to be quickly refined”, and he contemplates

...further enhancements of efficiency and effectiveness. In particular, we can speculate that the greater virtues of sotto voce talking to oneself would be recognised, leading later to entirely silent talking to oneself. The silent process would maintain the loop of self-stimulation, but jettison the peripheral vocalisation and audition portions of the process, which wasn’t contributing much. This innovation would have the further benefit, opportunistically endorsed, of achieving a certain privacy for the practice of cognitive autostimulation. ...Such privacy would be particularly useful when comprehending conspecifics were within earshot. This private talking-to-oneself behaviour might well not be the best imaginable way of amending the existing functional architecture of one’s brain, but it would be a close-to-hand, readily discovered enhancement, and that could be more than enough (Dennett 1991, pp.196-7).

Dennett’s description of ‘silent talking to oneself’ has a lot in common with Vygotsky’s characterisations of inner speech. Like Vygotsky, Dennett has a ‘skill’ or ‘developmental’ concept of the transition from out-loud to silent speech, rather than a crude ‘physical abbreviation’ one. And Dennett makes the important concession to a developmental approach that ‘the functional architecture of one’s brain’ may be ‘amended’ by educative practices. However, Dennett’s specification of the end-point of the developmental process in question is just as wanting as Vygotsky’s. Dennett’s alternatives to the inner speech terminology – his expressions entirely silent talking to oneself and private talking-to-oneself behaviour – cannot be taken literally any more than inner speech can. Silent speech is just as figurative, although in a different way, as inner speech is. Dennett’s idea that the ‘vocalisation and audition portions’ are ‘peripheral’ to this new kind of speech and can be ‘jettisoned’ surely indicates it is no kind of speech he is talking about. If there is no vocalising and/or nothing to be heard, no actual speaking can be going on.

Whatever kind of activity so-called ‘inner speech’ turns out to be, it may well have meta-actional or ‘action-regulating’ functions similar to, and derived from, those of speech. But this is not a reason for saying that ‘inner speech’ is a kind of speech. Clearly, it is not a kind of speech. For Vygotsky, ‘inner speech’ is either equivalent to thinking or integral to it. Thus, without a satisfactory literal explanation of what ‘inner speech’ is, the last phase of the child’s development as a thinker remains mysterious. But Vygotsky’s failure to specify what inner speech is a metaphor for, and his consequent failure to explain what thinking is literally, do not cast doubt on his account of the developmental stages that precede and prepare for ‘inner speech’. Vygotsky is still very plausible on the subject of how, from the foundational ability to participate with others in concerted and cooperative activity, the child develops the ability to bring ‘egocentric speech’ or ‘thinking out loud’ to bear to assist his solo activity. The plausibility of Vygotsky’s explanations of these earlier developmental phases makes the task of explaining the further, final development – from thinking out loud to thinking simpliciter – relatively easy.

— Derek Melser —


  • Bakhurst, D. 1996. “Social Memory in Soviet Thought”, in Daniels, H., ed., An Introduction to Vygotsky. London: Routledge.
  • Brockmeier, J. 1996. “Construction and interpretation: Exploring a joint perspective on Piaget and Vygotsky”, in Tryphon, A. and Vonèche, J., eds., Piaget-Vygotsky: The Social Genesis of Thought. London: Erlbaum.
  • Cody, A.B. 1967. “Can a single action have many different descriptions?”. Inquiry 10: 164-180.
  • de Laguna, G.A. 1963. Speech: Its Function and Development. Bloomington: Indiana University Press. First published 1927.
  • Dennett, D.C. 1991. Consciousness Explained. London: Penguin, Allen Lane.
  • Hampshire, S. 1970. “Critical Review of The Concept of Mind”, in Wood, O.P. and Pitcher, G., eds., Ryle. London: Doubleday/Macmillan.
  • Hampshire, S. 1971. Freedom of Mind. New Jersey: Princeton University Press.
  • John-Steiner, V. 1992. “Private Speech Among Adults”, in Diaz, R.M. and Berk, L.E., eds., Private Speech: From Social Interaction to Self-Regulation. New Jersey: Erlbaum.
  • John-Steiner, V. and Tatter, P. 1983. “An Interactionist Model of Language Development”, in Bain, B., ed., The Sociogenesis of Language and Human Conduct. New York: Plenum Press.
  • Kozulin, A. 1996. “The concept of activity in Soviet psychology”, in Daniels, H., ed., An Introduction to Vygotsky. London: Routledge.
  • Lawrence, J.A. and Valsiner, J. 1993. “Conceptual Roots of Internalisation: From Transmission to Transformation”. Human Development 36: 150-167.
  • Lektorsky, V.A. 1984. Subject Object Cognition. Moscow: Progress Publishers.
  • Leontyev, A.N. 1978. Activity, Consciousness and Personality. Englewood Cliffs, N.J.: Prentice-Hall.
  • Luria, A.R. 1981. Language and Cognition. New York: Wiley.
  • Mead, G.H. 1962. Mind, Self and Society, Morris, C.W., ed. Chicago: University of Chicago Press. First published 1934.
  • Melser, D. 2004. The Act of Thinking. Cambridge, MA: MIT Press.
  • Meltzoff, A.N. 1996. “The Human Infant as Imitative Generalist”, in Heyes, C.M. and Galef, B.G., eds., Social Learning in Animals: The Roots of Culture. New York: Academic Press.
  • Ryle, G. 1949. The Concept of Mind. London: Hutchinson.
  • Ryle, G. 1979. On Thinking. Kolenda, K., ed. Oxford: Basil Blackwell.
  • Skinner, B.F. 1957. Verbal Behavior. London: Methuen.
  • Sperber, D. 1986. “The mind as a whole”. Times Literary Supplement. Nov 21: 1308-1309.
  • Van der Veer, R. and Valsiner, J. 1988. “Lev Vygotsky and Pierre Janet: On the origin of the concept of sociogenesis”. Developmental Review 8: 52-65.
  • Vygotsky, L.S. 1962. Thought and Language. Cambridge, Mass.: MIT Press.
  • Vygotsky, L.S. 1978. Mind in Society: The development of higher psychological processes. Cole, M. and others, eds. Cambridge, Mass.: MIT Press.
  • Vygotsky, L.S. 1986. Thought and Language. Revised. Cambridge, Mass.: MIT Press.
  • Vygotsky, L.S. 1994. The Vygotsky Reader. Van der Veer, R. and Valsiner, J., eds. Oxford: Basil Blackwell.
  • Watson, J.B. 1919. Psychology from the Standpoint of a Behaviorist. New York: Lippincott.