Connectionism and Language Acquisition
Connectionism
Connectionism is an approach within cognitive psychology that seeks to simulate, and thereby help psychologists understand, different processes in the brain. This is done by constructing simplified models built from brain components such as neurons and synapses [1]. Human mental activity can be described in terms of information processing models, also known as neural networks [2]. The neural networks of connectionist models consist of simple, interconnected processing elements modelled on human neurons. These units process inputs through their connections, via patterns of activation that spread between the units, thereby allowing learning to occur [3][4]. The connectionist approach can be either bottom-up, representing a pattern-oriented approach, or top-down, representing a symbolic approach to simulating brain functions [1]. A pattern-oriented approach attempts to simulate the patterns found in human language processes, whereas a symbolic approach attempts to simulate those processes using algorithms. Connectionist models tend to be successful and precise at simulating the performance of different brain functions [5]. They are centred on the desire to understand how neurons represent, store and retrieve information in a way that ultimately allows a behaviour to be performed [6]. These models provide a scientific understanding of the functioning of some human behaviours [7] and have reshaped the way many psychologists think about human learning and brain processes [8].
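The basic building block described above, a simple processing unit that sums its weighted inputs and passes the result through an activation function, can be sketched in a few lines of Python. The sigmoid activation and the toy weights here are illustrative choices, not taken from any particular model discussed below:

```python
import math

def unit_activation(inputs, weights, bias=0.0):
    """A single connectionist processing unit: a weighted sum of its
    inputs squashed through a sigmoid activation function."""
    net = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))

# With all-zero weights the unit is maximally uncertain: activation 0.5.
print(unit_activation([1.0, 0.0], [0.0, 0.0]))  # 0.5
```

In a network, many such units are interconnected, and the pattern of activation flowing between them is what carries information.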
Connectionism and Language Acquisition
With regard to language, we know that some aspects of language are learned, and it is these learned aspects that connectionist models are able to explain [9]. According to an article by Redington and Chater, there are three main reasons why connectionist models are useful for studying the phenomenon of language acquisition:
- They can provide principled notions of learning and learnability
- They can provide potential learning mechanisms for particular aspects of language, and can generate predictions for new experiments
- They can support inferences concerning the nature and extent of innate knowledge and innate learning mechanisms [9].
Connectionist models are useful in language acquisition because they provide information about the mechanisms that support language processes, such as phonological learning, semantic learning and grammatical learning [5][10].
It has been proposed that language is localized and modular, an idea associated with the language-specific module of Universal Grammar proposed by Noam Chomsky [1]. Connectionist models provide new ways of thinking about how such behaviours or processes can be innate [11]. There are many phenomena of language that we are beginning to understand with the use of connectionism [12]. With the help of these neural networks, learning can be understood as change in synaptic connections, and algorithms can be formulated with which development and learning can be studied [11].
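The idea that learning is a change in synaptic connections can be illustrated with the classic Hebbian rule, under which a connection strengthens when the units on either side of it are active together. This is a generic textbook rule given here as a minimal sketch, not an algorithm attributed to any of the models discussed below:

```python
def hebbian_update(weight, pre, post, lr=0.1):
    """Hebbian learning rule (illustrative): strengthen the connection
    in proportion to the co-activation of the pre- and post-synaptic units."""
    return weight + lr * pre * post

w = 0.0
for _ in range(5):          # repeated co-activation of two units
    w = hebbian_update(w, 1.0, 1.0)
print(round(w, 2))  # 0.5 — the connection has strengthened through experience
```

The connectionist models below use more sophisticated error-driven rules, but they share this core idea: knowledge lives in connection weights, and learning is weight change.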
Research on Connectionism and Language Acquisition
Rumelhart and McClelland - 1986
Rumelhart and McClelland devised a connectionist model to simulate a child's ability to learn the past tense of verbs, both regular and irregular [5]. Their model used a simple two-layer pattern associator to learn the relationship between a verb and its past-tense form [1]. The model contained no rules, only neuronal units; the connections between these units represented the relationship between the present tense of a verb and its past-tense form [13]. Rumelhart and McClelland suggested that the knowledge required for the acquisition of the English past tense is stored in the connections between the neuronal processing units [13]. On this basis, they concluded that a rule-based account of past-tense acquisition was no longer needed, but as research on this topic continued it was found that this conclusion was not justified, and that rule-based acquisition might play a larger role than Rumelhart and McClelland had assumed [13].
In this model, a total of 506 verbs (both regular and irregular) were presented to the network. The verbs were divided into three sections: the first contained 10 verbs, the second 410 verbs and the third 86 verbs [1]. The sections were organized so that high-frequency words appeared in the early stages of training, consistent with child verb acquisition, since children are known to acquire higher-frequency words first [1]. Rumelhart and McClelland trained the model on the first two sections and withheld the third from training [1]. The model was found to simulate children's acquisition of the English past tense fairly well; it was even able to simulate u-shaped learning of these past tenses [14]. The model was also flawed in certain respects: for example, it produced overregularizations, which are evidence for a rule-based system. These overregularizations were simply seen as inappropriate representations of the relationship between the present-tense and past-tense forms of a verb [14]. These flaws contradicted Rumelhart and McClelland's original conclusion that the model could replace symbolic rule-based acquisition of the English past tense. Although the flaws do not support a complete switch from a symbolic rule-based system to one of connectionist neuronal units, researchers who followed up on Rumelhart and McClelland's model have suggested that the flaws are not sufficient to dismiss the role of connectionist models in past-tense acquisition. The model was still able to show the foundations of the acquisition of English verb past tenses [14].
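A two-layer pattern associator of the kind just described maps input features directly onto output features with no hidden layer. The sketch below is illustrative only: it uses made-up 4-bit feature vectors and the simple perceptron error-correcting rule, not Rumelhart and McClelland's actual Wickelfeature encoding or training regime:

```python
def train_pattern_associator(pairs, n_in, n_out, epochs=200, lr=0.1):
    """Two-layer pattern associator (illustrative sketch): binary input
    features connect directly to binary output features, and each
    connection weight is adjusted by the perceptron error rule."""
    w = [[0.0] * n_in for _ in range(n_out)]
    b = [0.0] * n_out
    for _ in range(epochs):
        for x, t in pairs:
            for j in range(n_out):
                y = 1 if sum(w[j][i] * x[i] for i in range(n_in)) + b[j] > 0 else 0
                err = t[j] - y          # teacher signal: target minus actual
                for i in range(n_in):
                    w[j][i] += lr * err * x[i]
                b[j] += lr * err
    return w, b

def recall(w, b, x):
    """Read out the learned past-tense feature pattern for a stem pattern."""
    return [1 if sum(wj[i] * x[i] for i in range(len(x))) + bj > 0 else 0
            for wj, bj in zip(w, b)]

# Hypothetical stems and past-tense codes as 4-bit feature vectors.
pairs = [([1, 0, 0, 0], [1, 0, 0, 1]),
         ([0, 1, 0, 0], [0, 1, 0, 1]),
         ([0, 0, 1, 0], [1, 1, 0, 0])]
w, b = train_pattern_associator(pairs, 4, 4)
print(all(recall(w, b, x) == t for x, t in pairs))  # True
```

Because there are no rules anywhere in the weights, any regularity the network exhibits (such as adding a regular ending) emerges purely from the statistics of the trained pairs.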
Looking at the acquisition of the past tense of verbs, this model was important because it showed that u-shaped learning can be simulated without explicit rules. This result bears directly on the debate over universal grammar in language acquisition, since it supports a single learning process in which past-tense forms are produced on the basis of similarity and frequency [1].
Plunkett and Marchman - 1993
In 1993, Plunkett and Marchman took up the criticisms of the Rumelhart and McClelland model and addressed them in a model of their own, intended to further explain the acquisition of the English past tense. Their model was a feed-forward network with one layer of hidden units located between the input and output units [14]: 20 input units, 30 hidden units and 20 output units [14]. In this model, unlike that of Rumelhart and McClelland, the verbs were presented to the network one at a time [1]. The input units represented the verb stem, and the output units the past tense of the verb. The model was initially trained on a sample of 10 regular and 10 irregular verbs; to simulate the acquisition of verbs in children, the sample was then increased until it reached 500 verb stems, of which 90% were regular [14]. The network used back-propagation learning: the connection weights were initially randomized, and through the use of a teacher the actual output was compared with the target output. If there was a discrepancy, back-propagation generated an error signal that adjusted the weights to bring the actual output closer to the target output [14].
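The architecture Plunkett and Marchman used, a feed-forward network with one hidden layer trained by back-propagating an error signal from a teacher, can be sketched as follows. The layer sizes, learning rate and toy XOR mapping are illustrative stand-ins for their 20-30-20 network and verb data:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_mlp(data, n_in, n_hid, n_out, epochs=5000, lr=0.5):
    """One-hidden-layer feed-forward network trained by back-propagation
    (illustrative sketch of the architecture family, not the actual model)."""
    random.seed(1)
    w1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
    b1 = [0.0] * n_hid
    w2 = [[random.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    for _ in range(epochs):
        for x, t in data:
            # Forward pass: input -> hidden -> output.
            h = [sigmoid(sum(w1[j][i] * x[i] for i in range(n_in)) + b1[j])
                 for j in range(n_hid)]
            y = [sigmoid(sum(w2[k][j] * h[j] for j in range(n_hid)) + b2[k])
                 for k in range(n_out)]
            # Teacher comparison: output-layer error signal.
            d2 = [(t[k] - y[k]) * y[k] * (1 - y[k]) for k in range(n_out)]
            # Error propagated back to the hidden layer.
            d1 = [h[j] * (1 - h[j]) * sum(d2[k] * w2[k][j] for k in range(n_out))
                  for j in range(n_hid)]
            for k in range(n_out):
                for j in range(n_hid):
                    w2[k][j] += lr * d2[k] * h[j]
                b2[k] += lr * d2[k]
            for j in range(n_hid):
                for i in range(n_in):
                    w1[j][i] += lr * d1[j] * x[i]
                b1[j] += lr * d1[j]
    return w1, b1, w2, b2

def predict(net, x):
    w1, b1, w2, b2 = net
    h = [sigmoid(sum(wj[i] * x[i] for i in range(len(x))) + bj)
         for wj, bj in zip(w1, b1)]
    return [sigmoid(sum(wk[j] * h[j] for j in range(len(h))) + bk)
            for wk, bk in zip(w2, b2)]

# XOR: a mapping a two-layer associator cannot learn, but a hidden layer can.
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
net = train_mlp(data, 2, 4, 1)
print([round(predict(net, x)[0], 1) for x, _ in data])
```

The hidden layer is what distinguishes this model from Rumelhart and McClelland's direct associator: it lets the network form internal re-representations of the stems rather than relying only on direct feature-to-feature mappings.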
The model was able to account for regular and irregular verbs, and the results showed that it mimicked u-shaped behaviour for both, comparable to the errors that children acquiring the English past tense are expected to make [1][15]. The model did produce some errors, as was seen in the Rumelhart and McClelland model. These errors tended to arise from interference between the regular and irregular mapping patterns [14], but as the model learned additional patterns and became more consistent with the patterns being produced, the errors diminished. One difference from the Rumelhart and McClelland model was that this model produced no overregularizations, only a small number of irregularizations (for example, producing bat instead of bite) [14].
Plunkett - 1992
In 1992, Kim Plunkett created a multi-layer connectionist model intended to simulate early lexical development in children, focusing mainly on categorization [1]. The skill of categorization supports a human's ability to form concepts about the events and objects observed in life [1]. To decide whether something belongs to a category, prototypes are used: a prototype is a 'typical' example of something within a category, and prototypical features are the most representative features of an object or category. It has been found that most of the categorization errors children make are due to the complexity of the prototypes [1].
The model was multi-layered and contained input and output units assigned to images and labels. To train the model, an image and a label were presented separately until they were trained to an output unit [1]. The results showed overextensions and underextensions in both production and comprehension of the images and labels [1]. In other words, the model could mimic categorization and prototyping, but was unable to simulate classification behaviour as we see it in humans [1].
This model was important because categorization is a key human skill that allows people to form concepts of the objects experienced in the world. The model shows how prototypes are utilized during human, and especially child, language processes [1].
Elman - 1990, 1993
Elman devised a recurrent network model that was able to assign grammatical categories to words; the model could predict the category of a word based on the word that came before it [14]. Elman's model is seen as an important step toward understanding how lexical and hierarchical structures are represented in a connectionist model [14]. To train the model, sequences of words were presented to the network one at a time; this way of presenting words trains the model to predict the next word [14]. The training inputs ranged widely, from simple active sentences to complex embedded sentences, in order to make the model mimic a child's abilities as realistically as possible. When input is presented to the model, activity runs through the connections to produce the output, and if the output corresponds correctly to the next word in the sequence, the network's prediction is accurate [14]. The network used back-propagation to adjust the connections so that the output would be closer to the target output the next time the sequence was run; this improved the network's predictive power [14].
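The key mechanism of Elman's recurrent network is that the hidden layer receives, alongside the current word, a copy of its own previous activation (the "context" units), so each word is processed in the light of what came before. A minimal forward-pass sketch, with made-up toy sizes and random untrained weights, is:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def srn_step(x, context, w_in, w_ctx, b):
    """One forward step of an Elman-style simple recurrent network:
    the hidden units see the current input plus the previous hidden
    state, which lets information (e.g. subject number) persist
    across a sentence."""
    n_hid = len(b)
    h = [sigmoid(sum(w_in[j][i] * x[i] for i in range(len(x)))
                 + sum(w_ctx[j][k] * context[k] for k in range(n_hid))
                 + b[j])
         for j in range(n_hid)]
    return h  # this becomes the context for the next word

random.seed(0)
n_in, n_hid = 4, 3  # toy sizes; Elman's actual networks were larger
w_in = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
w_ctx = [[random.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_hid)]
b = [0.0] * n_hid

# Feed a 'sentence' of one-hot word codes: the context carries history,
# so the same word yields a different hidden state in a different position.
sentence = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
context = [0.0] * n_hid
states = []
for word in sentence:
    context = srn_step(word, context, w_in, w_ctx, b)
    states.append(context)
print(states[0] != states[2])
```

In the full model, these hidden states feed an output layer trained by back-propagation to predict the next word (later, the next word's grammatical category), as described above.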
One drawback of this network was that it was not very successful, even after it was trained for an extensive amount of time. A major reason for this was the huge variety of possible words that could follow a given input word [14]. Because of this, Elman changed the model so that it would predict the grammatical category of the next word rather than an actual word [14]. The model can do this for simple sentences as well as embedded sentences. For embedded sentences, the network remembers the subject (noting whether or not it is plural) and then generates a verb that agrees with the subject that was presented [14]. Another limitation is that the network receives input that is very well formed and highly structured, whereas the input children receive usually is not; it is not known how the network would respond to more realistic, less well-formed language [14].
This model is important because it furthers our understanding of how sentences are broken up and how words are recognized by children. It can help us understand how humans break a continuous speech stream down into separate, meaningful words [1]. Some believe that the ability to anticipate the next word in a sentence is an acquired trait; this model aimed to show that a simple analysis of the input could lead to its acquisition [1].
Lewis and Elman - 2001
The 2001 model from Lewis and Elman was an extension of Elman's earlier word-prediction model. Lewis and Elman took sentences from CHILDES, a child language acquisition database, and trained the model by presenting the sentences sequentially [1]. Like Elman's original model, this model succeeded in predicting the class of the word that was to follow, but it differed in that it was required to predict the auxiliary that would follow a relative pronoun. It was found that the model never predicted an ungrammatical form [1].
Other Views on Connectionism
There has been debate about whether connectionist models require rules or can be completely rule-free. From the work of Rumelhart and McClelland, we can conclude that their connectionist model of past-tense acquisition cannot completely replace a rule-based approach, although the connectionist approach remains adequate for explaining much of the acquisition of the English past tense. In 1988, Fodor and Pylyshyn argued that symbolic processes are a necessary part of a child's learning system, and of an adult's representational system. As research progressed, however, it was increasingly found that statistical regularities could be extracted from the input and used to help predict language structures [5].
Over the years, models have been improved: limitations of previous models have been considered and acted upon, as seen with Plunkett and Marchman. Alongside the limitations and errors discussed above, there are also advantages to using connectionist models to explain language acquisition. These networks make it possible to generate hypotheses about child development, since they are built to mimic child behaviour and development. And although the networks are simplified, they remain complex enough to simulate brain behaviours and processes realistically.
References
- ^ a b c d e f g h i j k l m n o p q r s t u Cockayne, G. (2008). The connectionist modelling of language acquisition. Unpublished raw data, Applied Linguistics, University of Birmingham, Birmingham, United Kingdom.
- ^ Long, D. L., Parks, R. W., & Levine, D. S. (1998). An introduction to neural network modeling: Merits, limitations and controversies. In R. W. Parks, D. S. Levine & D. L. Long (Eds.), Fundamentals of Neural Network Modeling. Cambridge: MIT Press.
- ^ Gurney, K. (1997) An Introduction to Neural Networks. London: CRC press.
- ^ Elman, J. L. (n.d.). Connectionism and language acquisition. Unpublished raw data, University of California, San Diego.
- ^ a b c d Plunkett, K. (1998). Language acquisition and connectionism. Language and Cognitive Processes, 13(2/3), 97-104.
- ^ Houghton, G. (2004). Connectionist models in cognitive psychology: Studies in cognition. New York: Psychology Press.
- ^ Part six: Complex behavior - language. In J. W. Donahoe & V. Packard Dorsel (Eds.) (1997), Neural-Network Models of Cognition (p. 436). Amsterdam: North-Holland.
- ^ Gasser, M. (1990). Connectionism and universals of second language acquisition. Studies in Second Language Acquisition, (12), 179-199.
- ^ a b Redington, M., & Chater, N. (1998). Connectionist and statistical approaches to language acquisition: A distributional perspective. Language and Cognitive Processes, 13(2/3), 129-191.
- ^ Cangelosi, A. (2005). Modeling language, cognition and action: From connectionist simulations to embodied neural cognitive systems. In A. Cangelosi, G. Bugmann & R. Borisyuk (Eds.), Modeling Language, Cognition and Action (Vol. 16). Toh Tuk Link: World Scientific.
- ^ a b Elman, J.L., Bates, E.A., Johnson, M.H., Karmiloff-Smith, A. Parisi, D., & Plunkett, K. (1996). Rethinking innateness: Connectionist perspectives on development. Cambridge MA: MIT Press.
- ^ Pinker, S. (1994). The Language Instinct. London: Penguin.
- ^ a b c Pinker, S. and Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73-193.
- ^ a b c d e f g h i j k l m n o p q r McLeod, P., Plunkett, K., & Rolls, E. T. (1998). Introduction to connectionist modelling of cognitive processes (pp. 178-202). New York, NY: Oxford University Press.
- ^ Plunkett, K. and Marchman, V. (1993) 'From rote learning to system building: acquiring verb morphology in children and connectionist nets'. Cognition, 48, 21-69.