Monday, August 30, 2021

9b. Pullum, G.K. & Scholz, B.C. (2002) Empirical assessment of stimulus poverty arguments

Pullum, G.K. & Scholz, B.C. (2002) Empirical assessment of stimulus poverty arguments. The Linguistic Review 19: 9-50



This article examines a type of argument for linguistic nativism that takes the following form: (i) a fact about some natural language is exhibited that allegedly could not be learned from experience without access to a certain kind of (positive) data; (ii) it is claimed that data of the type in question are not found in normal linguistic experience; hence (iii) it is concluded that people cannot be learning the language from mere exposure to language use. We analyze the components of this sort of argument carefully, and examine four exemplars, none of which hold up. We conclude that linguists have some additional work to do if they wish to sustain their claims about having provided support for linguistic nativism, and we offer some reasons for thinking that the relevant kind of future work on this issue is likely to further undermine the linguistic nativist position.

59 comments:

  1. I am getting the sense that once again, there is a lack of distinguishing between OG and UG happening here. It seems to me that much of what Pullum and Scholz are arguing based on is OG, where children can make errors and be corrected. But how does this argument hold up for UG? For example, they argue that genre is an issue that needs to be considered, and what type of input the child is getting. There is talk of all of this variability in the input being received, but this variability should only apply to OG. The point of UG is that it is universal, so variability would be irrelevant. While I found the thought-process behind their argument compelling for taking a critical look at whether linguists have proven their case, I find these articles difficult to wade through when UG/OG distinctions aren't being recognized.

    ReplyDelete
    Replies
    1. I think I agree with you.

      One thing that confused me in this article is that all the acquirenda were rules of English grammar. I thought that what Chomsky says is innate is not those language-dependent grammar rules (or we could say surface-structure rules), but more abstract, deep-structure-level rules that are consistent across languages (even though they get manifested differently). On page 30 (in the pdf), Pullum and Scholz quickly touch upon this, saying Chomsky would argue that the real issue with the auxiliary-initial clauses argument has to do with a more general item of knowledge about constituent structure, not word positions. The authors' answer seemed weird to me: they say that it is unreasonable to suppose that a learner of English needs to know those general deep-structure rules. Of course, most learners of English do not know/learn these rules explicitly, but isn't the whole idea of UG that they are born with those rules innately and unconsciously? Aren't Pullum and Scholz doing something circular here, by dismissing UG in their paper that criticizes the poverty-of-stimulus argument? Or am I just misunderstanding?

      Delete
    2. Both of you are right. Failure to make the OG/UG distinction, and vagueness about what the "poverty of the stimulus" really means. Not "few UG violations" but none at all: no negative evidence. Category members only, no non-members, either heard or spoken by the child.

      Hence trial-error-correction learning (supervised learning) is not possible. (Unsupervised learning even less so! Why?)

      Delete
    3. Supervised learning requires feedback. This is not possible because the amount of corrective feedback is not enough for the child to learn. This is the Poverty of the Stimulus.

      Unsupervised learning is passive. This is especially not possible because we have determined that adult language is far too complex to pick up on the rules of UG by just hearing it. So, it isn't just a matter of not getting the feedback, rather, it is too complex.

      Delete
    4. Children never speak or hear UG errors. So they never get negative evidence: no violations of the rules of UG (just violations of the rules of OG). This is the “Poverty of the Stimulus” for UG rules. Without errors, there can be no error-correction: no supervised (reinforcement) learning. You can't learn a category if you don't ever encounter non-members, just members, because you have no way to find the features (rules) that distinguish members from non-members. So since all children already obey the rules of UG, they must be born already “knowing” them.

      Supervised learning is learning from the correlations distinguishing the features of UG-correct and UG-incorrect sentences (positive and negative evidence; members vs. non-members). Unsupervised learning is learning from the feature-feature correlations in what you hear, passively: no trial and error. Children hear (and speak) only positive evidence of UG. Everything they hear and say obeys the rules of UG. The features distinguishing what does and does not obey the rules of UG cannot be learned from just the feature-feature correlations of what does obey the rules of UG.
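      To make this logic concrete, here is a tiny sketch in Python (toy binary "features" and invented data, not a model of UG) of why positive-only evidence underdetermines a category: with members only, every candidate rule that covers the members fits the data equally well, and only negative evidence can prune the wrong ones.

```python
# Toy illustration: positive-only evidence cannot select among rules.
# Each "sentence" is a tuple of binary features; the true (hidden) rule
# is that feature 0 must be 1. All data here are invented.

positive_only = [(1, 1), (1, 0), (1, 1)]  # everything heard obeys the rule

# Candidate rules a learner might entertain:
rules = {
    "feature0 == 1": lambda s: s[0] == 1,
    "feature0 + feature1 >= 1": lambda s: s[0] + s[1] >= 1,
    "anything goes": lambda s: True,
}

# With positive evidence alone, every candidate rule fits perfectly:
consistent = [name for name, r in rules.items()
              if all(r(s) for s in positive_only)]
print(consistent)  # all three rules survive -- the data cannot choose

# One piece of negative evidence immediately prunes the wrong rules:
negative = (0, 1)  # a rule-violating "sentence" (never actually heard)
consistent_with_negative = [name for name, r in rules.items()
                            if all(r(s) for s in positive_only)
                            and not r(negative)]
print(consistent_with_negative)  # only the true rule remains
```

The point of the sketch is only the asymmetry: members alone leave the distinguishing feature underdetermined; a single non-member does the discriminative work.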

      Delete
  2. It certainly seems valuable to bring into question examples of “unlearnable grammar” that are commonly referenced in support of UG theory, but Pullum and Scholz seemed, to me, to reject too hastily the importance of poverty-of-stimulus arguments other than the one they focus on (their focus being the argument that "People attain knowledge of the structure of their language for which no evidence is available in the data to which they are exposed as children” - pg. 14 -> 6 in pdf).

    At the beginning of their article, P+S recognize that they are concentrating on only one aspect of the “many headed” hydra that is a comprehensive UG-supporting argument from the poverty of stimulus (APS). But, having effectively shown deficiencies in this one “head", at the end of the article they seem to suggest that there has yet to be any successful support for the APS, that "defenders of the APS are still searching for a well-confirmed case of the type they need” (pg. 46 -> 38 in pdf).

    The authors do make clear that they are *not* trying to defend empiricism, but still seem to end on the note that the burden of proof lies on defenders of APS in the nativist (UG) vs. empiricist debate. This seems, to me, to understate the significance of other reasons that APS might be true (all the reasons listed at the beginning of the article that P+S laid out and then said they weren’t going to consider).

    ReplyDelete
    Replies
    1. And, I was thinking, the placement of empiricism as the default might come from an underestimation of how different language acquisition is from other kinds of learning.

      “How do children learn language” seems similar to (or really, a subset of) the question "how do children figure out how to categorize their world such that it is intelligible, and why do we all seem to categorize it similarly”.

      But with language, there’s less feedback for "doing the wrong thing with the wrong kind of thing” (misforming sentences) than there is for interacting with objects in the real world. If you miscategorize a cactus as food, you’ll get some painful needles stuck in your tongue. There is immediate, material reinforcement to guide your learning. But since using language is just an interaction with symbolic terms, there’s no feedback on whether you’re doing the right thing with the right kind of thing unless another person tells you that you’re using these symbolic terms incorrectly (or, if you use them so incorrectly that you’re not understandable and so can’t get what you want; but it seems probable, as discussed by Pinker and Bloom - see pg. 29 - that this would not be enough to guide the development of correct grammar). Additionally, in our general categorization of the world, there aren’t a lot of mistakes that you *never* see occurring. You see kids touching cactuses, eating dirt, or trying to pet wild raccoons all the time, whereas you never hear a kid say "Is the dog that in the corner is hungry?” Basically, it seems that language involves a learning process different in kind from most other learning that occurs during development, so even if assuming empiricism as the default explanation for much of learning might be the way to go, it’s not so clear that, in the case of language, this is true.

      Delete
    2. Hi Caroline, you make a very interesting point when saying that learning language can be seen as a subset of categorizing the world: “How do children learn language” seems similar to (or really, a subset of) the question “how do children figure out how to categorize their world such that it is intelligible, and why do we all seem to categorize it similarly”.

      I agree with your point about the feedback for incorrect use of language being far less noticeable than the feedback for "doing the wrong thing with the wrong kind of thing" with objects in the world. I suppose this has to do with the idea that language is a set of symbols one level removed from the environment, whereas when learning about objects in the environment, we are directly interacting with the environment to form the category. Direct feedback from the environment, such as eating a cactus causing pain, is more dramatic than the feedback often received from incorrect language use.

      Delete
    3. I think you will see what is at issue much more clearly if you think about learning grammar, rather than learning "language," and you distinguish between OG and UG.

      If all grammar were OG, it would simply be learned, by unsupervised learning, supervised learning or instruction, just like the rules of chess or checkers. Right moves, wrong moves, and feedback. But for UG, there are no wrong moves, hence no feedback. That's the Poverty of the Stimulus.

      Can you still make the points you were making once you take these distinctions into account?

      Delete
  3. ''it is calculated that a child in a working-class family will have heard 20 million word tokens by the age of 3, and a child being raised in a family on welfare will have heard only 10 million (p. 132).''


    The wide difference in the number of words that a child hears depending on their environment, coupled with the fact that you don't see more UG errors in children from families on welfare, really shows how strong UG is. The authors try to use this fact to show that there is enough data (even for children from families on welfare) to learn the generalization of language (UG) in a data-driven manner. What they fail to mention is that if that were the case, we would see a difference in (UG) error-making between the two groups of children. In other words, if children do learn the generalization of language in a data-driven way, shouldn't the children from working-class families have a better grasp of the generalization? What we see is only a difference in OG between the two... -Elyass

    ReplyDelete
    Replies
    1. True, but since no one makes UG errors at all, the comparison is irrelevant (unless we don't make the OG/UG distinction, as Pullum doesn't).

      Delete
  4. The steps from acquirendum characterization - lacuna specification - indispensability argument - inaccessibility evidence - acquisition evidence seem to be a solid schema for analyzing the argument from the poverty of the stimulus. In order not to misread the article, at each analysis I needed to think again about the specification of the APS the authors were analyzing, which was a challenge. This framework led me to see that their suggestions, mathematical learning theory and data-driven corpus research, would be meaningful as future directions in research.

    ReplyDelete
    Replies
    1. Besides the UG/OG problem my classmates have brought up, I considered case 2, auxiliaries. On pg. 30, the authors present linguists' correction of Kimball's proposed rule and say 'All of this can be learned from examples containing one item acting as head of the complement of another' (p. 31). Isn't this a 'deeper' structure?
      However, I am not sure if this can be used to counter the authors' analysis, unless children quickly learn how to apply this correctly without hearing many ("sufficiently many") examples of the 'parochial' (p. 31) subcategories.

      Delete
    2. Grammar rules that can be learned by unsupervised or supervised learning are OG. UG is what's left over -- and it's plenty.

      Delete
  5. The discussion in the lecture today, regarding a mechanism that allows our trial-and-error experiences to be successful, was interesting. The fact that we need a mechanism that highlights certain features while ignoring others (in order to categorize correctly) reminds me of the computational Pandemonium model of pattern recognition. Although the model was constructed to describe how we process visual stimuli, there are many similarities between this model and the characteristics necessary for such a mechanism that were described in class. Primarily, the “feature demons”, which respond to specific features, are akin to our feature detectors that recognize differences between two things. The “cognitive demons” are responsible for patterns and ‘listen’ to the feature demons to determine how closely the features align with the pattern. These ‘cognitive demons’ are similar to the distinction between categories: the features will be more closely aligned with some categories than others, so we are able to narrow down the possibilities of membership. Finally, the “decision demons”, responsible for the final decision about which pattern is most likely to be the one we are perceiving, are like the final decision we make regarding a thing's membership in a category. We are most likely to decide that something belongs to one category based on the similarities it shares with other members of that category. To conclude, I think the Pandemonium model of pattern recognition exemplifies what we would need from a brain mechanism in order to have error-free categorization based on sensorimotor experimentation.

    ReplyDelete
    Replies
    1. The test of a learning model is what it can actually learn. Pandemonium was an early precursor of today's unsupervised and supervised learning models. The Perceptron was too. In one way or the other, all "pattern-learning" models are category-learning models, and what they learn is to detect which features distinguish members from non-members.
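      For concreteness, here is a minimal perceptron sketch (invented toy data, not any specific model from the course) of the kind of error-corrective, supervised category learning that Pandemonium prefigured: weighted "feature demons" whose weights are adjusted by feedback until members and non-members are separated.

```python
# Minimal perceptron: supervised category learning by error-correction.
# samples: feature tuples; labels: 1 = member, 0 = non-member.
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    w = [0.0] * len(samples[0])  # one weight per feature "demon"
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred                      # error-corrective feedback
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Toy category: members are exactly those items with feature 0 present.
X = [(1, 0), (1, 1), (0, 1), (0, 0)]
y = [1, 1, 0, 0]
w, b = train_perceptron(X, y)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0 for x in X]
print(preds)  # matches y once the separating boundary is found
```

Note that the update rule only fires on errors: with members alone (no non-members, hence no errors of commission to correct), the weights could never come to encode what distinguishes the category.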

      Delete
  6. I am still finding OG and UG a little bit confusing!

    I’m fairly sure that OG can be learned passively, through unsupervised learning. UG, on the other hand, cannot be learned passively. This is because no matter what a baby overhears or witnesses, the poverty-of-the-stimulus issue is pervasive. However, children also cannot learn UG through supervised learning. This is evidenced (according to the readings) by the fact that for this to happen, children would have to make UG mistakes and be corrected and taught the correct grammar. According to the readings from this week and last week, this does not happen. Therefore, there must be some inborn capacity in children to do this before they are ever taught language. Where I am confused is when it comes to the parameters of UG. As discussed last week, an example of a parameter could be whether or not a language drops a pronoun at the end of the word by turning it into inflection. How can parameters of UG be learned when UG is innate?

    If someone could explain this to me I would really appreciate it! I am confused about the components of UG that are universal and inborn versus parameters of UG that are learned and the distinction between these concepts. How are the parameters of UG different from OG?

    ReplyDelete
  7. As brought up by other students, it appears that the authors of this text do not believe that negative feedback from one’s environment is significant/necessary for language acquisition. The text argues that the properties of a child’s environment can only provide them with positive language data exposure, but, as brought up in Pinker’s text, children do get negative feedback in the form of corrections from adults or from the general misunderstanding of others. I would also argue that children learn a lot through imitation (mirror neurons?) and that their use of language will refine itself to conform to the conventions of those around them. While reading this text, I was led to reflect on the necessity of negative data in language acquisition. Personally, I believe that negative feedback is necessary, as UG only provides individuals with the necessary hardware to acquire language and has tremendous flexibility (which leads to linguistic diversity). As much as you can learn through imitation and never have anyone tell you your sentences are not grammatical, someone (who would have started a long chain of imitations) would have needed to have learned or constructed a language which differentiates grammatical from ungrammatical sentences in order for people to understand each other. The way a language is mastered, regardless of the means of acquisition, partly relies, in my opinion, on understanding the distinction between what is right and wrong in a language, which can only be done through negative feedback.

    ReplyDelete
    Replies
    1. Hi Camille, I think you make some very interesting and valid points here! I also believe that negative feedback is necessary for language acquisition, and is quite unavoidable in a child’s environment. The authors state that children’s data-exposure histories are positive only, as they are not given negative feedback on what is ungrammatical. However, I agree that they will be given this feedback if an adult corrects them if they say something ungrammatical. I think there could also be non-verbal cues that could serve as negative feedback, such as facial expression and body language that might indicate to a child that they are not using language correctly. I also think that your point on imitation is very interesting, and I also believe that it plays an important role. For instance, there were many phrases and words that I learned from my parents as a child that peers would not be familiar with. I had believed they were common phrases since I heard my parents saying them often, but not many other people were familiar with them. This shows the direct influence our environment has on our language, and that imitation likely does play a role.

      Delete
    2. I thought your inclusion of the concept of mirror neurons was a really interesting point that I had not thought about! Mirror neurons are inborn and innate, and I think they could potentially provide some support to the internal mechanisms of UG because they allow for the intake of unspoken rules. While I don't think mirror neurons are exactly UG, I do think they could be a supplementary innate feature of cognition that allows for not just imitating sounds and words, but also providing evidence to the brain of what language is, and then maybe there is some kind of feedback system with the UG that helps language skills develop.

      Delete
    3. Camille, Pinker fails to distinguish OG corrections (plenty, because the child makes plenty of OG errors) from UG corrections (none, because the child makes no UG errors). Many OG rules are simple enough so the child can learn them through unsupervised learning too, through mere observation and imitation. (You could learn to play tic-tac-toe with neither supervision/reinforcement [error-corrective feedback] nor verbal instruction, through mere observation; but UG rules are too complex and general for that to work.)

      Evelyn, no error-correction for UG errors, because the child makes no UG errors, just OG errors. (Imitation works for simple OG features, such as vocabulary, pronunciation, idioms, and “sayings,” like the Latin sayings I had often heard my [Hungarian] parents use, but my [Canadian] school-mates had not.)

      Leah, “mirror neurons” can only help with rote imitation, as with phonology, and parrots. It won’t tell you “John is easy to please Mary” is wrong.

      Delete
  8. Something that I found interesting in this reading was the consideration of differences in grammar between dialects in the same language. This is discussed on page 18: “There is some evidence that no universal generalization can describe which plurals can occur in which types of compound… British dialects favor regular plural non-head elements considerably more than American dialects.” They give the example of American English using the phrase ‘a drug problem’, while British English uses the phrase ‘a drugs problem’. Another example of this that springs to mind is the American/Canadian dialect using the term ‘math’, and British English using the term ‘maths’, which would seem grammatically incorrect in Canada. This is interesting since this is still the same language without much difference between dialects, and yet there are still slight grammatical differences. They use this to go against the idea that universal grammar dictates the principle, and I think this shows the limits to UG.
    I also think these slight grammatical differences between dialects are a very interesting consideration in the topic of language acquisition. In particular, I think this is interesting for cultural and environmental influences.

    ReplyDelete
    Replies
    1. These are all just simple examples of learned and learnable OG.

      Delete
  9. Great example Evelyn, I agree that grammatical differences between dialects are a very interesting topic! However, I would argue that this is more an example of Gordon mischaracterizing this principle as part of UG than a dig at UG as a whole. There are various properties of grammar without universal generalizations, and just because varieties in plural elements cannot be included in UG does not mean that other principles cannot be.

    ReplyDelete
    Replies
    1. Yes, the failure to distinguish (learnable) OG from UG runs through both the Pullum article and the two Pinker articles.

      Delete
  10. Universal grammar is intuitive and innate. Despite the diversity of environments in which children grow up acquiring language, UG remains consistent and invariant; the variability exists within OG. Dialects of the same language depend on environment and culture, but these dialects vary only in OG, not in UG. As children learn to speak, they make OG errors despite hearing only correct OG. They will, however, never make UG errors, indicating that they subconsciously “know” UG already. OG, on the other hand, is learned through unsupervised learning, trial and error, and corrective feedback.

    ReplyDelete
    Replies
    1. Hi Lola. I agree with your understanding of the distinction between UG/OG as far as it relates to the necessity of learning their rules. However, in the 9a reading, we learn that UG still has parameters that we must learn to set. This is a question for myself, but perhaps you also know the answer: why is it still called innate if there are still certain boundaries we need to learn? If we can't simply understand them the way we do the other grammar rules covered by UG, wouldn't they be considered not UG but rather OG?

      Delete
    2. See Iris's reply about parameter-setting and UG/OG.

      Delete
  11. This article broadened my understanding of the stimulus poverty arguments which, up until now, I thought were very compelling. What stood out to me the most was Pullum and Scholz's breakdown of the premises that formed the poverty-of-stimulus arguments in the first place, and a proper definition of the term itself, which says that “people attain knowledge of the structure of their language for which no evidence is available in the data to which they’re exposed as children”. By outlining the properties of the child’s environment and accomplishments which have given rise to the poverty-of-stimulus arguments, I now understand that these premises are insufficient to establish the falsity of empiricist claims about language learning.

    However, I still have some unanswered questions and confusions about what’s needed for language acquisition. Remarkably, the authors of this text seem not to accept that negative feedback from one’s environment is necessary, holding that positive feedback is the only thing possible given the properties of the child's environment. But I still think it's important to consider how children typically do receive an abundance of negative feedback from things like their parents' corrections, or learning by observation through conversations with peers. It’s difficult to accept that negative feedback is not at all necessary for language acquisition when it’s such a constant and inevitable part of a child’s development.

    ReplyDelete
  12. I had a question about Wk 9's lecture, when we continued discussing the dictionary.
    My understanding after class and looking at comments from 8b:
    kernel: we get this by removing words not involved in defining other words;
    what remains defines everything inside and outside it
    minset: satellite words + kernel core; the smallest set for grounding;
    not a dictionary, because it cannot define all the words within itself,
    but it has the potential to define every word
    satellite: not a dictionary
    kernel core: defines the in-words
    What lies between the minset and the kernel? How would we characterize this portion of the dictionary? I am probably not getting how we get from the kernel to the minset.

    ReplyDelete
    Replies
    1. Hi April! In regard to your question, I would say that kernel words are the words that form the basis of our language, consisting of satellite words and core words. MinSets, as you have written above, are the minimal grounding sets: with them, theoretically, we are able to ground everything. MinSets are constituted by satellite words and core words, but each uses only a part of all the kernel words; different MinSets use different kernel words, and not all kernel words are included in any given MinSet. As shown in the graph on the slides, the kernel forms the core of the dictionary, and the MinSets exist within the domain of the kernel words.
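      To make the kernel idea concrete, here is a small sketch (with an invented four-word toy dictionary, not real data) of the step April describes: repeatedly discard words that are defined but never used to define any remaining word; whatever survives is the kernel, which can define everything inside and outside itself.

```python
# Compute the "kernel" of a toy dictionary by recursively removing
# words that are not used in any remaining word's definition.
def kernel(dictionary):
    """dictionary: {word: set of words used in its definition}."""
    words = dict(dictionary)
    while True:
        # words appearing in at least one remaining definition
        used = set().union(*words.values()) if words else set()
        removable = [w for w in words if w not in used]
        if not removable:
            return set(words)          # nothing left to prune: the kernel
        for w in removable:
            del words[w]

# Invented toy entries for illustration only:
toy = {
    "good": {"thing", "want"},
    "thing": {"thing"},          # self-referential core word
    "want": {"thing", "good"},
    "bad": {"good"},             # defines nothing else -> pruned
}
print(sorted(kernel(toy)))  # ['good', 'thing', 'want']
```

Finding a MinSet is a harder problem (a smallest set of words that breaks every definitional loop), which is presumably why the course treats it separately from the kernel.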

      Delete
  13. I’m thinking back to what Professor Harnad said in the lecture last Friday (the Nov 12th lecture), where he claims that language in our species did not begin with speech, but began with gestures. There is evidence of gestures used as language in the formation of different sign languages around the world. However, I was wondering if there was any evidence of gestural languages pre-dating the emergence of spoken languages (or if this would even be possible to see in historical documents or artifacts). If this is true, would that mean the human species was mute when we first evolved and then eventually evolved to have spoken languages?

    ReplyDelete
    Replies
    1. I don’t think that using gestures as a language before spoken language would necessarily mean that we were mute when we first evolved. I think it’s possible we still had vocal cords and made noises, just our communication was visual first before it became auditory. We still had the ability to, just not the skill yet. Eventually, we could have realized that verbal communication allowed us to communicate in different ways and thus we adapted to incorporate it more.

      Delete
  14. One thing I’m wrestling with after reading the paper and all the skywritings is the concept of negative evidence. From the readings and the lectures, I understand that we often see violations of OG but not UG, and I understand this to be because there is no such thing as negative examples of UG (the poverty of the stimulus argument). That being said, why do we distinguish positive from negative examples of UG in the first place? If there is no such thing as a negative example of UG (an unthinkable thought), why do we need to specify “positive” vs. “negative”? Isn’t every example of UG we see positive, thus inherently eliminating the need for a “negative” label?

    ReplyDelete
    Replies
    1. Hi Emma!
      I think negative evidence is an indispensable component of supervised learning, and that's how "trial and error" learning happens. For example, if we only receive positive feedback about what is an edible mushroom, we still do not understand the "edible mushroom" category -- we need to know the features of inedible mushrooms to learn what an edible mushroom is. Similarly, if we learned UG through supervised learning, we would need negative evidence. But since we never receive negative evidence for UG while acquiring language, we did not learn it through supervised learning. Thus, we must be born with UG.

      Delete
  15. Certain elements of the poverty of stimulus argument, as well as of the distinctions between OG and UG were made clearer to me by the discussion around this week’s reading (although the reading itself did not retain a strong enough distinction between OG and UG). I still, however, have some reservations about accepting the poverty of stimulus argument for UG. So certainly, without negative feedback it would have been impossible for a child to learn the rules of UG, such as the example Harnad gave last week:

    UG:
    John is eager to please Mary (UG+)
    *John is easy to please Mary (UG-)

    This is because this kind of sentence is never produced by the child. I suppose maybe my confusion can be chalked up to a lack of knowledge (explicit knowledge, that is) about the rules of UG. For this particular example, I feel as though the incorrectness of the second sentence could stem from the fact that “easy” is a word to describe actions/activities, and not people (unless we are referring to their sexual proclivities). Maybe there is something I am missing here -- or perhaps I need to hear more examples of violations of UG in order to have a better grasp of the strength of the argument for its existence. Does the fact that some sentences never occur necessarily have to point to a giant underlying structure like UG? Or can these examples of UG-violating sentences that would never occur to a child be explained by something else, perhaps more related to the semantic content of the words than to the structure of UG?

    ReplyDelete
    Replies
    1. Hey Sofia,
      I think it is perfectly natural to have questions like yours. As we repeatedly discussed language throughout the course, I think the [confusion between UG & OG] and [doubts about UG] arise from the fact that we all speak a language. Thus, we all have an intuitive idea of what language is and how we learn it, because we have gone through it ourselves (for both 1st and 2nd languages) and seen different kid-sibs go through it numerous times as well. As seen in the readings, even academics like Pinker & Bloom and Pullum & Scholz struggle to make the distinction between UG and OG, and I believe it is for similar reasons.
      It is a bit counterintuitive to think that there could be a part of language that is unlearnable and hardwired. And it is harder because as you said, other than Chomskian linguists, we do not know explicitly what UG is. UG follows different rules from what we do explicitly know, OG, and hence causes all this confusion and doubt.
      Prof Harnad even mentioned that UG was divisive amongst psychologists when first proposed by Chomsky. Pinker (kind of like a middleman), who had to explain what UG actually is to psychologists, wanted to calm the hostility and hence wrote, in essence, “Most of language is learned anyway, and the part that isn’t learnable (UG) could have evolved” (so calm your horses, essentially). And while conveying that message, Pinker oversimplified the problem of the “evolution of language” and completely skipped the controversial argument from the “Poverty of the Stimulus”.
      Also, remember that Chomsky initially started out by criticizing Skinner’s book “Verbal Behavior”, which accounted for language acquisition only through the behaviorist lens of conditioning. Imagine how controversial UG must have been amongst psychologists when the most dominant theory of its time, behaviorism, and its pioneer, Skinner, were being criticized.

      Delete
    2. Now to (sort of) answer your other question: “Does the fact that some sentences never occur necessarily have to point to a giant underlying structure like UG?”.

      So first off, the argument from the poverty of the stimulus (POS) is an “underdetermination of theory by data” (Chomsky, 1975). In other (simpler) words, the language output during language acquisition is underdetermined by the primary linguistic data available in the environment. This is the basis of the POS argument. It is the fact that only UG-compliant sentences are heard and spoken (even during language acquisition) that points to the underdetermination. Language acquisition is SUPER underdetermined by the linguistic data available.

      Second, the fact that “some sentences never occur” shows precisely how UG is underdetermined and cannot be learned. As we saw in the categorization lectures, there are three ways of learning: unsupervised, supervised, and hearsay/instruction.

      The absence of negative evidence makes unsupervised learning impossible because everyone speaks UG-compliantly. Children cannot passively induce the features that distinguish UG-compliant from UG-violating sentences when everything they are exposed to is UG-compliant and they never encounter any UG-violating input.
      The absence of negative evidence makes supervised learning impossible because there can be no error-correction if children only ever speak UG-compliantly. Thus, because this “absence of negative evidence” (the poverty of the stimulus) shows that UG cannot be learned, it follows that such grammatical rules must be inborn. Hypothetically, if we were NOT born with UG, we could still learn it through our usual learning methods (thanks to lazy evolution) if we had enough negative evidence. However, because we do not have any negative evidence, we know that it is NOT learned.
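      To make the positive-evidence-only point concrete, here is a toy sketch (my own illustration, not from the paper or the lectures): two hypothetical “grammars”, one strictly narrower than the other, both accept every sentence the learner ever hears. Positive data alone can never decide between them; only negative evidence (a sentence marked as ungrammatical) could.

```python
# Toy "sentences": strings over the alphabet {a, b}.
# Two candidate grammars, both purely illustrative:
narrow = lambda s: s == "ab" * (len(s) // 2)   # accepts only (ab)^n
broad  = lambda s: set(s) <= {"a", "b"}        # accepts any a/b string

# Everything the "child" ever hears is compliant with BOTH hypotheses:
heard = ["ab", "abab", "ababab"]

consistent_narrow = all(narrow(s) for s in heard)
consistent_broad  = all(broad(s)  for s in heard)

# Both hypotheses fit all the positive data, so the data cannot
# decide between them -- the grammar is underdetermined.
print(consistent_narrow, consistent_broad)  # → True True
```

      A single negative example, such as being told that "ba" is unacceptable, would rule out the broad grammar; the POS argument for UG is precisely that no such negative evidence ever arrives.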

      Delete
    3. Just to add one more thing, this absence of negative evidence, or the presence of only positive evidence, is one of the reasons why UG is met with an evolutionary problem (where other aspects of language are not). Just like how UG cannot be learned by us, it could not have been learned by evolution either: “It is not at all clear what would serve as error-correction, and what would count as right and wrong, in order to shape UG in the usual Darwinian way: through trial and error genetic variation, and adaptive selection on the basis of advantages in survival and reproduction” (Harnad, 2008).

      Another reason why UG is met with an evolutionary problem is because it has “no apparent adaptive advantages. The absence of a biological advantage for UG is an even greater handicap than the poverty of the stimulus” (Harnad 2008).

      Although I can’t comment on whether UG-violating sentences relate to the semantic content of the words, hopefully this clarifies the argument of the POS.

      Delete
    4. Thank you, Iris, for this thoughtful and very helpful response !!

      When we talk about the evolutionary problem for UG, the problem sounds a lot like the one we met when discussing propositionality: how did the ability to talk about something in the abstract/generally/that is not immediately present arise so suddenly in the evolution of humans? Or, how did gestural communication evolve into propositional language (which is just a fancy way of saying language !!)? Could a similar explanation be possible for both UG and the emergence of propositionality -- are they essentially the same thing?

      Anyway, your answer as to POS gave me a lot of food for thought !!

      Delete
    5. I have been having similar questions and I found your clarification, Iris, to be very helpful in resolving them for me. I had been questioning for quite a while the extent to which any linguist actually knows the rules of UG. In all the readings we have done so far, I have yet to see a rule described that is shared by all languages. As you point out, it may rather be that we cannot describe the rules of UG because they are inherently, as you say, underdetermined and cannot be learned. In adulthood, as scientists and linguists, we struggle to identify the features of UG that are so deeply ingrained. For want of negative evidence of UG, we cannot define it or distinguish it from what is not UG, hence the difficulty in its description. This can lead to criticisms that UG and the poverty of the stimulus arguments are unfalsifiable, and therefore pseudoscientific by Popper’s demarcation criterion. Some may argue that UG rules are simply observed features defined after the fact, rather than hypotheses that are predicted and tested. But in defense of UG, one may rebut this criticism by noting that the poverty of the stimulus predicts what children may or may not be exposed to in their language experience, and what they may or may not learn. Based on the arguments you highlighted above, it must logically follow that UG cannot be learned.

      Delete
    6. This thread was incredibly helpful for understanding and clarifying the doubts I was facing in fully grasping the distinction between OG and UG. Given that there is no negative evidence from which one could learn UG, how would one then go about teaching or coding this when trying to get a Turing machine to learn language? Since there is no negative evidence for UG, one not only has to assume that it is an innate mechanism; I also question how one could “teach/integrate” it into a robot. When trying to reverse-engineer a machine similar enough to a human to pass the Turing test, I wonder how teaching it to learn language will be affected if the UG humans have cannot be “passed on” or integrated into the robot, seeing as we are unable to teach it.

      Delete
  16. I feel like I have a better understanding of the reasoning for and against linguistic nativism after reading both papers, but I still have some fundamental doubts about how this material applies to reverse engineering cognition. The overall theme of these readings seems to be whether we can conclude that people learn the structure of their language despite the paucity of evidence available to children during the language acquisition process. This raises the question of how much language acquisition depends on intrinsic mental systems. I am confused about how knowing the degree to which linguistic nativism is correct helps us build T3. T3 obviously needs language to do everything we can do, but do we need to worry about how we acquire language in order to derive its mechanism? I understand that language serves a wide range of functions, including symbol grounding and categorization. Language supplies the symbols that are “grounded” in components of the actual world, hence the ability to acquire categories through instruction is obviously language dependent. As a result, T3 needs language skills (but we already knew that). However, if Pinker’s linguistic nativist argument turns out to be right, or wrong, what changes about how we conceptualize T3?

    My guess is that UG is what we’re really interested in, because if UG is needed to pass T2, then it must also be built into T3. The implication might be that a T3 robot that has categorization capacity (i.e. can learn by induction or by unsupervised learning) might still not have language capacity.

    ReplyDelete
    Replies
    1. Hi Melissa! I enjoyed reading your comment because I am thinking in a similar way. I think I have strengthened my understanding of UG as an innate feature we all have to understand and use grammatical rules. Because there are no contradictions to UG, there is no negative evidence for it to be due to learning (unsupervised or otherwise), so it is chalked up to an innate feature of human development.
      I wonder then, like you said, how this relates to the issue of reverse-engineering consciousness. In order to develop a T3 robot, we need it to be able to cognize and use language indistinguishably from us. To do this, we may need to understand more about the modularity of UG. Looking at the functions of UG (a way to increase the speed at which we develop language, which is meant to serve as a social tool for survival and reproduction, as well as a way to represent complex thoughts) could help us get there, as could considering theoretical advantages to UG (although I understand from earlier skywritings and Harnad’s writing that there is “no apparent adaptive advantage” to UG, which creates an even larger problem than the poverty of the stimulus argument).
      My thinking is that the OG/UG distinction complicates what we already know about reverse-engineering consciousness because it suggests there is an inborn feature that we know very little about, and that there is no apparent way for it to have evolved. This casts more doubt in my mind that we are on our way to having sufficient understanding to reverse-engineer consciousness without first answering more of these problems. Additionally, could the concept of UG suggest that there exist more inborn features of human experience that we are not aware of, due to the absence of negative evidence? How would this further complicate the issue of reverse-engineering consciousness successfully?

      Delete
    2. Hi Melissa and Madelaine, Prof. Harnad posed a question before about whether GPT-3 has UG, to which I respond yes, because GPT-3 seems to be able to command the syntax of many languages, not just English, which means its ability to learn the syntax of any language is at least partially on par with humans’. Of course, this does not mean it actually understands language, because of the symbol grounding problem. But since UG implies only the ability to command the syntax of any language, I am prone to believe GPT-3 already has UG.

      Delete
    3. Hi Zilong. On the contrary, I don't think GPT-3 has UG. From my understanding, since UG is something inherent, we cannot even describe its rules. To simulate the process of language acquisition in a computer program, we usually build in some assumptions about the units of language, which cue our 'machine' to look for the patterns worth paying attention to. However, since UG is not describable, I can't see how we could enumerate all of UG and put this information into GPT-3, not to mention that we don't even have enough negative evidence to learn UG. Therefore, from my point of view, GPT-3 is still an input/output T0 robot.

      Delete
  17. “What if we locked 8 babies in a room for 20 years and see if they make their own language”
    This question was raised in a meme posted in the chat after the lecture in which Prof. Harnad asked: what is the thing that we have, but animals don’t, that gives us language? Surely, it is neither practical nor ethical to answer the question by locking babies up for 20 years. However, it reminds me of the case of Nicaraguan Sign Language that Prof. Harnad mentioned, which can be seen as evidence for the gesture theory, suggesting that human language developed from gestures instead of vocal signals.
    In our lecture, Prof. Harnad proposed that it’s a mistake to look for the origin of language in the origin of speech. However, when I look into the case of Nicaraguan Sign Language, I wonder if it could actually be seen as evidence for the gesture theory within the frame of our class discussion. One issue is that, without doubt, this ‘experiment’ shows that children collectively possess the capacity to learn and create ‘language’, but it does not explain how these students developed the pidgin-like form of the language into a sign language with a higher level of complexity.
    Later analysis of the language the young children developed shows that its spatial modulations, which are the building blocks of grammar in sign language, were signed more frequently by the later-exposed signers. More significantly, when describing complex motions, the early-exposed signers signed the motions simultaneously while the later-exposed signers signed them sequentially, indicating that this combinatory change is the ‘thing’ that denotes a shift from gestural to language-like expressions. But how did they learn not only the signs for words and ‘verb agreement’ but also the other conventions of grammar so fast?
    To answer this, I tried to investigate this case further to find out when this more complex system emerged, and whether it was spontaneous. Before they came to school, these deaf children were using simple home-sign systems and gestures to communicate with others: they do not seem to have had the ability to claim something or say that something is true. And the language scheme that the teachers first applied to these children was not suitable; many children failed to grasp the concepts of words. So, did they learn ‘proposition’ just by communicating with one another instead of being linguistically connected (under supervised learning) with their teachers? Clearly, the case of Nicaraguan Sign Language, the process from pantomime to sign language, favors the gestural origin of language. However, given the question I had, I think gestures might be necessary, but not sufficient, to give us the power of creating and using language.

    ReplyDelete
  18. From what I’ve understood in learning about the poverty of the stimulus argument, Chomsky’s primary coup in formulating it was that it was a refutation of previously dominant behaviorist theories of language acquisition through operant conditioning. Pullum and Scholz seem to get bogged down in the grammatical analysis of components of English sentences such as word order, which have nothing to do with Chomsky’s APS and UG, but do have to do with ordinary English grammar. I will admit that some of this text was lost on me, since it was very dryly written. But I do wonder, since it is impossible to formulate a sentence that violates UG, how one would even formulate a proper argument against it.

    ReplyDelete
    Replies
    1. Hi Milo,

      I agree with your final statement that it may be difficult to formulate a proper argument against UG if it is impossible to create a contradictory sentence - this would thus be a paradox of sorts. Therefore, I believe this is why the main arguments against UG are based on biological and evolutionary considerations - as Iris mentioned above, "just like how UG cannot be learned by us, it could not have been learned by evolution either", as well as the fact that there is no clear biological advantage.

      Delete
  19. When Chomsky put forward the APS, he intended to show that behaviorism alone is not enough to explain how children learn language. Children seem to abide by the rules of UG without any supervision, and they learn language so fast that merely receiving positive or negative feedback would not be enough to account for such a rate of learning. Chomsky coined the term UG for a form of grammar that is already there at birth.

    However, I find it questionable whether this is a completely different approach from behaviorism. For behaviorism, what happens in the brain is part of the ‘Black Box’, and of no major concern to the behaviorist – all they care about is the input and the output. Chomsky says it’s not a Black Box, it’s UG; but does giving it a different name make it any clearer how and why this innate language ability is there? True, he did say that behaviorism alone cannot be the answer, but what UG truly is still remains elusive.

    ReplyDelete
    Replies
    1. Hi Shandra, I really relate to the points you make about the mysteriousness of UG, and whether giving a label to something we can’t define/examine is useful at all.
      Clearly, Pullum & Scholz failed to distinguish UG from OG in this article — and while this is a pretty significant thing to overlook on their part, deep down, I can’t help but sympathize. UG is so elusive, as you said, so there really is no way to properly argue for or against it.
      In fact, how can we distinguish UG from OG when we don’t know the mechanism by which UG operates? And if we don’t know what UG is not — because of the absence of sentences that are not UG-compliant — how can we distinguish UG from non-UG? As you mention, calling this black box “UG” doesn’t really shed light on what goes on inside it, and doesn’t help distinguish UG “members” from “non-members”.
      It seems that speaking of UG in relation to OG is the most comprehensive way of speaking of UG at all; but in doing so, anyone runs the risk of conflating the two, which I think is what happened here with Pullum & Scholz.

      Delete
  20. The poverty of the stimulus argument states that when a child first ‘learns’ a language, there are not enough environmental/linguistic stimuli for them to learn the language by trial-and-error, or by purely associative learning. Children do not learn language through ordinary grammatical structure (the way adults learn a second language); rather, an inherited language-acquisition ability allows them to learn their mother tongue at a surprisingly high speed. I think the reason why a newborn, or anybody, will face the problem of the poverty of the stimulus in language learning is that there are infinite ways of expression, and there will never be enough stimuli for language, even when we are learning OG, not UG, as adults.

    ReplyDelete
  21. Basically, this paper wants to show that linguists have not provided the reasoning behind the claim that children are born with knowledge of language without having to learn it through experience. As Chomsky suggested, the best way to study this would be to search for cases where it would be most likely impossible for children to experience the language structure tested. Another important thing is to determine what is enough experience to learn a language structure and what is not. To answer this question, we need to know the utterances the infant is exposed to and where the infant’s attention is. Lastly, the paper mentions that two specialties will be required to resolve the problem: mathematical learning theory and corpus linguistics. Mathematical learning theory is useful for determining bounds on how much experience is needed for children to learn language structure, and corpus linguistics is useful for understanding the typical content of language (corpora) and being able to say what is accessible and inaccessible to children.

    ReplyDelete
  22. So based on this reading, and on past readings, my understanding is that the “something innate” which helps children not make errors is Universal Grammar, whereas Ordinary Grammar is learned. Since UG is assumed to be innate, reverse-engineering it would be futile since, again, it is innate. And if it’s not possible to reverse-engineer, then it wouldn’t be possible to deploy this in a robot?

    ReplyDelete
  23. I am still a little bit confused about the poverty of stimulus argument that universal grammar (UG) is unlearnable. The argument sounds circular to me.
    According to this argument, ordinary grammar (OG) can be learned through unsupervised or supervised learning or through instruction. However, UG cannot be learned, and thus is innate. The reason is that, in the case of OG, we can make mistakes and be corrected by parents. However, in the case of UG, we cannot make mistakes, and thus have no negative evidence—everything we say is in accord with UG, so we do not know what does not count as UG. However, without negative evidence, we cannot learn a categorization. Hence, the conclusion is that UG is not learned, but inborn.
    The part I do not fully grasp is why we cannot make mistakes on UG. If it is because, by definition and construction, it is something we must follow and on which we cannot make mistakes, then we have already presupposed it to be something inborn. It seems to me that we have presupposed from the beginning that UG is a set of abstract principles, and that under the guidance of these principles, the environment sets the parameters and we learn OG. Hence, by hypothesizing the existence of UG, we have already hypothesized something unlearnable.

    ReplyDelete

PSYC 538 Syllabus

Categorization, Communication and Consciousness 2021 Time : FRIDAYS 11:35-2:25  Place : ZOOM Instructors : Stevan Harnad & Fernanda Pere...