Monday, August 30, 2021

6b. Harnad, S. (2003b) Categorical Perception.

Harnad, S. (2003b). Categorical Perception. Encyclopedia of Cognitive Science. Nature Publishing Group/Macmillan.
Differences can be perceived as gradual and quantitative, as with different shades of gray, or they can be perceived as more abrupt and qualitative, as with different colors. The first is called continuous perception and the second categorical perception. Categorical perception (CP) can be inborn or can be induced by learning. Formerly thought to be peculiar to speech and color perception, CP turns out to be far more general, and may be related to how the neural networks in our brains detect the features that allow us to sort the things in the world into their proper categories, "warping" perceived similarities and differences so as to compress some things into the same category and separate others into different categories.
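
The "warping" described in the abstract can be illustrated with a toy model (not from the article; the boundary location, warp factor, and function names below are illustrative assumptions). Equal-sized physical differences along a stimulus continuum stay equal under continuous perception, but are compressed within a category and stretched across the category boundary under CP:

```python
import math

def continuous(x):
    """Continuous perception (shades of grey): the percept tracks the stimulus."""
    return x

def categorical(x, boundary=0.5, warp=12.0):
    """Toy CP: a steep sigmoid around the boundary compresses within-category
    differences and stretches differences that straddle the boundary."""
    return 1.0 / (1.0 + math.exp(-warp * (x - boundary)))

# Three stimulus pairs, all separated by the same physical distance (0.1)
pairs = [(0.10, 0.20), (0.45, 0.55), (0.80, 0.90)]
for a, b in pairs:
    print(f"{a}-{b}: continuous {abs(continuous(b) - continuous(a)):.2f}, "
          f"categorical {abs(categorical(b) - categorical(a)):.2f}")
```

Only the middle pair straddles the boundary, so only its "categorical" perceived difference is large; the other two pairs are compressed, even though all three are physically equidistant.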

Perez-Gay, F., Thériault, C., Gregory, M., Sabri, H., Harnad, S., & Rivas, D. (2017). How and why does category learning cause categorical perception? International Journal of Comparative Psychology, 30.

Pullum, Geoffrey K. (1991). The Great Eskimo Vocabulary Hoax and other Irreverent Essays on the Study of Language. University of Chicago Press.

69 comments:

  1. Professor Harnad explains that if CP is innate, then we have categorically biased sensory detectors which have prepared colour and speech-sound categories. This allows us to pick things out of the world and fit them into existing schemas. Theoretically, categorically biased sensory detectors would do this faster and more accurately than if perception were continuous. This made me wonder: if categorical perception (CP) is both inborn and can be induced by learning, is CP an evolutionary mechanism? If CP is the mechanism by which we learn different categories (and much of cognition is categorization), is CP like Chomsky's Universal Grammar? Maybe we are able to modify the boundaries of categorization of primary colours and speech based on learning BECAUSE of some innate ability to categorize.

    1. The "evolutionary mechanism" is the (evolved) capacity to learn (which includes the capacity to learn new categories).

      CP is a subtle side-effect of learning new categories (not yet well understood) that occurs under some conditions and not others. It might be related to how difficult it is to learn the category, and the need to make the category "pop out." It is a weak Whorf-Sapir effect. (What is that?).

      Learned categories are modifiable, but as far as is known, innate category boundaries like basic colour boundaries can only be modified in the short term, as in colour adaptation effects, not permanently.

      Innate speech phoneme boundaries, in contrast, can be modified, mostly through losing (during a critical period in infancy to adolescence) the phonemes that your first language(s) do(es) not use, hence do(es) not need. (Can you explain how this difference between color and speech is related to perception/production sensory/motor "mirror" capacity?)

      It may turn out that learning categories verbally (through grounded verbal definitions) can generate CP (this has yet to be tested), but I am not sure what you have in mind about Universal Grammar (UG) (Week 8 and 9). (What is UG?)

      Another thing to reflect on is that although the nuclear weapon of propositional language is unique to humans, it was preceded by an even more powerful nuclear weapon, which is learning itself, which is shared to varying degrees by all animal species. (What does it mean to say that the capacity to learn evolved because evolution is "lazy"?)

    2. Re: first question about the Whorf-Sapir effect - the Whorf-Sapir effect is that language shapes how we see the world, but because language is not deterministic, it is a weak effect? Weak vs strong refers to the extent to which language is viewed to influence our cognition?

    3. Grace, yes, that's right. But W-S is more about how language influences perception rather than just cognition. Of course language influences cognition (thinking) because if someone tells you something you need to know (like where to go for a covid booster), your thinking has changed: before you were uncertain about where to go, now you know (uncertainty reduction).

      W-S is not about that (nuclear) power of language (to convey information with subject/predicate propositions) but about the influence of vocabulary (and sometimes syntax) on how you perceive the world (what things look like to you). The four most famous examples were wrong (color terms, Inuit snow vocabulary, the Hopi concept of the future, and Mandarin counterfactuals). How were they wrong?

    4. I believe for colours, it was wrong because it assumed categories of colours were arbitrary and learned, thus shaped by language. In reality, different cultures name colour categories quite similarly, and the reading noted that for even those who don't, compression and separation are the same such that people can still differentiate between different shades of green/blue even if their language doesn't. Similarly, non-Inuit speakers would be able to recognize the different "types" of snow that Inuit speakers have different words for.

    5. I tried to answer Prof Harnad's second question on color and speech and got a bit confused. So one difference between color and speech is that speech sounds, in addition to being perceived, can also be produced by humans, while colors can only be perceived. This means that mirror capacities can link the perception of certain sounds with the motor ability to produce them. So hearing a sound could activate a neural network similar to the one used to produce it, and could therefore help us learn how to produce that sound? And if we never hear it in the critical period (if our mother tongue doesn't have that sound), it will be harder/impossible to learn how to make that sound later in life.

      That makes sense to me but it doesn't explain why we also lose the ability to perceive that sound after the critical period. Is that specific to phonemes, as an evolutionary adaptation to free up cognitive space? Or would the same happen to colors too (ex: if a child never sees a particular color during the critical period, it would be harder for them to perceive it later on), the only reason we don't see this phenomenon being that kids necessarily see all colors during the critical period?

    6. In reading the Categorical Perception article and the skywritings in this thread, I was wondering if the results that inspired the W-S were seen not because language is the basis for CP of colors, but due to the color boundaries being modified as the result of learning? This comes from page 4 of the article and is supported by "recent demonstrations" that primary color and speech categories are likely innate, but can change as the result of learning.

    7. I'll try to answer prof Harnad's question on how the difference between innate color CP and modifiable speech CP is related to perception/production sensory/motor "mirror" capacity.

      The reason that we have modifiable speech categories is that we learn language through sensorimotor associative learning, when our mirror neurons can be changed in radical ways by observing others' speaking or when we try to speak. However, if we stop learning other languages after a certain age, we tend to lose certain CPs due to the laziness of evolution (when there is not enough stimulus for associative learning and certain mirror neurons stop firing). For innate color CP, we do not obtain them through sensorimotor learning or mirror neuron firing, but through perception (by our predetermined photoreceptors).

      Yet, it seems that even for color CPs, some of them are continuous and learned (e.g. grey is learned between black and white). I'm curious to know what determines the difference between learned and innate color CPs.

    8. In response to Prof. Harnad's last question (What does it mean to say that the capacity to learn evolved because evolution is "lazy"?), is this getting at the idea that the genes will choose to offload as much work as possible to the environment in the name of efficiency? Thus, because learning is presumably an intensive task, learning evolved because evolution lazily chose to not precode learning within genes, but rather leave it in the environment?

    9. Louise, good understanding, good points. Yes, even vision might have critical periods for “use-it-or-lose-it”. That’s part of the message of Held & Hein’s (cruel) active/passive kitten experiments. But there too the motor production is as important as the sensory input. “Use-it-or-lose-it” pruning is important in many biological functions as well as their computer models. Look up “neural pruning” and “apoptosis.” In competitive pruning the objective is to sharpen perception for the features you need (e.g., in your language) and ignore the ones you don’t.

      Madeleine, the possible changes in color perception are not in the primary colors, apart from short-term changes in the boundary caused by repetition and adaptation. Color subcategories (like chartreuse or vermillion) can be learned, but they are not as powerful as the inborn primary colors.

      Xinti, the main difference between color CP and phoneme CP is that we are able not only to perceive but also to produce phonemes, whereas we can only perceive colors. This is more than just sensorimotor “association,” especially in the case of learned categories vs. inborn ones (“mirror neurons”). (Yes, “pruning” of inborn feature detectors is an example of the “laziness” of evolution.)

      Grace, about “snow terms”: Inuit languages are synthetic and “agglutinative,” which means they wrap into a single word what “analytic” languages like Mandarin and English say with lots of separate words, in a phrase or a sentence. English nevertheless has pneumonoultramicroscopicsilicovolcanoconiosis and German, which is slightly more synthetic, has Kraftfahrzeug-Haftpflichtversicherung. But the point is that Inuit has potentially as many words for “computer” as for “snow” (“snow-that-is-melting,” “snow-that-fell-overnight”) and that the early linguists were as mistaken to try to list them all as dictionary entries as we would be to try to lexicalize every possible description such as “the cat that is on the mat.” What we do lexicalize is the categories that are worth defining in a dictionary for “efficient communication” (e.g. “horse with stripes” = “zebra”).

      Lazy evolution does not “like” to pre-code what can be learned from experience (including verbal experience).

  2. Harnad states: "But, as with colors, it looks as if the effect is an innate one: Our sensory category detectors for both color and speech sounds are born already "biased" by evolution".

    This reminded me of something I learned in a previous linguistics class: that babies are born with categorical perception of all languages, but after a certain age, they are only able to categorically perceive speech sounds that are used in their native language. I wonder how this loss of CP would be categorized, since there's an original innate CP of many speech sounds, but then there is a loss of certain CPs. I also wonder if there would be a way to reverse-engineer this "loss" of CPs, or if it would be the same just to create a program with the CPs of an adult English (or x-language)-speaking human.

    1. Part of evolution's laziness is to leave what can be learned to learning rather than innately precoding it genetically, which would be more costly and less flexible. Only the capacity to learn itself needs to be genetically precoded.

      Temporary "use-it-or-lose-it" categories, available only during a "critical period" in infancy or adolescence, are intermediate cases.

      What needs to be reverse-engineered is the learning capacity itself, including any critical-period capacities. Those would all be part of a species's T3 (or T4).

    2. I find this extremely interesting, that evolution results in less hard-coding, and more flexibility and potential to learn. However, surely evolution would have favored individuals that happened to genetically have a more "innate" ability, or at least a very natural and quick learning ability, rather than being lazy and allowing every individual to have to learn the innate advantage of precise CP. I suppose this is less of a cognitive consideration and more of a biological one...

      Either way, I see that a species's T3 accuracy would be linked to that learning capacity, and thus, CP.

    3. Hey Alex:) To try to answer your question, if humans lived in (and had evolved in) completely consistent environmental conditions, hard-coding would be the way to go. Theoretically, in an environment that never changes there is one most adaptive way to perceive, categorize, and interact with that environment (and whoever has this way of interacting most effectively hard-coded in will do best). But, on both the scale of an individual lifetime and (even more so) an evolutionary timescale, the world is always changing. The more unstable an environment is, the more beneficial it is to have flexibility/potential to learn rather than hard-coding. Hard-coding even becomes a drawback the moment an environment is different enough that the hard-coding is no longer applicable, so it actually makes a lot of sense that what has been evolutionarily selected for is flexibility. Evolutionary hard-coding would have to be incredibly complicated to account for environmental instability, so it is “easier” (i.e., more likely to have occurred) to just “give” humans the ability to adjust their perception/categorization/interactions based on whatever world they find themselves in.

    4. Alex, "Baldwinian Evolution" -- in which what is encoded is not a full behavioral pattern but a genetic disposition to learn the pattern -- is intermediate between genetically coding the pattern and leaving it entirely to general learning capacity. It is a "preparation" or head-start on the learning, and is often associated with a critical period in which it can be readily learned.

    5. I thought considering our evolutionary disposition towards hard coding or flexibility was interesting to think about in terms of how this would shape our Turing machine. Initially, I would have thought that a TM would need to be hard coded, especially if we are only trying to simulate human intelligence. In a simulation, environments and responses would be more or less predetermined because there would be a limited number of scenarios and ways to respond to them since it only attempts to predict the world but cannot account for everything. Thus, with predetermined responses, there would be no room for spontaneity and the robot would have to be hard coded. However, this is the flaw of the simulation; it would not actually be human intelligence. With this in mind, it is clear that we are very flexible and any machine that attempts to pass as a human would also need to be more flexible than hard coded.

  3. It is explained in “To Cognize is to Categorize: Cognition is Categorization” that language, through hearsay, presents an evolutionary advantage by making it possible to skip sensorimotor training when creating categories. I was reminded of this when reading this text and I was wondering if the categorical perception of speech could be one of the reasons why language is advantageous in creating and understanding categories. Would it be wrong to think that the ways in which words can be easily grouped according to within-category compression and between-category separation could explain the efficiency of language when learning and creating categories? I am curious about the role of CP in explaining the evolutionary advantage of language.

    1. Camille, phoneme CP is useful in (1) speech perception and production, but it is also useful in (2) grounding categories perceptually. Then the grounded and named categories are available for (3) language, through verbal definitions (if the organism has propositional capacity too).

  4. I have an additional question in relation to the Borges story. Funes is clearly incapable of categorization, given how every instance and object (at a specific time and space) is its own “thing” according to him. Because of this, the narrator of the short story suspects on the last page that Funes “[…] was not very capable of thought.” How could we link categorization to thought? Is it only via propositions that we can think in a coherent and original manner?

    1. I believe that categorization is related to thought in the sense that when you think or speak about an object, you will automatically imagine/think about the object you are referring to in your head. Whether it is someone else who spoke/read the word or you yourself, there always seems to be some image that pops up (in my brain at least) to give you the image of the category.

      I think when referring to Funes not being capable of thought, the narrator was maybe referring to this ability of creating a general image that will represent a whole category of words. From my understanding, he considers a link between the two since thought is required to think of a specific categorization.

    2. Thank you for your comment Mariana! You bring up some very valid points. I am now thinking that Funes is incapable of thought for two main reasons. For one, every allusion to a thing (such as a written or spoken word) will be impossible for Funes to reflect on and mentally picture. If someone were to tell Funes about a dog, an infinite amount of images might pop up in his head and he would be incapable of pinpointing what “dog” is being referred to and to link it with further thoughts. On the flip side, if Funes were to come across a cool-looking mushroom on the side of the road, he wouldn’t be able to infer its general characteristics and to link it with previous experience (the mushroom being its own independent entity to him). He therefore wouldn’t be able to formulate many thoughts (propositions?) on it, hence why the text states that Funes “[…] was not very capable of thought.” 

    3. Mariana, mental imagery, as we discussed earlier in the course, is real enough, but homuncular, so not explanatory. Unsupervised and supervised learning of sensorimotor categories, abstracting category-distinguishing features, and filtering sensorimotor input through such learned feature-detectors -- all that is getting closer to a mechanism. Such filters could also influence imagery, by also filtering internal analog copies of sensorimotor input.

      Camille, thought is not just verbal, but categorization is needed for nonverbal thinking too – though maybe not for imitation (mirroring).

  5. "The network's "bias" is what filters inputs onto their correct output category. The nets accomplish this by selectively detecting (after much trial and error, guided by error-correcting feedback) the invariant features that are shared by the members of the same category and that reliably distinguish them from members of different categories; the nets learn to ignore all other variation as irrelevant to the categorization. (Page 4)"

    This quote connected some of the main points in both 'Categorical Perception' and 'To Cognize is to Categorize: Cognition is Categorization' in my mind. It connects machine learning to the credit-assignment problem and begins to provide an answer: with enough supervised learning, the algorithms can create bias for certain categories and invariant features which can help resolve underdetermination problems. However, we don't know how our brains are doing this, so focusing on finding an algorithm that can categorize like us may tell us more about our ability to do so (thus the importance of reverse-engineering in this case).

    In the last paragraph Prof. Harnad details neural net simulation results that show that if a category has been grounded with sensorimotor experience, higher level abstractions can be made. This seems in line with what the 'Categorical Perception' was saying, and what we intuitively know about T2 machines, which is that it is hard to imagine gaining the ability to make higher-order abstractions without sensorimotor experiences to ground you in first. Furthermore, it could be taken to mean that Categorical Perception is a combination of first sensorimotor experiences, and then heresy and verbal instruction building off of knowledge derived from the experiences.
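
A minimal sketch of the idea in the quoted passage: a single logistic unit, trained with error-correcting feedback on a one-dimensional continuum, ends up "warping" its internal representation, compressing differences within a category and separating differences that straddle the learned boundary. (This is a toy stand-in for the neural nets the article describes, not their actual architecture; the data, learning rate, and boundary are illustrative assumptions.)

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Stimuli on a 1-D continuum, labeled by a category boundary at 0.5
data = [(i / 100.0, 1 if i >= 50 else 0) for i in range(0, 101, 5)]

# Supervised learning: one logistic unit, plain stochastic gradient descent
w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    for x, y in data:
        err = sigmoid(w * x + b) - y   # error-correcting feedback
        w -= lr * err * x
        b -= lr * err

def percept(x):
    """The unit's internal representation of stimulus x after learning."""
    return sigmoid(w * x + b)

within = abs(percept(0.3) - percept(0.1))    # same category, input distance 0.2
between = abs(percept(0.6) - percept(0.4))   # straddles boundary, distance 0.2
# After training, between >> within: compression and separation (learned CP)
print(f"within={within:.3f} between={between:.3f}")
```

The unit was never told anything about similarity; the warping falls out of learning the category, which is roughly what "CP as a side-effect of category learning" means.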

    1. Good synthesis, Madelaine, but hearsay, not "heresy" (otherwise talking would be even more dangerous than it sometimes already is)!

  6. In discussing inborn (in humans, evolved) categorical perception versus learned CP, this article got me thinking about an added complexity that might need to be included in our T3.

    To mimic the behaviour involving evolved CP in humans, it seems like our T3 might need to have the same “hardwired” CP that is influencing the relevant behaviour. If certain compression/separation biases are inborn in humans (are a feature of our mechanistic structure), would our T3 ever be able to act indistinguishably from us if it weren’t structurally determined to perceive certain things, like colour, the way we do?

    As of yet, though, we have a far from complete understanding of what types of perception are inborn in humans. It is likely a lot more than just colour, but we need a much more complex understanding of human sensorimotor processing to come up with a complete list (could we ever even have a complete list? I’m not sure how one would absolutely determine whether something were an inborn category outside of situations where there is a mechanistic constraint, as with cone cells and perceiving colour). So it would be very difficult, maybe even impossible, to know what types of CP would need to be hardwired into our T3.

    Building a T3 that involves only learned CP seems more possible, and I wonder if the need for inborn CP could be avoided if the T3 prioritized feedback coming from humans? Maybe a T3 doesn’t have to be built to perceive just like we do, just able to use human perception as the *standard* for its learned CP.

    1. Eric is indistinguishable from any of us in what it can do in the world, bodily or verbally. To model that, we have to build in whatever makes a T3 capable of that, including what's inborn and what it's capable of learning. (Learning capacity itself is inborn, but so are feature-detectors for color.)

    2. Hey Caroline, I agree with you that T3 might need the same “hardwired” CP to be indistinguishable from humans. Your point about what types of CP to hardwire into T3 and whether we could even determine ALL the inborn categories is interesting but let’s say that we DO have a list of what hardwiring needs to be done, can we build a robot that passes T3 without the neural aspects/connections that allows the hardwiring?

      I think at this point, I’m getting the sense that maybe we would need a T4 to even pass a T3 (just like how we need a T3 to even pass T2). To be indistinguishable from a human [bodily and verbally] as Prof Harnad states, we must build a T3 that includes what is inborn and is capable of learning.

      The “capable of learning” aspect already seems hard enough (in my ignorant opinion) because although we saw [how we learn through supervised learning, unsupervised learning, and verbal hearsay] in article 6A, reverse-engineering that ability to a robot is something I believe would be extremely difficult.

      As for the “includes what is inborn” aspect, not only do we have the problem of that list you mentioned, but I believe we would also require neural networks to achieve that. Feature-detectors for color are hardwired into our cone cells that discriminate the basic colors (blue, green, red). Cone cells are part of our nervous system. So does that mean that we would need some aspects of T4 (neural) to have a robot that is indistinguishable bodily and verbally (T3)?

      This is where I’m starting to see the strong aspect of the strong Church/Turing thesis. Just like how Prof Harnad simulated GENERATIONS of mushroom foragers (shown in article 6A), we need the simulation aspect prior to building T3 or T4. This saves time and resources because we can trial-and-error within the simulation before moving on to the mechanical version.

      I have no idea how we would [trial-and-error] computations that produce the learning capacity and critical-period capacities even in a simulated version of the Turing robot, but thanks to the strong Church/Turing thesis at least we CAN through simulation.

    3. Professor Harnad,

      I remember you mentioning during the Lecture 2 (The Turing Test lecture) that "Stevan says" you can NOT simulate a successful T3 passing robot in the virtual world.

      Could you elaborate on why you think so?

    4. Hi Iris! Although I’m not Prof Harnad, I would think that you cannot simulate a successful T3-passing robot in the virtual world precisely because it would be a simulation. In a simulation, all the features that make a T3 robot T3 (the sensorimotor functions specifically) would not actually function. To my understanding, the point of a T3 robot is that it can interact with the physical world and use this interaction for symbol grounding. It has the experience of interaction which informs its vocabulary rather than relying on the periscope of definitions. Moving the T3 into a simulated digital environment takes away the robot’s ability to directly interact with the world and reinstates the issue of the periscope. For this reason, the simulated T3 robot would not actually be able to pass a T3 test.

  7. I found the Whorf Hypothesis to be quite compelling for the arbitrary categorization of colours. The terms ‘scarlet’ and ‘crimson’ were used as examples of categories that are likely not innate. This makes sense to me, and when reading those words, it is quite difficult for me to imagine what those colours actually look like. I also believe that if I was given a sample of 10 different shades of red and asked to pick out which was scarlet and which was crimson, I would not be able to do this. To me, this makes sense that some categories such as colours would be influenced by things such as culture and language and also possibly by learning. For instance, I am now wondering if the reason I would not really be able to differentiate these colours is something unique to me (something I never learned), or if this is something more related to the culture I grew up with, or something else?

    1. I had also found this idea of the “Whorfian power of naming and categorization” and its influence on our perceptions of the world to be very fascinating. It reminded me of an article I’d read on the absence of the word “blue” in Ancient Greek texts. In Homer’s The Odyssey, he refers to the sea as “wine-dark”, which has a poetic purpose but also makes historians question what exactly people were seeing at the time it was written. The word blue is not mentioned once in the book (which takes place quite a bit at sea), leading historians to a number of hypotheses about how color categorization developed, with some even postulating that the Ancient Greeks may have been colorblind to the color blue. I think this illustrates the power of language and categorization, and how the presence or absence of categorical words can lead to more questions than answers.

    2. Leah, I found your comment interesting so I went and googled about Ancient Greeks not being able to see blue. I came across this post on quora that has some compelling answers. https://www.quora.com/Is-it-true-that-the-ancient-Greeks-could-not-see-blue

      While quora isn't exactly a scholarly primary source, they bring up some solid points. What do you think?

    3. Evelyn, learning to identify scarlet and crimson probably takes some supervised sensorimotor learning, and that’s probably what underlies cultural effects and language differences, but the boundaries around these colors are probably even fuzzier than the ones around the “basic” colors.

      Leah, the ancient Greeks may not have had a name for “blue,” but Berlin & Kay’s cross-cultural studies suggest that they would have seen the qualitative difference between green and blue just as we do. The effect is innate, not linguistic.

      Laurier, Google Scholar is a better source than Quora. Google (berlin kay color) in GS…

    4. This discussion reminded me of a study from a previous class (Davidoff et al, 1999) about the Berinmo tribe in Papua New Guinea which supports the Whorfian hypothesis. In their language, there are 5 colour terms, a few of which represent colours slightly different from the primary ones labeled in English. They found that Berinmo participants were better at discriminating colour boundaries from colours in their own language, and vice versa for English participants.

    5. Lucie, yes, these are Weak W-S effects: sharpening of some basic color boundaries and some secondary-color CP, both the result of learning and frequency of exposure.

  8. In saying "most cultures and languages subdivide and name the color spectrum the same way, but even for those who don't, the regions of compression and separation are the same", the article is effectively taking an opposing standpoint to the Whorf-Sapir hypothesis, as the languages seem to be representative of the outside reality.

    This made me question the WS hypothesis and what its status is today: is it considered a joke, or does it have merit given our current understanding?

    1. Laurier, I don't think that this view is opposed to the Whorf-Sapir hypothesis. As discussed in class today, colours are one of a few categorizations that have an innate aspect (the others including phonemes and faces). Most categories are not innate but learned, as evidenced by the example of opening up a dictionary, where the vast majority of words would be learned categories. However, from what I've understood, many categories are AIDED by innate feature detection mechanisms built into our nervous systems. This is a weak version of the Whorf-Sapir hypothesis, where the categories are not innate, but they are influenced by innate biological mechanisms.

    2. Laurier, as you point out, color CP effects are not the best to exemplify the Whorf-Sapir hypothesis, mostly because we have inborn detectors for colors in our retinas and brains, and therefore most of us (not the colorblind, of course) perceive colors in the same way regardless of linguistic differences. Even those cultures that have the same word for green and blue can discriminate between these two because they have inborn neural mechanisms to do that.

      In any case, as Milo mentions, the Whorfian effects that have been described experimentally correspond to the names of LEARNED categories, for which we did not have innate detectors, therefore we had to learn the features that are diagnostic or relevant to categorize. See: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226000 and https://link.springer.com/article/10.3758/s13415-018-00679-8

      However, Milo, please note that Whorf and Sapir did not imply that categories needed to be innate. On the contrary, the idea was that the language we LEARNED shaped our perception of the world. The weak version of the hypothesis (or linguistic relativity) corresponds to the type of effects we discussed in class (learned CP effects): learning new semantic categories modifies how similar or different we perceive stimuli that do or do not belong to them. The strong version of the hypothesis would be linguistic DETERMINISM: language constrains and determines thought and perception (meaning that, for example, you would be unable to see something you don't have a word for).

      While the strong version of the hypothesis has been mostly discarded, there is behavioural and neural evidence to support the weak version of the hypothesis (words can somehow shape the way we perceive and understand the world) that keeps emerging in many domains, see: https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(20)30213-8

      Delete
  9. “Can categories, and their accompanying CP, be acquired through language alone?”

    After reading both this paper and the one from 6a, it feels like the “easy” answer is a simple “no.” If we follow the framework of 6a, we must accept the premise that categorization is a sensorimotor skill with a focus on the sensory facet of it. Therefore, just language alone (propositions, with subjects and predicates) could not categorize. Categories, and consequently their accompanying CP, would not be able to arise from just language, since they must be grounded in some kind of sensorimotor experience or interaction.

    This paper brings up how neural net simulations are able to create categories through Boolean combinations of “features” that have been grounded in sensorimotor experience, and these Boolean categories end up inheriting their own CP effects. This makes sense to me, but I’m struggling to see how neural net simulations would be able to understand what these category names represent in the first place without any kind of sensory input (maybe a secondary question I’m asking is what T-level a neural net would qualify as?).

    I’m wondering if I’m interpreting everything in the original question correctly. Is “language” just a set of propositions that any T2 robot would be able to access? In that case, I believe that categories would not be able to be acquired, since they would have nothing sensorimotor to ground the categories in. However, if “language” is more than just the set of all possible propositions and encompasses more, such as body language and the act of speech, T2 would not be sufficient. Features would NEED some kind of sensorimotor (I feel like I’ve used this word way too much in this skywriting!) experience to be able to be used in categorization. The reading in 6a says that “the capacity to categorize comes somehow prestructured in our brain” — trying to tie this into 6b, my answer to the original question remains “no,” since the brain is so much more than just language.
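The “Boolean combinations of grounded features” idea mentioned above can be sketched in a few lines. This is a minimal toy sketch, not any actual model from the paper: the feature names are hypothetical, and the two detectors are stand-ins for categories that would have to be grounded in sensorimotor learning first.

```python
# Hypothetical stand-ins for directly grounded feature detectors.
# In a real model these would be learned from sensorimotor input,
# not looked up in a dictionary describing the stimulus.
def is_horse(thing): return thing.get("horse", False)
def is_striped(thing): return thing.get("striped", False)

# A new category grounded only indirectly, by Boolean combination
# of already-grounded category names ("hearsay" on top of grounding):
def is_zebra(thing): return is_horse(thing) and is_striped(thing)

print(is_zebra({"horse": True, "striped": True}))   # True
print(is_zebra({"horse": True, "striped": False}))  # False
```

The point of the sketch is only that the new category name adds no new grounding of its own: everything bottoms out in the two grounded detectors, which is why it "can't be hearsay all the way down."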

    ReplyDelete
    Replies
    1. 1. All computational and robotic models so far are just toys today. Nowhere near Turing scale.

      2. It has been shown that toy models can learn through supervised sensorimotor category learning, and that this produces a CP-like effect in their feature-filtered internal "similarity" space.

      3. It has also been shown that toy models can learn new categories through a very rudimentary (toy) "verbal" associative learning (recombining already learned category names).

      4. But it has not been tested whether this very rudimentary (toy) "verbal" associative learning would generate CP.

      5. Language is definitely not just verbal associations: subject/predicate propositions with truth-values (T/F) are essential to it too. And that's what generates language's infinite productive capacity (effability, Strong CT Thesis).

      Delete
  10. Categorical perception (CP) is when within-category perceived differences are compressed and between-category perceived differences are expanded. When we link this concept with neural networks, we get that CP seems to be a means to an end, which compresses inputs that have similar outputs and differentiates inputs with different outputs. CP effects can even be induced by language. Indeed, previous grounding through sensorimotor trial-and-error makes us able to combine category names and reach abstract concepts. This transition from sensorimotor experience to combinations seems to serve the same purpose as chunking in numbers to reach complexity. Are categories evolutionarily adaptive?
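The compression/separation effect described above can be illustrated numerically. This is a minimal sketch, not a model of any experiment: the boundary location and gain are made-up parameters, and "perceived distance" is simply assumed to be measured after passing stimuli through a learned sigmoid category boundary.

```python
import math

def sigmoid(x, boundary=0.5, gain=12.0):
    """Map a point on a 1-D stimulus continuum through a learned
    category boundary (boundary and gain are hypothetical)."""
    return 1.0 / (1.0 + math.exp(-gain * (x - boundary)))

def perceived_distance(x1, x2):
    """Distance after categorical 'warping': pairs straddling the
    boundary are separated; within-category pairs are compressed."""
    return abs(sigmoid(x1) - sigmoid(x2))

# Two physically equal steps (0.2 apart on the continuum):
within = perceived_distance(0.1, 0.3)   # both on the same side
between = perceived_distance(0.4, 0.6)  # straddles the boundary
print(between > within)  # True: separation exceeds compression
```

Equal physical differences thus come out perceptually unequal, which is the CP "warping" of the similarity space in miniature.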

    ReplyDelete
    Replies
    1. Category learning and categorization are definitely adaptive. But read above reply on the limitations of today's toy models.

      Delete
  11. Sensorimotor inputs are an important part of learning and categorizing. For example, babies touch everything to get sensory feedback from it; they put everything in their mouths as a form of learning what is edible and what isn’t. Taste is like any other sense and can help us categorize, giving further features to certain objects. For instance, while as adults we know dirt is not edible, we still have knowledge of what it hypothetically tastes like, and how it’s different from other non-edible things. Does this form of sensory learning and categorizing differ from learning to categorize more abstract things like “truth” and “curiosity”?
    According to the Sapir-Whorf hypothesis, language influences the way we categorize and perceive categories. There are infinite ways to categorize, which can be context dependent; there are also varying orders of categories, such as higher-order (fruits, vegetables) and lower-order (banana, orange). Therefore the boundaries of a category, and the belonging of a specific ‘item’, are malleable. Does this malleability of category (for example, an “orange” belonging to the category of a colour, a fruit, something edible, something round, something you can throw) influence the way it is perceived?

    Categorization involves finding the invariant aspects and ignoring the variation among members of the same category. This often occurs through structured learning with feedback from positive examples (members) and negative examples (non-members). In Fernanda’s study, did the fact that the visual textures had names, ‘Kalamites’ and ‘Lakamites’, have an influence on the categorization and categorical perception, compared to if they had just been referred to as ‘A’ and ‘B’?

    ReplyDelete
    Replies
    1. More abstract ("higher-order") categories are more likely to be grounded indirectly, through verbal definitions and descriptions rather than direct sensorimotor learning. But the words in the definitions have to be already grounded, either directly or indirectly: they name the features that distinguish the members from the non-members of the category.

      The same inputs can be categorized in many different ways, based on different features and consequences, depending on context. Features reduce uncertainty about what is the right thing to do with the right kind of thing in a given context (e.g., eat it, harvest it, call it “edible,” or call it “fruit,” or “solid matter”). Most of the categories of which the input is a member will not be lexicalized as a word in the dictionary; but they can be the referents of a verbal description.

      In Fernanda's experiment the categories had "names," but the different keystroke response was all that was needed for the learning and the CP. ("Doing the right thing with the right kind of thing.")

      Delete
  12. "To show that it is a full-blown language effect, and not merely a vocabulary effect, it will have to be shown that our perception of the world can also be warped, not just by how things are named but by what we are told about them."
    This quote made me think of Gestalt illusions demonstrating how our perceptions can be subject to manipulation by our expectations about the world. One example of an illusion is an image that jumps between resembling a duck or a rabbit to the viewer. If before seeing the image, someone was told they were going to be shown a picture of a duck, I imagine they would likely see the duck over the rabbit. In such a case, this person's perception of the image would be determined by what they were told about it. Could phenomena like this be used to support the idea that the Whorf hypothesis is a "full-blown language effect"?

    ReplyDelete
    Replies
    1. There is a good analogy there, and no doubt some of the effects of what we are told are based on expectations. Perhaps the most dramatic (though still controversial) effect of language is hypnosis.

      (This may be because an important feature in the evolution and adaptive value of language was the default assumption that what we are told is true. This is especially salient in our era of mass disinformation. In the village a liar is quickly discovered and discredited -- but not on the anarchic Web.)

      Tests of CP effects would be interesting to do with false feedback. Would they change for similarity judgments but not for ABX discriminability? If there were an effect, would it be perceptual or just a response bias?

      Delete
    2. Lucie, interesting analogy! In the example you name, our expectations and what we are told before seeing the image definitely influence what we then first perceive — this relates to the fact that we are expected to notice some discrete elements. We can extend this and say that generally, illusions guarantee that there is something we will see (whether it’s the duck or the rabbit), given that the image was created with that purpose.
      This made me think of the Rorschach test in particular, in which, unlike illusions, the image doesn’t necessarily have predetermined components or meanings that we are meant to notice — our emotions and thoughts (and possibly languages and cultures?) are what cause us to see the inkblot as having discrete components. I may interpret the ink as depicting, say, a tree, and someone else might see something completely different — I would have to explicitly tell this person to notice the tree in order for them to notice it (if they can even notice it at all).
      I’m not sure whether the Rorschach test has any value in terms of CP, or if there’s anything concrete we can learn from it, but I wonder how much our perception of the inkblot is determined by the languages we are socialized with. If they do influence how we perceive the inkblot (and if I’ve understood correctly), then this example demonstrates the weak W-S hypothesis to some extent!

      Delete
  13. Regarding the compression and separation of stimuli, particularly of birds, I find that perhaps even what we may claim are purely categorical perceptions lie somewhat on a continuum of membership. While “birdness” does not lie on a continuous spectrum as color wavelengths do, we are nevertheless able to compress and separate birds among each other. I like that the penguin was used as the example bird in this text, as it is not the most prototypical bird (finch, dove, or pigeon may be a more popular choice). Penguins lack a central feature of typical bird classification, flight, but we know they are genetically related to other birds despite this missing trait. I am curious as to how we weight features in a category such as this. One can compress the emperor penguin and rockhopper penguin into the category of penguin, just as one can compress robin and house finch into a prototypical bird category. There are separation effects as well, as one might claim that penguins are more distinct from the common loon than a robin is from a chicken, even though this is not the case. Evidence such as this makes me think that CP can work holistically, beyond one sensory modality at a time, but the composition of features that categorize such objects may be too complex to study all at once.

    ReplyDelete
    Replies
    1. Taxonomically, the category bird is categorical (i.e., all-or-none) based on its biological features. How it is perceived is another matter. I doubt that a verbal explanation could over-ride the perceptual similarity between whales and fish, but it could still alter our categorization (i.e., what we call them) and our understanding (i.e., what else we know and can say about them).

      Delete
  14. In this paper, “Categorical Perception”, Prof. Harnad discussed different hypotheses/debates behind CP, including innate vs. learned CP ability, and the interactive relationship between language and sensorimotor experience (whether language influences our perception, as in color categorization, or vice versa, e.g., how we can invent words without sensorimotor experience). I’m very curious about how exactly physiology supports continuous perception (“anatomy does not allow any intermediate”, page 2). I suppose candidates might include lateral inhibition in receptive fields and the layered structure of brain networks? If today’s technological devices/surgery further assist our perception (the most common example might be eyesight-correction surgery, and of course beyond), is it possible that this would shift the discussion?

    I’m very convinced by the Whorf Hypothesis, as Lera Boroditsky discussed in her talk. Language itself is often closely linked to the culture it belongs to and will certainly influence how its speakers behave. In some languages (like Chinese), each character consists of smaller but meaningful parts that indicate which category the character belongs to. Sometimes, even though a character is no longer explicitly linked to that category today, people still make unconscious inferences to it, resulting in stereotypical judgments and behavior that we have to be aware of and careful about. (Ref. https://languagelog.ldc.upenn.edu/nll/?p=23043) This also somehow disagrees with the argument from the lecture that "it is not 'language itself shaping cognition' but rather 'the neural mechanism that supports language shaping cognition'"; I would be grateful if someone could point out whether this makes sense or whether it is a completely different scenario than color categorization.

    ReplyDelete
    Replies
    1. Neural function includes both categorical activity (action potentials) and continuous activity (graded postsynaptic potentials). And even firing frequency can be a graded signal (e.g., correlated with the intensity of a stimulus). So there’s plenty of capacity for both categorical and continuous function in the brain.

      Lateral inhibition is an unsupervised learning effect, enhancing boundaries.

      I’m not sure if the “layers” you are talking about are those of deep-learning computational models or the hierarchical layering in the functional anatomy of, say, neuronal visual receptive fields.

      Chinese is an interesting case, not just because of possible subtle effects of the history of words that might be preserved in the ideographic symbols of the written language but because the written version gives more clues to the meaning of a new word than an alphabetic language – especially in Chinese, which is not just ideographic but its words are mostly compound ideograms. A compound ideogram is a bit more like a description than just a word. I don’t know how true this is for the spoken version of Chinese words. Do you? (Remember that spoken language came before literacy in the evolution and history of language, and also in the language learning of the child.)

      Delete
    2. I think the sounds of many Chinese characters are the same as their 'stems', but pure pronunciation usually can't indicate the meaning of a single Chinese character, because there are so many characters with the same pronunciation. However, in spoken Chinese, it is the combination of characters that forms words, in effect the combination of the pronunciations of words, that indicates their meanings. And in this respect, I believe spoken Chinese is the same as spoken English, and the W-S hypothesis will still apply.

      Delete
    3. I agree with Zilong that spoken Chinese is similar to spoken English, and different from written Chinese with its compound ideograms. I agree it is more like a short description, and my example here was trying to show the power of implicit categorization, which is unconscious, happens really fast, and is not easily ‘unlearned’ through education. I’m wondering: if the ability to categorize is this important to cognition, can this be an example of evolution? That is, is such a mindset resistant to change/unlearning because it has evolutionary benefit?

      (By layered brain networks, I'm referring to the biological functional structure of the brain, as mentioned, like the hierarchical organization in visual perception, rather than computational modelling. Sorry for the confusion!)

      Delete
  15. The idea that language shapes the world as we perceive it is very interesting to me. Or, perhaps more specifically, the idea that the world is formed simultaneously with the forming of language around our experience of ‘whatever is out there’ is interesting. That there is compression, or that certain perceptual details must be left out in order for Categorical Perception to occur, is also an interesting idea. In order to distinguish yellow 1 from yellow 2, we must ignore their shared quality (that they are both yellow) and focus on their non-shared qualities (differences in their relative warmths, their tone, notes of other colours, etc.). But I am not sure about which direction this goes in. The last paragraph of this text seems to indicate (as the Whorf hypothesis does) that it is language that would have to form the world as we perceive it, rather than that the two originate at the same time. Do we not speak things into existence in viewing them as distinct from one another? As we are constantly inventing new ‘higher-order’ combinations, such as, for example, the peekaboo unicorn, are we not inventing our own ability to discern that peekaboo unicorn, if we were ever to come upon one (I know, impossible!!)? Discerning, then, would be a creative task, and this would complicate the idea that the world could be warped “not just by how things are named but by what we are told about them”. Did the warping not already occur when the thing was named/brought into existence?

    ReplyDelete
  16. In the final paragraphs, the paper asks the question “Can categories, and their accompanying CP, be acquired through language alone?”.
    In response, I believe that it is not possible for categories and their categorical perceptions to be acquired through language alone. As we have seen in previous class material, language first acquires meaning through sensorimotor interactions with the environment (i.e., direct grounding). It would not be possible for language to have any inherent meaning without the direct grounding of at least some words. Furthermore, without the interaction of our sensory faculties we would not be able to learn new words, combine words, or understand compound words. Additionally, the affordances of a category allow us to know ‘what to do with what object’, and categorical perception is a ‘side effect’ of learning categories. The use of language to describe categories, or hearsay, allows us to learn a category without having to go through trial-and-error sensorimotor experimentation. However, this is only possible if some words are already directly grounded so that the description makes sense — and thus, category learning “can’t be hearsay all the way down”.

    In regards to CP, it appears that categorical perception only arises out of certain instances of categories. As discussed with the colours vs. vowels distinction in the reading, could it be more likely that we have better categorical perception for the things that we are more intimately able to connect with as humans? Our CP for speech sounds is more distinct because we are able to perceive them and produce them, whereas for colour CP we are only able to perceive colours.

    ReplyDelete
    Replies
    1. Interesting question AD. First of all, I think CP happens with things that are quite similar. In class we spoke of two very different things (I can't remember exactly what they were, but take zebras and apples?). There is no CP because they are obviously different to start with.
      You suggest that speech may be more "intimate" to humans than colour perception. I take it that you mean better CP = more fine-tuned discrimination between narrower ranges on the spectrum. Since color CP is more innate, I suppose it could just feel like it's 'there', and humans with typical visual perception do not need to engage with it more. On the other hand, color experts have a myriad of categories for colors beyond ROYGBIV. One could examine the existence of CP in learned colors and how great the compression and separation are.

      Delete
  17. The Whorf Hypothesis is that language and culture can change the way things look to us. In other words, it changes the way we perceive the world. Therefore, if language changes the way we see things, then how do we know we are categorizing properly? If I categorize (try to do the right thing with the right kind of thing) based on feeling, rather than objective features, then it would be a subjective category. This can be seen whenever you do a sensorimotor categorization, because it is based on feelings. Indeed, it feels like something to see, touch, or taste something. However, I think it is important to remember that introspection does not explain how we do things. In fact, we're still waiting for cognitive science (T3 or T4) to reverse-engineer how the brain manages to do that. On the flip side, supervised (or reinforcement) learning is simply trial-and-error learning based on direct experience, guided by feedback from performing the correct or incorrect behavior. The input does not have to come from humans (for example, mushroom picking), but from categories based on social consensus as to what is what and what will be labelled what, in which case the feedback is social. However, if we acquire categories through verbal teaching (definition, description) rather than actual experience, we will likely learn the characteristics from other individuals. In brief, I think that whether the information comes from an objective or subjective source, it's always about the features and the abstraction when one needs to categorize.

    ReplyDelete
  18. In the section “resolving the ‘blooming, buzzing confusion’”, we learn about how the reason we are able to experience an orderly world of discrete objects, as opposed to a confusing continuum of different parts of objects, is that we’re born with innate category-detectors that prepare these categories for us in advance. This makes me look back at the T3 vs. T2 debate and the general consensus that the T3 level is sufficient to replicate human cognition; but if T3 robots lack this innate category-detecting ability that humans have, wouldn’t this have an impact on their capacity to categorize the way humans do, and wouldn’t this eventually produce notable differences between the robot and human beings? Reading about the motor theory of speech perception only supported this idea for me when it talks about how our ability to perceive differences in speech sounds is by virtue of the fact that we can produce them and actually feel the difference.

    ReplyDelete
    Replies
    1. Hi Adebimpe! I think the idea that the T3 level is sufficient to replicate human cognition is more based on the concept that a T2 robot is insufficient to pass T2 because symbol-grounding requires sensorimotor experiences. The point you bring up about a T3 robot lacking an innate human ability to categorize seems to me to more speak to the fact that we still don’t have the ability to reverse-engineer cognition (which we definitely don’t! I don’t even think we have anything resembling a T2 robot at this point).

      Delete
  19. I find it fascinating that humans (I can’t speak for other species) are endowed with just the right level of categorical perception ability (at least, it seems pretty convenient, I don’t know how it would feel for it to be any better). CP seems to be just the right balance between flexibility and rigidity. We can categorize ‘something’ as being a ‘thing’, even though we might encounter countless variations of that thing. But at the same time, we can tell the difference between all these variations, and can separate these variations into yet other categories. It seems like evolution, by being lazy, allowed for this ability to learn and the freedom to do what we want with what we learn. Unlike Funes the Memorious, we have the ability to abstract over certain features, while retaining the ‘essence’ of something.

    Before reading this paper, it didn’t occur to me that CP could extend to phonemes (which was admittedly quite stupid of me), but it does make sense because we don’t only categorize visual objects out there in the world, but also sounds and smells and tastes. So, I guess a T3 robot would need to have this categorization capacity to be able to pass the Turing Test, by virtue of its sensorimotor interactions with the world.

    ReplyDelete
  20. “Boundaries can be modified or lost as a result of learning and secondary boundaries can be generated.” I find this point super interesting and applicable. For example, think about the difference between artists and non-artists. If a non-artist looks at a landscape, they take in all “features” of the landscape to form the general image. However, if an artist were to look at a landscape, they would take note of the shadows cast on the landscape by different elements and the lines of intersection between different items (here by items I mean tree, rock, mountain, etc.). When comparing the two, though, does this mean that CP can be additive as well? The non-artist still sees the same things and can designate “this is a tree”. However, the artist can say “this is a tree” and “this section is a shadow of the branch on the trunk of the tree that separates it from this area highlighted by the sun shining through the branches”. The formation of “weaker boundaries based on learning alone” is also slightly confusing to me, because the non-artist could be asked, “is there a shadow on the trunk of the tree?” and they would say yes and be able to point at it. In this case, is this considered innate? Hypothetically, then, would a learned category be one that is invisible to the non-artist until the artist teaches them the features to look for, after which it is suddenly prominent to them every time they look at the landscape?

    ReplyDelete
    Replies
    1. Hi Katherine, it's a really interesting point!
      Regarding this example, I think that both artists and non-artists perceive the same thing and form similar cognitive images of the landscape. The difference would be that artists make additional interpretations on those details that are useful/necessary for artistic works, such as the shadows, the light, etc. These interpretations are not possible, or at least not obvious for non-artists to make. As you've said, non-artists are also aware of the shadow, but their awareness of shadow is more in a general-life view rather than an artistic view. So I would say that above the basis of the general image that both non-artists and artists have, artists would form additional artistic interpretations ("artistically encoding features") that non-artists are not able to do until they have been taught to do so. This can be seen as an analogy of weak boundary based on learning alone, however, I believe the original colours and phonemes examples would be more clear for demonstration.

      Delete
    2. Hi Katherine, I like your comments on artists' vs. non-artists’ categorization differences! I think this has to do with expertise vs. non-expertise in general, as brain plasticity is shaped by experience. One of the questions I’m always interested in is how, and to what extent, artists see the world differently from, let’s say, scientists (because, as you can imagine, scientists may also bring their own lens to viewing artworks; for example, if a work somehow contradicts the laws of physics, they could give specific scientific comments on that). However, I’m also very fascinated by the fact that visual artists who are not trained in visual perception or visual neuroscience can still accurately present distance cues and 3D constructions on a 2D surface in their work. I think it is the power of human cognition to do so: we share a basic understanding of the world, and the specialized lenses are only add-ons that make the world more diverse and colorful.

      Delete
  21. When I was reading this article, the part about color categories confused me a lot. From my understanding, color perception or color vision is an innate ability for humans, and more precisely, any living organism with more than two kinds of cones can categorize color. The ability to perceive color, from my understanding, is purely due to the physiological structure of the retina: different types of cones tuned to detect visible light, each with a specific wavelength range; it is the stimulus's intensity and wavelength that determine the amount of light absorbed by the photoreceptors and therefore create the exact match to any color. However, as pointed out in this article, 'the categorization indeed follows the perceived similarities gradient rather than causes it.' I wonder why the perceived similarities and differences are not in the color input, and how Jacobs concluded that there are other specialized neural feature-detectors for perceiving color categories.

    ReplyDelete
  22. I thought the connection with Lera Boroditsky's research was really interesting. Her work is what got me interested in cognitive science a few years ago, but seeing it again through the lens of what we learned in this class allowed me to understand the implications of her work a lot better. I think the most important take-home from this section is that words can (and should) be thought of as categories, through which we can organize the world (in the broad sense of the term, which includes physical entities but also more abstract concepts such as time and truth). When we think of language in this manner, it makes the Whorf-Sapir hypothesis a lot easier to comprehend, because it illustrates how something as intangible as language can affect behaviour.

    ReplyDelete
  23. Professor Harnad defines categorization in palatable terms, saying that it’s “doing the right thing with the right kind of thing,” and also mentions that "Machine learning algorithms from artificial intelligence research, genetic algorithms from artificial life research and connectionist algorithms from neural network research have all been providing candidate mechanisms for performing the “how” of categorization." I think it’s interesting that while machine learning algorithms can help us understand categorization from the “how” standpoint, we can also see ways in which we have categorized poorly and disruptively. As made more obvious by AI, humans are predisposed to biases, heuristics, whatever you want to call them. When AI clearly reveals blind spots, and more importantly apparent moral failings such as racism in our algorithms, it’s worth remembering that this intelligence we’ve engineered reflects what we use to build it: the data and patterns we feed it are summed up and spat back at us. It shows a “how” of categorization, but it can also help us think about the “why.”

    ReplyDelete
