I can't actually figure out what you're reacting to - perhaps you could elaborate?
reply
Try pasting it into an LLM and asking it to explain it to you.

If an LLM can explain what you can't understand, after you literally repeat unoriginal drive-by slogans you heard other people say like "stochastic parrot", without understanding how banal and reductionist and inaccurate that is, or addressing how LLMs actually work and what they can do, then what does that say about you as a human?

By your own reductionist argument, it says you don't actually understand anything, it's just neurons firing.

Are you claiming that parrots can write and debug code, explain mathematics, translate languages, summarize research papers, and engage in extended technical discussions?

Stanislaw Lem's Solaris, His Master's Voice, Golem XIV, The Cyberiad, and his other works describe how humans have a pathological tendency to mistake their own conceptual categories for universal truths. You have identified a mechanism and mistaken it for an explanation.

The term "stochastic parrot" is a slogan masquerading as an explanation: a shallow surface description of the mechanism that fails to explain the phenomenon or to account for all that LLMs, and language itself, can do.

reply
I think (rather ironically) you're reacting to the version of my comment you have in your mind rather than what I actually wrote. My point was that "stochastic parrot" is reductionist and irrelevant as most people would agree that a real life parrot has some form of inner life, even if we don't really know what form it takes. For all we know an LLM has to build a complex world model in order to predict the next token.

Incidentally, when I pasted our exchange into Claude it managed to comprehend the nature of my argument. Perhaps its attention mechanism is more finely tuned.

reply
I was reacting to you saying "LLMs are stochastic parrots", which gives parrots a lot more credit for being able to write and debug code than they deserve, regardless of how rich their inner lives may be.

It sounds like you're saying there's no more to LLMs than what a parrot randomly does (which is to repeat things it has already heard, not synthesize new things): no emergent behavior, no compressed, generalizable, transferable knowledge or problem-solving ability, and no ability to write and debug code or to iteratively diagnose and solve problems.

>My point was that 'stochastic parrot' is reductionist and irrelevant as most people would agree that a real life parrot has some form of inner life.

That's not how your claim "LLMs are stochastic parrots" literally reads -- use a /s sarcasm tag if that was what you meant. But that is exactly the point I was trying to make: claiming that LLMs are stochastic parrots is reductionist, thought-stopping, and explains nothing.

The question is not whether parrots have an inner life (though I'm sure they do), but about whether calling LLMs stochastic parrots is reductionist.

What do you mean by saying "LLMs are stochastic parrots"? Does an LLM behave the same way a parrot behaves? Are LLMs only "parroting"? Do they use the same mechanisms as parrots? Are they limited to only what a parrot can do? Can LLMs do more than parrots?

If you taught a parrot to squawk:

  #include <stdio.h>

  int main(void) {
    printf("Hello, world!\n");  /* print the greeting followed by a newline */
    return 0;
  }
Does that parrot understand that it is executable code, and can it simulate running it, and tell you what the output will be, or diagnose and fix bugs in the code you taught it? Can a parrot write that code (and much more sophisticated enormous bodies of code) from scratch if you tell it what you want it to do?

Here is the original 2021 Stochastic Parrot paper, which was actually not a claim that LLMs are literally parrots, nor primarily an argument about consciousness. It was a critique of the risks of increasingly large language models: bias inherited from training data, environmental and financial costs, concentration of power, lack of transparency, and the tendency of people to anthropomorphize language models and attribute understanding where there may be none. The phrase "stochastic parrot" was introduced as a cautionary metaphor about statistical language generation, but it later escaped into pop culture and became a drive-by anti-LLM slogan often used as a substitute for analyzing what LLMs can actually do.

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell (under the pseudonym "Shmargaret Shmitchell"): "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?"

https://s10251.pcdn.co/pdf/2021-bender-parrots.pdf

>Abstract: The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.

Ironically, even critics of the Stochastic Parrot paper argued that it risked replacing scientific analysis with rhetoric. Michael Lissack's response accused it of being an advocacy piece that focused on harms while ignoring benefits, assumptions, and trade-offs. The debate over "stochastic parrots" started almost immediately after the term was coined in the 2021 paper.

The Slodderwetenschap (Sloppy Science) of Stochastic Parrots -- A Plea for Science to NOT take the Route Advocated by Gebru and Bender

https://arxiv.org/abs/2101.10098

>This article is a position paper written in reaction to the now-infamous paper titled "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Timnit Gebru, Emily Bender, and others who were, as of the date of this writing, still unnamed. I find the ethics of the Parrot Paper lacking, and in that lack, I worry about the direction in which computer science, machine learning, and artificial intelligence are heading. At best, I would describe the argumentation and evidentiary practices embodied in the Parrot Paper as Slodderwetenschap (Dutch for Sloppy Science) -- a word which the academic world last widely used in conjunction with the Diederik Stapel affair in psychology [2]. What is missing in the Parrot Paper are three critical elements: 1) acknowledgment that it is a position paper/advocacy piece rather than research, 2) explicit articulation of the critical presuppositions, and 3) explicit consideration of cost/benefit trade-offs rather than a mere recitation of potential "harms" as if benefits did not matter. To leave out these three elements is not good practice for either science or research.

reply
My point was that the "stochastic parrot" label can be both true and irrelevant. LLMs are predicting the next token based on their training data, so at that level "stochastic parrot" is accurate. But it tells us nothing about the complexity of the system that is responsible for making the prediction. One might argue humans have evolved consciousness in order to allow world modelling that enables them to make better predictions.
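
To make the "true at that level" part concrete, here is a minimal sketch of stochastic next-token prediction using a toy bigram model in Python. This is purely an illustration and not how any real LLM is implemented: the function names (train_bigram, sample_next, generate) and the tiny corpus are made up for the example, and the "model" is just a table of word-follower counts.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def sample_next(counts, prev, rng):
    """Sample the next word in proportion to how often it followed `prev`."""
    followers = counts.get(prev)
    if not followers:
        return None  # `prev` was never followed by anything in training
    words = list(followers)
    weights = [followers[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(counts, start, length, seed=0):
    """Generate up to `length` words by repeatedly sampling the next word."""
    rng = random.Random(seed)  # seeded so a given run is reproducible
    out = [start]
    for _ in range(length - 1):
        nxt = sample_next(counts, out[-1], rng)
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical toy corpus, chosen only for illustration.
    corpus = "the parrot repeats the phrase and the parrot repeats the slogan"
    model = train_bigram(corpus)
    print(generate(model, "the", 8))
```

The sampling loop is the "stochastic" part, and it keeps roughly the same shape however the next-token probabilities are produced. What the sketch leaves out is everything interesting: an LLM replaces the lookup table with a large learned network conditioned on the whole context, which is exactly the complexity the label says nothing about.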

The difference between a 1B LLM and Claude Opus matters, because we're talking about emergent phenomena. Is a 1B LLM conscious? I don't know, perhaps a tiny amount. Maybe Opus is more conscious. Is a jumping spider conscious? Perhaps a tiny amount.

reply