undefined

points

by D-Machine4 hours ago |

comments

by whattheheckheck2 minutes ago|

[-]

Do you think we actually exist?

by observationist3 hours ago|

prev|

[-]

Some sort of software like ComfyUI with variable application of model specific personality traits would be great - increase conscientiousness, decrease neuroticism, increase openness, etc. Make it agentic; have it do intermittent updates based on a record of experiences, and include all 27 emotional categories, with an autonomous update process so it adapts to interactions in real time: https://www.pnas.org/doi/10.1073/pnas.1702247114

Could be very TARS like, lol.

It'd also be interesting to do a similar rolling record of episodic memory, so your agent has a more human like memory of interactions with you.

Another thing to consider about LLMs is that the nature of the training and the core capability of transformers is to mimic the function of the processes by which the training data was produced; by training on human output, these LLMs are in many cases implicitly modeling the neural processes in human brains which resulted in the data. Lots of hacks, shortcuts, low resolution "good enough" approximations, but in some cases, it's uncovering precisely the same functions that we use in processing and producing information.

by D-Machine3 hours ago|

parent|

[-]

> Another thing to consider about LLMs is that the nature of the training and the core capability of transformers is to mimic the function of the processes by which the training data was produced; by training on human output, these LLMs are in many cases implicitly modeling the neural processes in human brains which resulted in the data. Lots of hacks, shortcuts, low resolution "good enough" approximations, but in some cases, it's uncovering precisely the same functions that we use in processing and producing information.

I would argue this is deeply false, my classic go-to examples being that neural networks have almost no real relations to any aspects of actual brains [1] and that modeling even a single cortical neuron requires an entire, fairly deep neural network [2]. Neural nets really have nothing to do with brains, although brains may have loosely inspired the earliest MLPs. Really NNs are just very powerful and sophisticated curve (manifold) fitters.

> Could be very TARS like, lol.

I just rewatched Interstellar recently and this is such a lovely thought in response to the paper!

[1] https://en.wikipedia.org/wiki/Biological_neuron_model

[2] https://www.sciencedirect.com/science/article/pii/S089662732...

by observationist2 hours ago|

parent|

[-]

>> I would argue this is deeply false

I am making the case that this is distinctly and specifically true, for these types of models. They're eliciting many of the underlying functions and processes that brought about the data; transformers are able to model the higher degrees of abstraction that previous neural architectures could not. This was one of the major features of transformers that make them so powerful.

It's comparable to the idea that if you trained a model to output human sounding speech, many of the functions that shape the voice will correspond to the physical attributes that affect the sound of actual human voices. Volume of the mouth, shape of the lips, position of teeth, what the tongue does, etc - some of those things will be captured, others will be mashed into "good enough" , and others will be captured as an optimization possible in silicon but not for flesh and blood. It's not a one to one correspondence, but capturing process semantics and abstractions is why we have ChatGPT with transformers and not CNNs (although RNNs could have pulled it off back in the 90s, see: RWKV)

Anyway - the training methods, the paradigm of next token generation (in contrast to things like diffusion) and other aspects of LLMs restrict them to a subset of human capabilities, but it's reasonable to make the claim that many of the same functions that operate in Werncke's area and Broca's area in the human brain are resident in transformers. Many of the same associations between language and emotion and those abstract correlations - unspecified, implicit context that exists in the training data, but only as a deep subtext, sometimes even distributed across many texts, like cultural trends and so forth - are modeled by LLMs, not as an explicit feature of the data, but an implicit feature or function of the processes which produced the data.

Plus, there seems to be some support for the idea that for intelligent systems, modeling the world will result in comparable structures, networks, and features for similar concepts and knowledge - because you're modeling consistent, persistent things using modalities that are shared, or overlap, the way in which things are modeled converges on "universal" forms, simply due to constraints of utility and efficiency.

https://arxiv.org/abs/2405.07987

by Nevermark4 hours ago|

prev|

[-]

Agreed.

Everything in a model is a correlation of behavior with context and context with behavior.

"Mind set" is a factor across the continuum of scales.

Are we solving a math problem or deciding on entertainment? We become entirely "different brains" in those different contexts, as we configure our behavior and reasoning patterns accordingly.

The study is still interesting. The representation, clustering, and bifurcations of roles may simply be one end of a continuum, but they are still meaningful things to specifically investigate.

by devmor4 hours ago|

prev|

[-]

Thank you, I came here to say so much in less eloquent terms.

It's not surprising to find clustered sentiment from a slice of statistically correlated language. I wouldn't call this a "personality" any more than I would say the front grill of a car has a "face".

Deterministically isolating these clusters however, could prove to be an incredibly useful technique for both using and evaluating language models.

by D-Machine4 hours ago|

parent|

[-]

It's not even really the researchers' fault, academic psychological personality research is in general philosophically very weak / poor, in that they also almost always conflate "models of / talking about personality" with actual personality, and rarely actually check if things like the MBTI or Five-Factor Model actually correlate meaningfully with real behaviours.

Those that do find correlations between self-reported personality and actual behaviours tend to find those to be in a range of something like 0.0 to 0.3 or so, maybe 0.4 if you are really lucky. Which means "personality" measured this way is explaining something like 16% of the variance in behaviour, at max.

by devmor4 hours ago|

parent|

[-]

I don’t think this is even limited to this part of academia - or academia at all, but I do think it’s a bit irresponsible of them to assume prior rigor in those personality tests.

On top of that, a confounding issue is that human nature is to anthropomorphize things. What is more likely to be anthropomorphized than a construct of written language - the now primary method of knowledge transfer between humans? I can’t help but feel that this wishful bias contributes to missing the due diligence of choosing an appropriate metric with which to measure.

by D-Machine3 hours ago|

parent|

[-]

Yup, I agree it is a general problem, and related to a tendency to over-anthropomorphize. At least in this case there was still something pretty good in the paper anyway.