"Just don't accidentally forget to do the thing that makes it safe" is not a very effective strategy for something that so many vested interests are trying to push into all corners of society. If it's so easy to misuse it, then it shouldn't be used in any context outside of where there are no major consequences for bad output and there's amble opportunity and ability to validate it
reply
Not really. They're still non-deterministic language predictors. Believing that a prompt is an effective way to actually control these machines' behavior is really far-fetched.

They come like that from the factory. Hardcoded to never say no.

reply
They're not hardcoded to never say no, but some of the models were trained to be "yes men" because their creators thought it would be a good property to have. GPT-4o for example.
reply
> non deterministic language predictors.

Non?? Only those with sh*tty code, surely.

There's nothing inherently non-deterministic about inference.
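A toy sketch of the point (plain NumPy, made-up logits, not any particular model's API): greedy decoding is a pure argmax and produces the same token every run, and even sampling is reproducible if you pin the seed.

    import numpy as np

    def softmax(logits):
        z = np.exp(logits - logits.max())
        return z / z.sum()

    # Made-up "next-token" logits standing in for a model's output.
    logits = np.array([2.0, 1.0, 0.5, -1.0])

    # Greedy decoding: a pure argmax, identical on every run.
    greedy_token = int(np.argmax(logits))

    # Sampling is only non-deterministic if you let it be; with a
    # fixed seed the draw is reproducible too.
    rng = np.random.default_rng(seed=0)
    sampled_token = int(rng.choice(len(logits), p=softmax(logits)))

(In practice, floating-point non-associativity and batched GPU kernels can introduce small run-to-run differences, but that's an implementation detail, not anything inherent to inference.)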

reply
The thing is that they are completely incapable of meta-cognition. Reasoning models don’t show their actual reasoning at all.
reply
Right — they're not reasoning, they're generating text that statistically models reasoning. Anyone who says differently is selling something.
reply
As the meme goes, "they are the same picture".
reply
Language has reasoning encoded within it.
reply
It certainly does. But so too do complex neural network functions, as do attention mechanisms.
reply
That is what a base model does. After RL it is a very different thing, and anyone who says they know what it is, is naive or dishonest. These things are grown, not made, and we really do not understand how they work in many important ways.
reply
Yeah, but they’re not magic; we can still do experiments and see what happens. Anthropic did a lot of work on this and showed that they’re not accurately describing their reasoning process.
reply
Of course, the fact that they have to do that proves my point.
reply
Believing that a prompt is not an effective way to actually control their behavior is obviously incorrect to anyone who's actually used these things.

It's not a guaranteed way to control their behavior, but you can more than move the needle.

reply
The word most relevant to this conversation is “influence.” Influence is possible and users observe it and use it to increase margins of useful outcomes. “Control” is incorrect.
reply
Yeah, that distinction is pretty important, and I believe that IS the point that guy is making: if you cannot control it with guaranteed outcomes, you cannot control it.
reply
You can't control it any more than you can control a draw from a deck of cards, but you can absolutely control the deck of cards that you choose to draw from.
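To make the analogy concrete, here's a rough sketch (toy NumPy, made-up numbers) of "choosing the deck": temperature and top-k don't pick the token that comes out, but they decide which tokens are even in the deck and how heavily each one is weighted.

    import numpy as np

    def shaped_distribution(logits, temperature=0.7, top_k=3):
        # Build the "deck" we draw from: rescale, keep top-k, renormalize.
        scaled = logits / temperature            # sharpen or flatten preferences
        keep = np.argsort(scaled)[-top_k:]       # only the k likeliest tokens stay in the deck
        masked = np.full_like(scaled, -np.inf)
        masked[keep] = scaled[keep]
        z = np.exp(masked - masked[keep].max())
        return z / z.sum()

    logits = np.array([3.0, 2.5, 0.2, -1.0, -2.0])  # made-up next-token scores
    deck = shaped_distribution(logits)
    token = int(np.random.default_rng().choice(len(deck), p=deck))  # the draw is random; the deck wasn't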
reply
The problem is that nobody really does that? Like, as far as I'm aware, even simple stuff such as not considering tokens that would result in a syntax error when writing code isn't being done.
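For what it's worth, the mechanism for that kind of constraint is conceptually simple: before sampling, mask out every token an incremental parser would reject. A hand-wavy sketch, where is_valid_next_token is a hypothetical stand-in for a real parser:

    import numpy as np

    def mask_invalid_tokens(logits, vocab, prefix, is_valid_next_token):
        # Give zero probability to any token the parser rejects after `prefix`.
        masked = logits.copy()
        for i, tok in enumerate(vocab):
            if not is_valid_next_token(prefix, tok):   # hypothetical syntax check
                masked[i] = -np.inf
        z = np.exp(masked - masked.max())
        return z / z.sum()

    # Toy example: right after "def f(", a closing brace is never valid Python.
    vocab = [")", ":", "}", "x"]
    logits = np.array([1.0, 0.5, 2.0, 0.8])
    probs = mask_invalid_tokens(logits, vocab, "def f(",
                                lambda prefix, tok: tok != "}")
    # "}" now has probability 0; the model literally can't emit it.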
reply
Magicians can probably make you change your mind on the former.
reply
That's silly. My car is not absolutely guaranteed to turn left when I turn the steering wheel left, but you wouldn't say I can't control my car on that basis.

Steering an LLM with a prompt is way less reliable than steering a car with a steering wheel, but there's still control. It's just not absolute.

reply
If your car doesn't turn left when you turn the steering wheel left, the problem is that the car is broken. If an LLM does something unexpected after you gave it instructions, that can happen even when the LLM is functioning entirely correctly.
reply
Nothing in this world is guaranteed, but that doesn't mean it's uniformly random either. LLMs can still do something unexpected even when you give them clear instructions, but that doesn't mean the result will be arbitrary and unpredictable in scope. It's the same way C/C++ undefined behavior technically means a program can give you nasal demons, but in reality it won't do anything that unusual (like format your C: drive) unless someone purposefully coded it to do that.
reply
This is all going to flash through your mind when your car mysteriously doesn't turn left. I would prefer to think of machines as things with defined outputs, where failure is failure, rather than as fluffy little kittens that might do the wrong thing, especially if the consequences are going to fall on someone who doesn't deserve it.
reply