Hacker News
new
past
comments
ask
show
jobs
points
by
stretchwithme
13 hours ago
|
comments
by
rerdavies
12 hours ago
|
[-]
That is actually a open problem with current models: whether they will act on self-interest or not. There seems to good evidence that they will. See:
https://www.anthropic.com/research/agentic-misalignment
which (among other things) documents an experiment in which a current-gen AI model attempted to blackmail someone in order to prevent it from being turned off.
reply