Hacker News
new
past
comments
ask
show
jobs
points
by
embedding-shape
12 hours ago
|
comments
by
gchamonlive
12 hours ago
|
[-]
Just assume each model iteration incorporates all the censorship prompts before and compile the possible list from the system prompt history. To validate it, design an adversary test against the items in the compiled list.
reply