upvote
The good regulator theorem makes that a little difficult.
reply
I'm working on new technology where you separate the instructions and the variables, to avoid them being mixed up.

I call it `prepared prompts`.

reply
It’s not that simple.

That would result in a brittle solution and/or cat and mouse game.

The text that goes into a prompt is vast when you consider common web and document searches are.

It’s going to be a long road to good security requiring multiple levels of defense and ongoing solutions.

reply
If only we had a reliable way to detect that a poster was being sarcasm or facetious on the Internet.
reply
The solution is to sanitize text that goes into the prompt by creating a neural network that can detect sarcasm.
reply
Unfortunately it takes ~9 months just to build that network up to the point where you can start training it, and then the training itself is literally years of hard effort.
reply
Ah, the Seinfeld Test.
reply
A sarcasm machine is finally within our reach
reply
I assumed beeflet was being sarcastic.

There’s no way it was a serious suggestion. Holy shit, am I wrong?

reply
I was being half-sarcastic. I think it is something that people will try to implement, so it's worth discussing the flaws.
reply
Turtles all the way down; got it.
reply
Isn't that just another guardrail that can be bypassed much the same as the guard rails are currently quite easily bypassed? It is not easy to detect a prompt. Note some of the recent prompt injection attack where the injection was a base64 encoded string hidden deep within an otherwise accurate logfile. The LLM, while seeing the Jira ticket with attached trace , as part of the analysis decided to decode the b64 and was led a stray by the resulting prompt. Of course a hypothetical LLM could try and detect such prompts but it seems they would have to be as intelligent as the target LLM anyway and thereby subject to prompt injections too.
reply
reply
This is genius, thank you.
reply
We need the severance code detector
reply
This adds latency and the risk of false positives...

If every MCP response needs to be filtered, then that slows everything down and you end up with a very slow cycle.

reply
I was sure the parent was being sarcastic, but maybe not.
reply