They are suggesting that you should assume the user has full access to the same tools as the agent, which is a helpful way to approach it. You mentioned the prompt side of things, and I think the same mindset applies there: just assume the user can read the entire prompt exactly as it's sent.
reply
You should also assume the user can read any data you send back from a tool call and any data you add to a user response. If any part of the input or output is controllable by an attacker, assume a prompt injection is possible that gives them access to all the data and tool calls the agent has, or has ever had, access to.
reply
Yes, that's part of the "entire prompt"
reply
Agreed. The agent and the tools are different types of vulnerabilities. Both are important, especially if you have dedicated finetuning (which won't be user dependent, of course).

But there's also stuff like RAG: support agents usually have access to all the internal support knowledge base material, including stuff you don't want to leak verbatim. And there are other things to consider too, like your agent being used to run other people's prompts. That's not a data loss issue, but it could be a financial one.
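To make the RAG point concrete, here's a rough sketch of filtering retrieved material before it ever reaches the prompt, so anything the model sees is already safe to leak verbatim. The `Doc` class, the `visibility` labels, and both helper functions are made up for illustration; a real knowledge base would have its own access metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    visibility: str  # hypothetical labels: "public" or "internal"


def customer_safe(docs):
    """Keep only documents that are fine to show a customer verbatim."""
    return [d for d in docs if d.visibility == "public"]


def build_context(docs):
    """Build the retrieval context from customer-safe documents only."""
    return "\n\n".join(d.text for d in customer_safe(docs))


# Hypothetical knowledge base entries, for illustration only.
kb = [
    Doc("faq-1", "Reset your password from the login page.", "public"),
    Doc("runbook-7", "Internal escalation: refund limits per tier.", "internal"),
]

context = build_context(kb)
```

The filter runs outside the model, so no prompt injection can talk its way around it: the internal runbook is simply never in the context.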

But yes, I do agree that for the tools' security, the agent shouldn't be considered part of the security model. Any protections there are nice to have but shouldn't be relied upon.
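As a minimal sketch of what that could look like: the tool checks authorization itself, using the authenticated session of the human user, and treats whatever the agent asks for as untrusted input. Everything here (`Session`, `ORDERS`, `get_order`, `handle_tool_call`) is a hypothetical name, not any particular framework's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    user_id: str  # set by your auth layer, never by model output


class Forbidden(Exception):
    pass


# Hypothetical data store, for illustration only.
ORDERS = {
    "o-1": {"owner": "alice", "total": 42},
    "o-2": {"owner": "bob", "total": 7},
}


def get_order(session: Session, order_id: str) -> dict:
    """Return an order only if it belongs to the authenticated user."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so the tool leaks nothing.
    if order is None or order["owner"] != session.user_id:
        raise Forbidden(order_id)
    return order


def handle_tool_call(session: Session, name: str, args: dict) -> dict:
    """Dispatch a tool call requested by the agent.

    The arguments come from the model and may be attacker-controlled;
    the session comes from the authenticated request.
    """
    if name != "get_order":
        raise Forbidden(name)
    return get_order(session, str(args.get("order_id", "")))
```

With this shape, a successful prompt injection can only ask for things the current user could already fetch by calling the tool directly.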

reply