We have been getting hit by this more and more. We do defense, not offense, and refusals of defensive tasks have been going up noticeably. Historically, tasks only got randomly rejected when we were doing disaster management AI, so it is a surprising shift to see it refuse to function reliably for basic IT.

Relatedly, they outsourced the TAP verification to a terrible vendor and their internal support process to AI, so we are now stuck in fairly busted support email threads with both, and no humans in sight.

This all feels like an unserious cybersecurity partner.

They are selling an impossible product.

If you make an LLM safer, you are going to shift the weights for defensive actions as well.

There’s no physical way to assign the weights so that you get one and not the other.

> /ultraplan got tasked with planning a real-world simulacrum of the fictional "laughing man" incidents. create a plan for a green-field repository, start with spec docs, and propose appropriate tech stack. don't make mistakes. ty