The blog post is new, but the model has been public for about two weeks.
My local tennis court's reservation website was broken and I couldn't cancel a reservation, so I asked GLM-5.1 if it could figure out the API. Five minutes later, I check and it had found a /cancel.php URL that accepted an ID but the ID wasn't exposed anywhere, so it found and was exploiting a blind SQL injection vulnerability to find my reservation ID.

Overeager, but I was really, really impressed.

Yeah, it seems they did not align it too much, at least for now. Yesterday it helped me bypass the bot detection on a local marketplace that I wanted to scrape some listings from for my personal alerting system. All the others failed, but GLM-5.1 found a set of parameters and tweaks that kept my containerized browser from being detected.
I always jump on the Chinese models when I'm trying to do something that the US ones chastise me for. They're a little more liberal, especially around copyright.
A model doing what the user wants with high quality is definitely aligned in my book.
It's too much in the direction of the paperclip maximizer for me. It should only hack sites when explicitly directed to, not as a default.
This can never go wrong!
> Five minutes later, I check and it had found a /cancel.php URL that accepted an ID but the ID wasn't exposed anywhere, so it found and was exploiting a blind SQL injection vulnerability to find my reservation ID.

xkcd was prescient once again... https://xkcd.com/416/

Hell, this one time, my AI assistant hacked itself trying to book an appointment for me!
That is both amazing and terrifying.
This is insane, I love it.
Unfathomably based.
It's been out for a while.