Vercel also released a similar tool with a unique interface/DSL: https://github.com/vercel-labs/agent-browser

> agent-browser click "#submit"
> agent-browser fill "#email" "test@example.com"
> agent-browser find role button click --name "Submit"

I appreciate that there’s innovation in the space; it will get us closer to the interface that’s most appropriate for models to tool-call. I’m going to check your link out, it sounds interesting.

it will be interesting to see how things play out, but I really believe that Claude Code (maybe with Opus 4.6) + click tool + move_mouse tool + snapshot page tool + another 114 tools is definitely not the best approach

the main issue with this interface is that the commands are too low-level and that there is no way of controlling the context over time

once a snapshot is added to the context, those tokens take up very precious context window space for the rest of the session, leading to context rot, higher cost, and higher latency
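
to make the context-control point concrete, here is a minimal sketch of one way a harness could keep snapshots from piling up: replace every snapshot except the newest with a short stub before each model call. The message shape (`kind`, `content`), the `prune_snapshots` helper, and the stub text are all hypothetical and not taken from any of the tools mentioned in this thread.

```python
from typing import Dict, List

# Hypothetical message shape: {"kind": "snapshot" | "action" | ..., "content": str}
Message = Dict[str, str]


def prune_snapshots(
    history: List[Message],
    keep_last: int = 1,
    stub: str = "[snapshot omitted]",
) -> List[Message]:
    """Return a copy of history with all but the newest `keep_last` snapshots stubbed out."""
    snapshot_idx = [i for i, m in enumerate(history) if m.get("kind") == "snapshot"]
    # Everything except the last `keep_last` snapshot positions is considered stale.
    stale = set(snapshot_idx[:-keep_last]) if keep_last > 0 else set(snapshot_idx)
    return [
        {**m, "content": stub} if i in stale else dict(m)
        for i, m in enumerate(history)
    ]


def context_chars(history: List[Message]) -> int:
    """Character count as a crude stand-in for token count."""
    return sum(len(m["content"]) for m in history)


history = [
    {"kind": "snapshot", "content": "<html>" + "a" * 5000 + "</html>"},
    {"kind": "action", "content": 'click "#submit"'},
    {"kind": "snapshot", "content": "<html>" + "b" * 5000 + "</html>"},
    {"kind": "action", "content": 'fill "#email" "test@example.com"'},
    {"kind": "snapshot", "content": "<html>" + "c" * 5000 + "</html>"},
]

pruned = prune_snapshots(history)
print(context_chars(history), "->", context_chars(pruned))
```

the actions stay in the history so the model still sees what it did, only the stale page dumps are dropped. Whether a stub, a summary, or a diff works best is an open design choice.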

that's why agents need to use very large models for these kinds of systems to work and, unfortunately, even then they're very slow, expensive, and less reliable than a purpose-made system

I wonder if a standardized interface will organically emerge over time. At the moment SKILL.md + CLI seems to be the most broadly adopted interface, maybe even more than MCP
