So LLMs are not bad at generating selectors, but they always need a loop to test them in, plus rules to steer them toward the best selectors. For example, when you have an LLM write a Playwright script to scrape a product page on an e-commerce site, the agent checks the DOM while it builds the script and, based on that, generates the selectors it puts inside the "page.locator" calls. In theory the selectors look great, but in practice the agent won't land a working selector on the first try. There are a few reasons for this. One is that LLMs can't see the whole DOM (HTML DOMs are usually massive). Another is that the LLM isn't running a selector-matching algorithm in its head, which is very similar to giving an LLM a complex math problem without a calculator or coding tool. When the agent then tries to fix the selector, a few things happen: it starts adding more and more HTML attributes to match the target element, and it fills the context with error messages that were only needed to fix the broken selectors and are useless afterwards.
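To make the "loop to test it in" idea concrete, here is a rough sketch of what I mean, not a definitive implementation. The names `pick_selector` and `count_matches` are made up for illustration. `count_matches` is any function that returns how many elements a selector matches; with Playwright's Python sync API it could be something like `lambda s: page.locator(s).count()`. The loop accepts the first candidate that matches exactly one element and returns one short line of feedback per failed candidate, instead of dumping full error messages into the context.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def pick_selector(
    candidates: Iterable[str],
    count_matches: Callable[[str], int],
) -> Tuple[Optional[str], List[str]]:
    """Try candidate selectors in order (hypothetical helper).

    Returns the first selector that matches exactly one element,
    plus a compact feedback line for every candidate that failed.
    """
    feedback: List[str] = []
    for selector in candidates:
        try:
            n = count_matches(selector)
        except Exception as exc:
            # Keep only the error type, not the full message or traceback.
            feedback.append(f"{selector!r}: error ({type(exc).__name__})")
            continue
        if n == 1:
            return selector, feedback
        feedback.append(f"{selector!r}: matched {n} elements, expected 1")
    return None, feedback


if __name__ == "__main__":
    # Stand-in for a real page: selector -> number of matching elements.
    fake_counts = {".price": 3, "#product .price": 1}
    chosen, notes = pick_selector(
        [".price", "#product .price"], fake_counts.__getitem__
    )
    print(chosen)
    print(notes)
```

The point of passing the counter in as a function is that the agent only ever gets back a match count and a one-line note per attempt, so the context stays small while it iterates.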