
by k1m 9 hours ago


by 0-_-0 9 hours ago

Agents don't buy stuff they see in an ad

by Retr0id 9 hours ago

So why serve them at all?

by Gigachad 8 hours ago

If your website itself advertises a product or service you sell, you would still want LLMs to see and fetch it. If you run a news site, blog, or any other website that doesn’t exist to sell something, you are only harmed by AI agents.

by Retr0id 8 hours ago

In those situations you wouldn't have ads on the human version of the site either, surely?

by mcmcmc 5 hours ago

Sure, if it’s paywalled. Web hosting isn’t free

by fullstackchris 6 hours ago

Modern agents already do this via content negotiation and will attempt to retrieve the Markdown version of a given site:

https://www.sanity.io/learn/course/markdown-routes-with-next...
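For illustration, a minimal sketch of what server-side content negotiation could look like. It assumes a page that exists in both HTML and Markdown form and an agent that sends `text/markdown` in its `Accept` header. The function names `parse_accept` and `choose_representation` are hypothetical and are not taken from the linked course.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1.0 when the client does not give one
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header: str) -> str:
    """Return "markdown" or "html" for a hypothetical page that has both."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "text/markdown":
            return "markdown"
        if media_type in ("text/html", "text/*", "*/*"):
            return "html"
    return "html"  # assumption: fall back to HTML when nothing matches
```

A browser-style header such as `text/html,application/xhtml+xml,*/*;q=0.8` selects HTML here, while `text/markdown, text/html;q=0.8` selects Markdown. Both representations live at the same URL, which is the difference from a separate llms.txt file.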

by k1m 5 hours ago

But that isn't that different from requesting the llms.txt version. Why not just make it so the useful content you want the LLM to focus on is easily retrievable from the same HTML the user's browser gets?

The sanity.io page writes:

> serving agents a bunch of HTML might just bloat their context window.

That's only true if you assume the agent can't extract the useful text before it goes into the model as tokens. Your browser's reader mode uses heuristics to identify the actual content in a large HTML response and strips away the rest.

To me this is a far better approach than worrying about llms.txt files or inspecting HTTP headers to see whether markdown is preferred. That effort could just as easily go into ensuring the useful content on your site carries the appropriate markup for an agent or any other tool to extract it, and it would require less work for the publisher of the content.
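As a rough sketch of the extraction step described above, the following uses only Python's standard `html.parser` module. It assumes the publisher wraps the useful content in `<article>` or `<main>`. The tag sets and the names `MainContentExtractor` and `extract_main_text` are illustrative choices. Real reader modes score candidate nodes with far more elaborate heuristics than this.

```python
from html.parser import HTMLParser

CONTENT_TAGS = {"article", "main"}  # assumed markers of the useful content
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "form"}
BLOCK_TAGS = {"p", "div", "li", "br", "tr", "blockquote", "pre",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class MainContentExtractor(HTMLParser):
    """Collect text inside <article>/<main>, ignoring chrome such as nav."""

    def __init__(self) -> None:
        super().__init__()
        self.content_depth = 0  # how many content containers we are inside
        self.skip_depth = 0     # how many skipped containers we are inside
        self.parts: list[str] = []

    def _collecting(self) -> bool:
        return self.content_depth > 0 and self.skip_depth == 0

    def handle_starttag(self, tag, attrs):
        if tag in CONTENT_TAGS:
            self.content_depth += 1
        elif tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS and self._collecting():
            self.parts.append("\n")  # block boundaries become line breaks

    def handle_endtag(self, tag):
        if tag in CONTENT_TAGS and self.content_depth > 0:
            self.content_depth -= 1
        elif tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS and self._collecting():
            self.parts.append("\n")

    def handle_data(self, data):
        if self._collecting():
            self.parts.append(data)


def extract_main_text(html: str) -> str:
    """Return the main text of a page, one block element per line."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).split("\n"))
    return "\n".join(line for line in lines if line)
```

An agent could run something like this on the same HTML a browser receives and pass only the result to the model. A page without `<article>` or `<main>` yields an empty string in this sketch, so a fallback heuristic would be needed for sites that lack such markup.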