Why would this be surprising? That’s exactly how much of the code they were trained on is presented in PRs, forums, etc.
Is that true? That depends on how their web scraping works, like whether it runs client-side highlighting, strips out HTML tags, etc.
The highlighting isn't what matters; it's the text that precedes the code. E.g., an LLM that sees "```python" before a code block will better recall Python code blocks written by people who prefixed them that way.
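To make the scraping question concrete, here is a rough sketch of two ways an extractor might handle the same HTML. It is not how any actual training pipeline works. The class names, the `language-` CSS class convention, and the sample snippet are all assumptions for illustration. One extractor only strips tags; the other keeps the language label as a fence prefix.

```python
from html.parser import HTMLParser

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


class TagStripper(HTMLParser):
    """Hypothetical naive extractor: keeps text, drops every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


class FencePreserver(HTMLParser):
    """Hypothetical extractor that turns <pre><code class="language-x"> into a fenced block."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.in_pre = True
        elif tag == "code" and self.in_pre:
            # Assumes the page marks the language as a "language-<name>" class.
            classes = (dict(attrs).get("class") or "").split()
            lang = next(
                (c[len("language-"):] for c in classes if c.startswith("language-")),
                "",
            )
            self.parts.append("\n" + FENCE + lang + "\n")

    def handle_endtag(self, tag):
        if tag == "code" and self.in_pre:
            self.parts.append("\n" + FENCE + "\n")
        elif tag == "pre":
            self.in_pre = False

    def handle_data(self, data):
        self.parts.append(data)


def extract(parser_cls, html):
    """Run the given parser class over the HTML and return the collected text."""
    parser = parser_cls()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Made-up sample page fragment.
SAMPLE = '<p>Try this:</p><pre><code class="language-python">print("hi")</code></pre>'

stripped = extract(TagStripper, SAMPLE)   # language label is lost
fenced = extract(FencePreserver, SAMPLE)  # language label survives as a fence prefix
print(stripped)
print(fenced)
```

With the first extractor the model never sees the language label at all; with the second, the label lands directly in front of the code, which is the prefix effect described above.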