Haven't seen anything in particular about that, but lots of the documents with half-redacted names contain OCR'd text that is completely garbled, yet olmocr-2-7b seems to handle those documents just fine. Unsure if they just had sucky processes or if something else is going on.
Might be a good fit for uploading to a git repo and crowdsourcing.
Was my first impulse too, but I'm not sure I'd trust that unless I could gather a bunch of people I trust, which would mean I'd no longer be anonymous. Kind of a catch-22.
GitHub would ban you.