It seems to depend heavily on what exactly you're transcribing; performance and quality are really uneven between models. Some work really well for old cursive but then fail at reading 8-bit segment LCD digital fonts, or vice versa, or any other combination out there.

Basically, to find the answer you need your own benchmark, run on real examples of what you want to do. The same goes for anything ML nowadays, as the public benchmarks can't really be trusted to give you any indication of how well it'd work for you.
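A minimal sketch of what such a personal benchmark could look like, assuming you have hand-checked transcriptions of your own samples and have already collected each model's output for them. It scores models by character error rate (edit distance divided by reference length), which is one common choice and not the only one. The function names, model names, and sample strings below are all hypothetical placeholders, not real results.

```python
from statistics import mean


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of hypothesis against a ground-truth reference."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)


def score_models(references, outputs_by_model):
    """Mean CER per model; lower is better."""
    return {
        model: mean(cer(ref, out) for ref, out in zip(references, outputs))
        for model, outputs in outputs_by_model.items()
    }


# Placeholder data: replace with your own ground truth and real model outputs.
references = ["12:45", "Dear Sir"]
outputs_by_model = {
    "model_a": ["12:45", "Dear Sic"],
    "model_b": ["12:46", "Dear Sir"],
}

if __name__ == "__main__":
    for model, score in sorted(score_models(references, outputs_by_model).items()):
        print(f"{model}: mean CER = {score:.3f}")
```

Keeping separate sample sets per document type (cursive, LCD digits, and so on) and scoring each one on its own would surface the kind of unevenness described above, which a single averaged number would hide.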

by InsideOutSanta 10 hours ago

It's really good. I didn't do any type of statistical evaluation or comparison to other models, but it's so good that it doesn't matter to me if there's an option that might be even better.

by potsandpans 3 hours ago

This just dropped: https://huggingface.co/baidu/Unlimited-OCR

It can run comfortably on 12 GB of VRAM. I gave it a whirl and it does seem pretty competitive. I wonder how it compares for your use case.

by nok22kon 10 hours ago

Curious whether you've tried local LLMs for OCR, like Gemma4, or whether your volume is too high for that.

by InsideOutSanta 10 hours ago

Haven't tried them in a while, so I can't comment on current performance.