All AI companies are working on specialist models, each of which is really good at one task.

Mistral is just a bit more forward about this, I guess because they don't need or want to "wow" an audience with generalist user-facing tools (chat) that seem to be experts in everything (but in reality are quite often a lot of such specialist models chained together).

Here, what you want is really just a few Python scripts away: Voxtral to turn your spoken prompt into text, piped into Mistral Large 3 with extra system prompts so that it produces a prompt for OCR and paths to files. It could do this in a loop to actually find those files. You then throw those files at OCR 3, and the result is passed back to Mistral Large 3 to interpret and turn into decisions.
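
To make that chain concrete, here is a rough sketch of the glue code. The four callables (`speech_to_text`, `planner`, `ocr`, `decider`) are hypothetical stand-ins for whatever client calls you would actually make to the speech, language and OCR models; none of them is a real Mistral API, and the `Plan` type and retry feedback string are my own assumptions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Plan:
    """What the planning model is assumed to return (hypothetical shape)."""
    ocr_prompt: str
    paths: List[str]


def run_pipeline(
    audio: str,
    speech_to_text: Callable[[str], str],   # stand-in for the speech model
    planner: Callable[[str, str], Plan],    # stand-in for the LLM planning step
    ocr: Callable[[str, str], str],         # stand-in for the OCR model
    decider: Callable[[str, List[str]], str],  # stand-in for the LLM decision step
    max_rounds: int = 3,
) -> str:
    # 1. Spoken prompt -> text.
    request = speech_to_text(audio)

    # 2. Ask the planner for an OCR prompt and file paths, looping with
    #    feedback until every proposed path actually exists on disk.
    feedback = ""
    for _ in range(max_rounds):
        plan = planner(request, feedback)
        missing = [p for p in plan.paths if not Path(p).is_file()]
        if plan.paths and not missing:
            break
        feedback = "not found: " + ", ".join(missing)
    else:
        raise FileNotFoundError(feedback or "planner returned no paths")

    # 3. OCR each file, then hand the text back for interpretation.
    pages = [ocr(path, plan.ocr_prompt) for path in plan.paths]
    return decider(request, pages)
```

Each stage is just a function from one model's output to the next model's input, which is why swapping in a different specialist model for any single step is cheap.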

This is common. It's rather uncommon, really, to build something like this using only one model for everything.

Why would anybody do that? You would simply get terrible results compared to dozens of other, more capable models. It's for converting to text, not answering questions. It just seems like you need some sort of weird angle to bring out an anti-AI stance.
Guess you haven't met management yet. Clearly nobody should do that, but that official warning is not going to stop them from trying.
I think his comment is referring to a scenario where a decision is made on financial numbers that are misrecognized, e.g. an actual 9.0% is OCR’d as 90%.
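
For what it's worth, that particular failure (a dropped or shifted decimal point) is cheap to flag if you have a second reading of the same figure, say from a ledger or a second OCR pass. A minimal sketch, assuming both readings arrive as plain numeric strings with an optional trailing `%`; the function name is made up for illustration:

```python
from decimal import Decimal


def same_digits_different_scale(a: str, b: str) -> bool:
    """True if two readings share the same significant digits but differ
    in value, which is the signature of a misplaced decimal point."""
    da = Decimal(a.strip().rstrip("%"))
    db = Decimal(b.strip().rstrip("%"))
    if da == db:
        return False  # same value, nothing to flag
    # normalize() strips trailing zeros, so 9.0 and 90 both reduce to digit 9.
    digits_a = da.normalize().as_tuple().digits
    digits_b = db.normalize().as_tuple().digits
    return digits_a == digits_b


suspect = same_digits_different_scale("9.0%", "90%")
```

This only catches scale errors; a misread digit (a 3 read as an 8) needs a different check, and none of it replaces a human looking at the source document before acting on the number.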
“I delegated critical financial decisions to my OCR software, and you won’t believe what happened next.”