Yes, I'm saying

> Imagine face recognition to work like a text chat, where the PC gets the frame from the camera and writes in the chat: "Who's that? Here's the RGB888 image in hex: ...".

that's pretty much how it works.

But that isn’t a specialized model like the grandparent claimed; it's a single multi-modal model.

Yes, the "imagine" scenario was meant to showcase the opposite of a specialized model, to argue that it's a bad idea.