Exactly, and all on an embedded system with quite restrictive constraints, not an overclocked latest-generation Intel CPU combined with NVIDIA's 10k graphics cards.
Embedded systems can make network calls to powerful, GPU-equipped servers.
Sure. Claude does that. "Cogitated for 1m 50s" doesn't work for real-time applications.
You can submit many queries in parallel to increase throughput. Smaller models and faster hardware can reduce the time per query too.
None of that gets you the 100ms response time the parent poster talked about for real-time uses like "who is at my doorbell?".
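To make the throughput vs. latency point concrete, here is a toy model in Python. It is only a sketch: `batch_stats` and `meets_budget` are made-up helper names, the 110 s figure is just the "1m 50s" quoted above, and it assumes every query takes the same fixed time with no queueing or network overhead.

```python
import math


def batch_stats(num_queries, per_query_seconds, workers):
    """Toy model: each query takes a fixed time; `workers` queries run at once."""
    # Queries are processed in waves of at most `workers` at a time.
    waves = math.ceil(num_queries / workers)
    total_seconds = waves * per_query_seconds
    return {
        # Time a single caller waits for its own answer: unchanged by parallelism.
        "latency_s": per_query_seconds,
        # Wall-clock time to finish the whole batch.
        "total_s": total_seconds,
        # Completed queries per second across the batch.
        "throughput_qps": num_queries / total_seconds,
    }


def meets_budget(latency_seconds, budget_seconds):
    """True if a single response arrives within the real-time budget."""
    return latency_seconds <= budget_seconds


if __name__ == "__main__":
    serial = batch_stats(num_queries=100, per_query_seconds=110.0, workers=1)
    parallel = batch_stats(num_queries=100, per_query_seconds=110.0, workers=50)
    print("serial:  ", serial)
    print("parallel:", parallel)
    # Hypothetical 100 ms budget for the doorbell case.
    print("parallel meets 100 ms budget:", meets_budget(parallel["latency_s"], 0.1))
```

In this model, going from 1 to 50 workers cuts the batch time and raises throughput, but each individual answer still takes the full 110 s, so the 100ms budget is missed either way.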
Ok. Claude will not work for this use case because none of the sample data (weirdly blurry ID images) is in the training data.
They really shouldn't, though.
It can offer a ton of user value. There is a whole industry built upon this idea: the Internet of Things.
IoT wasn't built on "send all the data off to a hosted GenAI". It predated them by quite a few years.