You can submit many queries in parallel to increase throughput. Smaller models and faster hardware can reduce the time per query too.
None of that gets you the 100ms response time the parent poster talked about, which is what real-time uses like "who is at my doorbell?" need.
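To make the throughput vs. latency distinction concrete, here is a rough sketch. `fake_query` and the 50 ms `SIMULATED_LATENCY_S` are made-up stand-ins for a real model call, not measurements of any actual model. The point is only that firing queries concurrently raises queries per second while each individual query still takes as long as before.

```python
import asyncio
import time

# Hypothetical per-query latency; a real model call would replace the sleep.
SIMULATED_LATENCY_S = 0.05


async def fake_query(prompt: str) -> float:
    """Stand-in for a model call; returns the observed latency in seconds."""
    start = time.perf_counter()
    await asyncio.sleep(SIMULATED_LATENCY_S)
    return time.perf_counter() - start


async def run_batch(n: int) -> tuple[float, float]:
    """Run n queries concurrently; return (queries per second, worst latency)."""
    start = time.perf_counter()
    latencies = await asyncio.gather(*(fake_query(f"q{i}") for i in range(n)))
    elapsed = time.perf_counter() - start
    return n / elapsed, max(latencies)


def measure(n: int) -> tuple[float, float]:
    """Synchronous wrapper so the batch can be run from plain code."""
    return asyncio.run(run_batch(n))


if __name__ == "__main__":
    for n in (1, 20):
        qps, worst = measure(n)
        print(f"{n:>2} concurrent: {qps:.0f} queries/s, worst latency {worst * 1000:.0f} ms")
```

With 20 queries in flight the queries-per-second figure goes up a lot, but the worst-case latency stays at or above the simulated per-query time, which is the number a doorbell-style use case cares about.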
Ok. Claude will not work for this use case because none of the sample data (weirdly blurry ID images) is in the training data.