Show HN: I built a <400ms latency voice agent that runs on a 4 GB VRAM GTX 1650

(github.com)

I honestly didn't expect this to get much attention, but the repo got over 330 clones in the last 24 hours.

That unexpected load actually helped me find a few bugs in the setup script (specifically with the pgvector config on Windows), which I've just patched. If anyone else hits memory issues on 4 GB cards, let me know. I'm actively optimizing the quantization now.
