Show HN: TurboQuant-WASM – Google's vector quantization in the browser

(github.com)

137 points

by teamchong 10 hours ago

6 comments

by netdur 2 hours ago

I tried TQ for vector search and my findings are not good: it is not worth it if you cannot use a GPU. However, I got the same search quality as 32f using 8-bit quantization.

I wrote an ANN extension for SQLite using TQ. I do save a lot of space, but 32f is still faster despite everything I have tried.

Code here: https://github.com/netdur/munind/tree/main/src/tq

by ninja392 51 minutes ago

So I assumed it would get crushed by OPQ (which requires training).

by teamchong 50 minutes ago

You’re right that 32f is faster on raw query time; quantization adds an extra step. The main benefit is download size, since gzip won’t help much, and that matters most in browser contexts.
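A rough sketch of that trade-off, assuming plain per-vector 8-bit scalar quantization rather than the TurboQuant algorithm itself. The dimension (384) and the helper names `quantize_u8` and `dequantize_u8` are hypothetical and only for illustration:

```python
import random


def quantize_u8(vec):
    """Map each float to a code in 0..255 using a per-vector offset and scale."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant vector
    codes = bytes(round((x - lo) / scale) for x in vec)
    return codes, lo, scale


def dequantize_u8(codes, lo, scale):
    """Rebuild approximate floats; this is the extra step paid at query time."""
    return [lo + c * scale for c in codes]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
dim = 384  # hypothetical embedding dimension
vec = [random.gauss(0, 1) for _ in range(dim)]
query = [random.gauss(0, 1) for _ in range(dim)]

codes, lo, scale = quantize_u8(vec)
approx = dequantize_u8(codes, lo, scale)

float32_bytes = dim * 4           # 4 bytes per float32 component
quantized_bytes = len(codes) + 8  # 1 byte per component plus two float32 parameters
max_error = max(abs(x - y) for x, y in zip(vec, approx))

print(f"float32: {float32_bytes} bytes, 8-bit: {quantized_bytes} bytes")
print(f"max per-component error: {max_error:.4f}")
print(f"exact dot: {dot(query, vec):.4f}, approximate dot: {dot(query, approx):.4f}")
```

Under these assumptions a 384-dimensional vector shrinks from 1536 bytes to 392, and the `dequantize_u8` call is the extra per-query work that keeps 32f ahead on raw speed.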

by glohbalrob 8 hours ago

Very cool. I added Google's new multi embedding 2 model to my site the other week.

I guess I need to dig into this and see if it’s faster and has more use cases! Thanks for publishing your work.

by hhthrowaway1230 8 hours ago

Awesome! I also love the Gaussian splat demo, cool use case!

by refulgentis 2 hours ago