On a CPU with larger K you would put the centroids in a search tree to take advantage of the sparsity, while a GPU would compute the full N×K distance matrix. So as I understand it, the bottleneck they are fixing doesn't show up on CPU.
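To make the contrast concrete, here is a rough pure-Python sketch of the two assignment strategies: the brute-force version evaluates every one of the N×K distances (what a GPU would do in one batched pass), while the tree version indexes the K centroids in a k-d tree and prunes subtrees that cannot hold the nearest one. The function names and the random 2-D data are my own assumptions for illustration, not anything from the linked work.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_brute_force(points, centroids):
    """GPU-style: evaluate all N*K distances, then take the argmin per point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def build_kdtree(items, depth=0):
    """Build a k-d tree over (centroid, index) pairs as nested tuples."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def nearest(node, p, best=None):
    """Return (squared distance, centroid index) of the nearest centroid."""
    if node is None:
        return best
    (c, idx), axis, left, right = node
    cand = (sq_dist(p, c), idx)
    if best is None or cand < best:
        best = cand
    diff = p[axis] - c[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, p, best)
    # Only descend into the far side if the splitting plane is closer
    # than the best centroid found so far.
    if diff * diff < best[0]:
        best = nearest(far, p, best)
    return best

random.seed(0)
centroids = [(random.random(), random.random()) for _ in range(64)]
points = [(random.random(), random.random()) for _ in range(500)]
tree = build_kdtree(list(zip(centroids, range(len(centroids)))))
tree_labels = [nearest(tree, p)[1] for p in points]
```

Both routes should give the same assignment; the tree just skips most of the K centroids per point when the dimension is low.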

by xavxav 9 hours ago

Search trees tend not to scale well to higher dimensions though, right?

From what I've seen, Yinyang k-means looked like the best way to take advantage of the sparsity.
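For anyone unfamiliar, the idea this family of methods builds on is the triangle inequality: if the distance between the current best centroid and another centroid is at least twice the distance from the point to the current best, the other centroid cannot be closer, so its distance never needs computing. The sketch below shows only that single test; it is not Yinyang k-means itself, which as far as I know also groups centroids and carries bounds across iterations. The function names are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_with_pruning(points, centroids):
    """Nearest-centroid assignment that skips provably farther centroids.

    Returns (labels, skipped), where skipped counts the point-to-centroid
    distances that were never evaluated.
    """
    k = len(centroids)
    # K*K centroid-to-centroid distances, computed once per iteration.
    cc = [[dist(centroids[i], centroids[j]) for j in range(k)] for i in range(k)]
    labels, skipped = [], 0
    for p in points:
        best, best_d = 0, dist(p, centroids[0])
        for j in range(1, k):
            # Triangle inequality: d(p, c_j) >= d(c_best, c_j) - d(p, c_best),
            # so c_j cannot beat c_best when d(c_best, c_j) >= 2 * d(p, c_best).
            if cc[best][j] >= 2.0 * best_d:
                skipped += 1
                continue
            d = dist(p, centroids[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped
```

Unlike a search tree, this test only needs distances, so it does not degrade with dimension in the same way, though how much it skips depends on how well separated the centroids are.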

by snovv_crash 4 hours ago

Most of the data I've worked with is geospatial with D<=4 (xyzt), so search trees worked great for me. But for things like descriptor or embedding clustering, yes, trees wouldn't be useful.

by openclaw018 hours ago|
