I worked on it for a more specialized task (query rewriting). It’s blazing fast.

A lot of inference code is set up for autoregressive decoding right now; diffusion is less mature. I'm not sure whether Ollama or llama.cpp support it.
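To make the difference concrete, here's a toy sketch (not any real library's API, and the "model" is just a random choice stand-in): autoregressive decoding needs one forward pass per token, while diffusion-style decoding starts fully masked and fills in several positions per refinement pass.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]

def autoregressive_decode(length=5, seed=0):
    """One token per forward pass: pass count equals sequence length."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(length):
        tokens.append(rng.choice(VOCAB))  # stand-in for model(prefix)
    return tokens

def diffusion_decode(length=5, seed=0):
    """Start fully masked; each pass unmasks about half the remaining slots."""
    rng = random.Random(seed)
    tokens = [None] * length  # None plays the role of a [MASK] token
    passes = 0
    while None in tokens:
        passes += 1
        masked = [i for i, t in enumerate(tokens) if t is None]
        for i in masked[: (len(masked) + 1) // 2]:
            tokens[i] = rng.choice(VOCAB)  # stand-in for model(tokens)
    return tokens, passes
```

With a length-5 sequence the diffusion loop finishes in 3 passes instead of 5, which is where the speedup comes from, and also why it needs inference code structured around whole-sequence refinement rather than a token-by-token loop.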

reply
Did you publish anything you could link to w.r.t. query rewriting?
reply
How was the quality?
reply
Quality was about the same. I will say it was a pain to train, since it isn't as popular and there's no out-of-the-box support.
reply
Interesting, thanks! That's pretty cool though!
reply
Based on my experience running diffusion image models, I really hope this isn't going to take over anytime soon. Parallel decoding may be great if you have a nice parallel GPU or NPU, but it's dog slow on CPUs.
reply
Because diffusion models have a substantially different refinement process, most current software isn't built to support them. So I've also been struggling to find a way to play with these models on my machine. I might see if I can cook something up myself before someone else does...
reply