by AnotherGoodName 3 hours ago

by wds 3 hours ago

I imagine it'd work the same as merging all the good-tasting foods to get an even tastier one

by nylonstrung 1 hour ago

If you go to Civitai, this is pretty much how it works in that corner of the image generation world.

Everything uses Stable Diffusion as the underlying model, and most of what people actually use is merges of checkpoints.

by avereveard 2 hours ago

Most merges improve a small subset of "feeling" benchmarks (too small, too specific, or out of distribution) and tend to show degradation on actual benchmarks, with especially punishing results on long-chain benchmarks.

They also only work on matching architectures (i.e. finetunes/LoRAs of the same model).
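
To illustrate why matching architectures matter, here is a minimal sketch of the simplest kind of merge, a weighted average of parameters. It is not how any particular merge tool works; `merge_checkpoints` and the dict-of-lists checkpoint layout are hypothetical stand-ins for real state dicts of tensors.

```python
def merge_checkpoints(checkpoints, weights=None):
    """Weighted average of checkpoints, each a dict: parameter name -> flat list of floats.

    Hypothetical helper for illustration only; real checkpoints hold tensors.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if weights is None:
        # Default to a plain (uniform) average.
        weights = [1.0 / len(checkpoints)] * len(checkpoints)
    if len(weights) != len(checkpoints):
        raise ValueError("need exactly one weight per checkpoint")

    reference = checkpoints[0]
    for ckpt in checkpoints[1:]:
        # Averaging is element-wise, so every checkpoint must expose the
        # same parameter names with the same sizes (same architecture).
        if ckpt.keys() != reference.keys():
            raise ValueError("parameter names differ: not the same architecture")
        for name in reference:
            if len(ckpt[name]) != len(reference[name]):
                raise ValueError(f"size mismatch for parameter {name!r}")

    merged = {}
    for name in reference:
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(weights, checkpoints))
            for i in range(len(reference[name]))
        ]
    return merged


finetune_a = {"layer.weight": [1.0, 3.0]}
finetune_b = {"layer.weight": [3.0, 5.0]}
merged = merge_checkpoints([finetune_a, finetune_b])
```

Two finetunes of the same base pass the name and size checks; a checkpoint from a different model fails them, which is the practical reason merges stay within one model family.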

by dindunuf 2 hours ago

That kinda worked in the Llama 1/2 era, not between different models but between finetunes of the same model. The briefly legendary Mythomax was, IIRC, a merge of 5+ tunes, some of which were merges themselves.

by _3u10 3 hours ago

No, they need the same arch, but you can distill them into a single model. And yes, if you use the API directly Claude will often say it’s an open weight model (likely the ones it was distilled from)
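
For contrast with merging, here is a rough sketch of the classic logit-matching form of distillation, where a student is trained to match a teacher's softened output distribution. The function names and the temperature value are assumptions for illustration. Distilling from API text output would instead mean training on generated samples, which this sketch does not cover.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Only the outputs have to line up (same number of classes or tokens);
    the two models can have completely different internals.
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must share the output space")
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 0.5, -1.0]
loss_same = distillation_loss(teacher, teacher)           # student matches the teacher
loss_diff = distillation_loss(teacher, [0.0, 0.0, 0.0])   # untrained, uniform student
```

The loss only compares output distributions, never weights, which is why distillation works across architectures where merging does not.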