I wonder if it really needs to be worse. I've been playing with the idea of fine-tuning a model on my exact stack and coding patterns. I suspect I could get better performance by training "taste" into a model rather than breadth.
reply
I also wonder about JS-only, Python-only, etc. models.

Maybe the future is a selection of local, specific stack trained models?

reply
These models' ability to generalise at coding will likely get worse if you remove high-quality training data, such as all of Python.
reply
Fine-tuning these models (at least with PPO or an equivalent RL method) requires even more VRAM than inference does, potentially 2-3 times more.
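A rough back-of-envelope for where the 2-3x comes from: PPO-style training keeps several model copies resident at once (the policy, a frozen reference model for the KL penalty, and a value/critic model), before even counting gradients, optimizer state, and activations. A minimal sketch, assuming fp16 weights and a hypothetical 7B-parameter model:

```python
# Rough back-of-envelope, weights only (assumption: fp16, 7B params;
# gradients, optimizer state, and activations would add considerably more).
params = 7e9
bytes_per_param = 2  # fp16

# Inference: one model copy resident.
inference_bytes = params * bytes_per_param

# PPO-style RLHF: roughly three copies resident:
#   policy + frozen reference model (KL penalty) + value/critic model
ppo_bytes = 3 * params * bytes_per_param

print(f"inference: {inference_bytes / 1e9:.0f} GB, PPO: {ppo_bytes / 1e9:.0f} GB")
```

So even this weights-only estimate lands at about 3x the inference footprint.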
reply
You could use PEFT? Training only a small subset of weights (e.g. LoRA adapters) is fairly standard practice nowadays …
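To make the parameter savings concrete, here's a minimal LoRA-style sketch in plain NumPy (assumption: this illustrates the idea, it is not the Hugging Face `peft` API). A frozen weight `W` gets a low-rank update `B @ A`, so only `r * (d_in + d_out)` parameters need gradients and optimizer state instead of `d_in * d_out`:

```python
import numpy as np

# Minimal LoRA sketch: W stays frozen, only A and B are trained.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def forward(x):
    # Base path plus scaled low-rank adapter path.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialised, the adapter starts as a no-op:
assert np.allclose(forward(x), W @ x)

frozen = W.size              # 64 * 64 = 4096
trainable = A.size + B.size  # 4 * 64 + 64 * 4 = 512
print(f"trainable: {trainable} vs frozen: {frozen}")  # trainable: 512 vs frozen: 4096
```

At rank 4 that's an 8x reduction in trained parameters here, and the ratio gets far more dramatic at real model widths.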
reply