Hacker News
by datadrivenangel, 12 hours ago
by digitaltrees, 11 hours ago
I wonder if it really needs to be worse. I am playing with the idea of fine-tuning a model on my exact stack and coding patterns. I suspect I could get better performance by training "taste" into a model rather than breadth.
by epicureanideal, 7 hours ago
I also wonder about JS-only, Python-only, etc. models.
Maybe the future is a selection of local models, each trained on a specific stack?
by andy_ppp, 7 hours ago
These models' ability to generalise at coding will likely get worse if you remove high-quality training data like all of Python.
by andy_ppp, 7 hours ago
Fine-tuning these models (at least with PPO or an equivalent RL method) requires even more VRAM than inference does, potentially 2-3 times more.
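For a rough sense of why: full fine-tuning with an Adam-style optimizer keeps gradients and optimizer state in memory alongside the weights, and PPO additionally holds frozen reference/value model copies. A toy back-of-the-envelope sketch (the byte counts are common rules of thumb for mixed-precision training, not exact figures, and the 7B parameter count is just an example):

```python
def inference_gib(n_params, bytes_per_weight=2):
    """Rough VRAM for fp16 inference: just the weights
    (ignores KV cache and activations)."""
    return n_params * bytes_per_weight / 2**30

def training_gib(n_params, bytes_per_weight=2):
    """Rough VRAM for mixed-precision Adam training:
    fp16 weights (2) + fp16 grads (2) + fp32 master copy (4)
    + fp32 Adam first/second moments (4 + 4) = 16 bytes/param.
    PPO would add the frozen reference model on top of this."""
    return n_params * (bytes_per_weight + 2 + 4 + 4 + 4) / 2**30

n = 7_000_000_000  # hypothetical 7B-parameter model
print(f"inference ~{inference_gib(n):.0f} GiB, "
      f"full training ~{training_gib(n):.0f} GiB")
```

By this crude accounting full training is closer to 8x inference, which is exactly why people reach for parameter-efficient methods.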
by rusk, 1 hour ago
You could use PEFT (parameter-efficient fine-tuning)? Updating only a small subset of weights is fairly standard practice nowadays …
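To illustrate the saving, here's a toy parameter count for a single rank-8 LoRA adapter (the kind of technique libraries like Hugging Face's peft implement) on a hypothetical 4096x4096 projection; the dimensions are made up for illustration:

```python
def lora_trainable_params(d_in, d_out, rank):
    """A rank-r LoRA adapter freezes the original d_in x d_out weight
    and trains only two small matrices: A (d_in x r) and B (r x d_out)."""
    return d_in * rank + rank * d_out

d = 4096                    # hypothetical hidden size
full = d * d                # trainable params if fine-tuning the full matrix
lora = lora_trainable_params(d, d, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ({100 * lora / full:.2f}% of full)")
```

Under half a percent of the original weights are trainable, so the gradient and optimizer-state overhead that dominates full fine-tuning memory mostly disappears.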