Locally I haven’t gone much further than 8k of context. That is sufficient for small changes on small code bases, but you need condensed tool output.
I haven’t tried any tool that compresses the tokens yet.
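For what "condensed tool output" could look like with a small context window, here is a minimal sketch. It is not any particular tool's method: the ~4 characters per token ratio is a crude assumption rather than a real tokenizer, and `estimate_tokens` and `condense_output` are hypothetical names. It keeps the top and bottom of the output and elides the middle.

```python
CHARS_PER_TOKEN = 4  # assumption: crude heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count from the character count."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def condense_output(text: str, max_tokens: int = 500) -> str:
    """Keep head and tail lines of tool output, eliding the middle if over budget."""
    if estimate_tokens(text) <= max_tokens:
        return text

    lines = text.splitlines()
    budget = max_tokens * CHARS_PER_TOKEN  # character budget for kept lines
    head, tail = [], []
    lo, hi = 0, len(lines) - 1
    used = 0
    take_head = True

    # Alternate between taking a line from the top and one from the bottom.
    while lo <= hi:
        line = lines[lo] if take_head else lines[hi]
        cost = len(line) + 1  # +1 for the newline
        if used + cost > budget:
            break
        if take_head:
            head.append(line)
            lo += 1
        else:
            tail.append(line)
            hi -= 1
        used += cost
        take_head = not take_head

    omitted = hi - lo + 1
    kept_tail = tail[::-1]  # restore original order of the bottom lines
    if omitted <= 0:
        return "\n".join(head + kept_tail)
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(head + [marker] + kept_tail)
```

The head and tail are kept because build and test output tends to put the command context first and the error summary last; that is a design guess, and a single very long line would be elided entirely.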
1. The hardware will eventually catch up.
2. This keeps the delta between frontier models smaller.
3. We can still fine-tune and own the weights.
4. The models will be more useful, faster, and more reliable.
RTX is hobbyist tier, not professional tier.
Gated cloud models from hyperscalers treat us like hobbyists in their own right.
We need equivalent scale models, but open.
I have absolutely zero interest in free. I honestly don't think I'm even remotely in the same demographic as people using free tiers / models.
I want to pay. I don't want my data used for training. I want it to be open. I want it to be consistently up (more than Claude!). I want it to be fast. I don't want it to be subsidized, as that's just an excuse for shitty quality. DeepSeek flash knocks it out of the park on all of these, except that your data is used in training. I'm fine with it being hosted since there's no way I'm using it 24/7, but data MUST be private.
Basically I want Hetzner and OVH to run open model clouds. I'm convinced this is going to happen eventually when everyone realizes this is a commodity.
For me, paying $200 to $500 / month is reasonable if I can sustain a disruption-free flow that doesn't require constant yak shaving. What I've found experimenting with DeepSeek on some open source library stuff is that it's actually going to cost me much less if I don't need frontier vibing (which I don't).
There'll probably need to be a threat of massive litigation should they fail to comply with such a policy.
I'm interested in this thought. Providers have significant motivation to create a verifiable way of having no access to client interactions with LLMs at all, whatever standards and protocols have to be devised to reassure clients.
Any good standards for privacy when interacting with LLMs could also trickle down to smaller providers, and everyone could offer guarantees, even if the guarantee were literally just an insurance policy and a private court to decide if it pays out.
I wonder if there are competent models trained purely on permissive open-source code like MIT or Apache 2.0.