upvote
I will say many closed source repos are probably equally as poor as open source ones.

Even worse in many cases because they are so over engineered nobody understands how they work.

reply
yeah, but isn't the whole point of claude code to get people to provide preference data/telemetry data to anthropic (unless you opt out?). same w/ other providers.

i'm guessing most of the gains we've seen recently are post training rather than pretraining.

reply
Yes, but you have the problem that a good portion of that is going to be AI generated.

But, I naively assume most orgs would opt out. I know some orgs have a proxy in place that will prevent certain proprietary code from passing through!

This makes me curious if, in the allow case, Anthropic is recording generated output, to maybe down-weight it if it's seen in the training data (or something similar)?

reply
I'd bet, on average, the quality of proprietary code is worse than open-source code. There have been decades of accumulated slop generated by human agents with wildly varied skill levels, all vibe-coded by ruthless, incompetent corporate bosses.
reply
Not to mention, a team member is (surprise!) fired or let go, and no knowledge transfer exists. Womp, womp. Codebase just gets worse as the organization or team flails.

Seen this way too often.

reply
There's only very niche fields where closed-source code quality is often better than open-source code.

Exploits and HFT are the two examples I can think of. Both are usually closed source because of the financial incentives.

reply
Let's start with the source code for the Flash IDE :)
reply