Absolutely correct. Anthropic showed that 250 examples can "poison" an LLM -- independent of LLM activation count.
I have to steer models hard for C++. They constantly suggest std::variant :P
Godbolt got a 2x speed improvement switching from what he thought was a good fast impl to std:variant
Dimensionality gets bizarre in 1000-D space. Similarity and orthogonality express themselves in strange ways and each dimension codes different semantic meaning.
Therefore, if the training data is highly consistent you are by definition reducing some complexity and/or encoding better similarity.
In Go the statement
result, err := Storage.write(...)
Is almost always going to be followed by if err != nil { ... }
In a highly dynamic language you may not get try { Storage.write() } catch (error) { ... }
Unless explicitly asked for.https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/blob/ma...
Being dynamic is secondary. A language that uses exceptions for errors does not always need to surround every try with a catch if the code doesn't need to. You have a top level handler that would catch everything.
...for which ample training data is available.
> This makes sense, given that they are derived from text translation systems.
...for languages with ample training data available.
Yes, LLMs can combine information in novel ways. They are wonderful in many respects. But they make far more mistakes if they can't lean on copious amounts of training data. Invent a toy language, write a spec, and ask them to use it. They will, but they will have a hard time.
The only code that exists on the internet for this is test data and a few docs in the github repo. It’s not wildly different from most scripting languages, from a syntax point of view, but it is definitely niche.
Both Codex and Claude figured it out real fast from an example script I was debugging. I was amazed at how well they picked up the minor differences between my script and others. This is basically on next to zero training data.
Would I ask it to produce anything super complex? Definitely not. But I’ve been impressed with how well it handles novel languages for small tasks.
Sure. But given the relation with translation systems, it seems far more likely that there are diminishing returns to larger volumes of training data.
An experienced Rust developer is going to be in a better position to drive an agent to generate useful Rust code than a Python programmer with little or no Rust experience. Not sure I agree with the author that everyone should just generate reams of Rust now.
At least if your get paged at 3am to fix the 300k AI-generated Django blog you’ll have a chance at figuring things out. Good luck to you if Claude is down at the same time. But still better than if it was in Rust if you have no experience with that language.