https://arxiv.org/abs/2508.09101

In this benchmark, models correctly solve Rust problems 61% of the time on the first pass, a far cry from other languages such as C# (88%) or Elixir (a “buggy dynamic language”), where they perform best (97%).

I wonder why that is; it’s quite surprising. Obviously the details of their benchmark design matter, but this study doesn’t support your claims.

This is great, but August 2025 is almost a lifetime ago given how fast these models are improving. Opus 4.5 came out in November 2025, fwiw.
The downside is that even simple Rust projects typically use hundreds of dependencies, and this is even worse with LLMs, which don’t understand the concept of “less is more”.
Nobody forces dependencies on you. You can control that.