The problem with Python and other non-strictly typed languages is that if you let an LLM write some stuff, you cannot truly be confident that nothing has broken, even if your tests all pass. The LLM could have broken some path that only gets run in production in a very specific case. At least with statically typed languages you get a compiler error. In big codebases that is non-negotiable.
reply
Python has had type hinting for quite a while, and adding validation with mypy/pyright/ty as a step in CLAUDE.md (as well as having it as part of your CI pipeline) can emulate static type checking pretty well.
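To make that concrete, here is a minimal sketch with made-up names (`Job`, `backoff_seconds`, and `handle` are hypothetical, not from any real project). The failure branch is the kind of path a test suite can easily miss. With the annotations in place, a checker such as mypy or pyright should flag a wrong argument there without the branch ever running.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    attempts: int = 0


def backoff_seconds(attempts: int) -> float:
    """Exponential backoff in seconds, capped at 60."""
    return min(60.0, 2.0 ** attempts)


def handle(job: Job, failed: bool) -> float:
    """Return how long to wait before running the job again."""
    if not failed:
        return 0.0  # the happy path that most tests exercise
    job.attempts += 1
    # If an edit passed `job.name` here instead, a type checker would
    # report a str-vs-int mismatch. At runtime the same mistake only
    # raises a TypeError when this rarely hit branch actually runs.
    return backoff_seconds(job.attempts)
```

Running the checker in CI means the rarely hit branch is checked on every commit, whether or not a test reaches it.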
reply
Agree.

I am using type hints in Python as much as possible for my hand-coding. And it catches a lot of bugs (especially during code refactoring) that I would not have noticed so easily.

reply
> And it catches a lot of bugs (especially during code refactoring) that I would not have noticed so easily.

Can you give me an example of a recent experience with this? I've been working without type annotations for many, many years, and I keep finding that every time I find a bug I just don't feel like type annotations would have helped catch it, at least not to an extent that justifies the effort to put them in in the first place.

reply
Dynamically typed languages just add one more type of bug that can’t be caught at compile time. That’s not helpful, sure, but it’s one type of bug among many.

The issue you mention, execution paths not hit by test cases, is made worse by having more complicated code. Duck-typing can help reduce the number of paths.
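One way to read the duck-typing point, as a rough sketch: instead of one branch per concrete input type, a single code path works for anything that has a `.read()` method. `Readable`, `count_words`, and `FakeSocket` are names invented for this example.

```python
import io
from typing import Protocol


class Readable(Protocol):
    """Anything with a .read() method returning text."""

    def read(self) -> str: ...


def count_words(source: Readable) -> int:
    """One code path for every text source, no isinstance branches."""
    return len(source.read().split())


class FakeSocket:
    """Hypothetical stand-in for some other text source."""

    def read(self) -> str:
        return "three short words"


# Both objects go through the same single path.
file_like_count = count_words(io.StringIO("hello duck typing world"))
socket_like_count = count_words(FakeSocket())
```

The `Protocol` is optional at runtime; it is only there so a type checker can verify the duck-typed call as well.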

Static vs dynamic… I don’t see an obvious winner here.

reply
My take is that I can never be confident that anything an LLM produces will not be broken. Since I will have to check everything it produces anyway, why not write it in a human-friendly language, i.e. Python? C and Rust may have better strictness, but the amount of boilerplate code to set up that system takes up a lot of mental space that could be better used to architect the problem at hand.
reply
Perhaps we could do it in Python in the first pass for validation purposes, and then vibe rewrite it in Haskell.
reply
so it just boils down to strictness even when we're talking LLMs?

I agree with you about fast failure being a nice feature, but I also think that if you're TDDing a bunch of stuff and it fails in some categorical way, well, then the test suite was lazy.

reply
> so it just boils down to strictness even when we're talking LLMs?

The article describes what I've been doing for the past few months - I did small python projects in the past because of the ecosystem: I couldn't possibly write a ton of the stuff required for the things I wanted to do, so I leaned into python because someone already wrote it for me. Quality of deps was mostly ok for the happy paths, but always a chore to patch the broken ones.

Nowadays I tell Claude what I want to build and I always ask it whether rust is a good choice for it. It'll pick up the right crates or choose whether it should DIY, do all the plumbing, nail all the logic, and in ~30m I'll have something very solid that would have taken me 3+ weeks of part-time evening coding in python. I think the article is right and rust is the closest to the "best language" we have for LLM coding at the moment: the strict typing and the tooling dramatically reduce the output space for LLMs, and 99% of errors have a clear, precise explanation that is actionable, and the compiler helps you a lot there too.

I think it also boils down to the fact that you cannot reliably and quickly answer "why is this arg None?" in languages like Python without figuring out the call graph and evaluating possible states and inputs/outputs. Rust makes all that explicit and forces you to handle it, which I feel dramatically cuts the time an LLM needs to spend figuring out why it's broken or what to do next. EDIT: The fact that you get memory safety on top of all this and it's handled by the compiler is yet another advantage for LLMs: the logic that gets written is simpler to reason about, because if you try to mutably access the same variable in two different places, the compiler will feed this back to the LLM at build time. In other languages that would be a "code smell" or would require static analysis.
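For comparison, a rough sketch of what "making None explicit" looks like in typed Python. The names (`find_port`, `next_port`, `DEFAULT_PORT`) are hypothetical. Unlike Rust's `Option`, nothing here is enforced unless a checker such as mypy or pyright is actually run.

```python
from typing import Optional

DEFAULT_PORT = 8080  # assumed default, purely for illustration


def find_port(config: dict[str, str], key: str) -> Optional[int]:
    """Return the port stored under `key`, or None if it is missing."""
    raw = config.get(key)
    return int(raw) if raw is not None else None


def next_port(config: dict[str, str]) -> int:
    """Return the port after the configured one, or the default."""
    port = find_port(config, "port")
    if port is None:
        # Removing this branch should make a type checker reject the
        # `port + 1` below, because `port` may still be None there.
        return DEFAULT_PORT
    return port + 1
```

The `Optional[int]` return type answers "can this be None?" at the signature, so neither a human nor an LLM has to trace the call graph to find out.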

Strictness is a quality for software and a chore for humans, and of course the stricter you are at representing your logic and your state machine, the fewer ways a program can break. LLMs writing in Rust give you the strictness without the chore part, and it's a very good deal from my point of view.

reply
If you are using TDD with any recent model and even local models (qwen3.5+), you alleviate most of the issues mentioned.

Note that:

Writing code, then tests

Is not equivalent to:

Writing tests, then code
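A tiny sketch of the second ordering, using the standard `unittest` module and a made-up `slugify` function. The tests are written first and pin down the edge cases; the implementation comes afterwards and only has to satisfy them.

```python
import unittest


# Step 1: the tests exist before any implementation does.
class SlugifyTest(unittest.TestCase):
    def test_basic(self) -> None:
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self) -> None:
        self.assertEqual(slugify("  a   b  "), "a-b")

    def test_empty(self) -> None:
        self.assertEqual(slugify(""), "")


# Step 2: only then is the code written to make them pass.
def slugify(title: str) -> str:
    """Lowercase the title and join its words with hyphens."""
    return "-".join(title.lower().split())
```

Saved as a module, this can be run with `python -m unittest`. With the other ordering, the tests tend to be derived from whatever the code already does, bugs included.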

reply
This is why you should use Haskell.
reply
Haskell is a good language for LLMs! Claude knows it really well, and the type system catches so many mistakes. Just make sure to tell it to model the domain in the type from the start.
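In Haskell "modeling the domain in the type" would typically be a sum type. Here is a loose analogue sketched in Python, since that is the language the thread keeps comparing against. The order states and field names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Pending:
    """No payment yet, so there is no receipt to carry."""


@dataclass(frozen=True)
class Paid:
    receipt_id: str


@dataclass(frozen=True)
class Refunded:
    receipt_id: str
    reason: str


# An order is exactly one of these states. A refund without a receipt
# cannot be constructed, because Refunded requires both fields.
Order = Union[Pending, Paid, Refunded]


def describe(order: Order) -> str:
    """Render a short human-readable status for an order."""
    if isinstance(order, Pending):
        return "awaiting payment"
    if isinstance(order, Paid):
        return f"paid ({order.receipt_id})"
    return f"refunded ({order.receipt_id}): {order.reason}"
```

The point is the same in either language: states that should not exist cannot be built, so neither the LLM nor the reviewer has to remember to check for them.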

Also, Haskell can be really performant and low level, while still keeping the benefits of typing. With the C foreign function interface you can really do anything in Haskell!

reply
My anecdotal (sample size 1) experience is not consistent with this. I code fast. Refactor fast. My stuff doesn't break. But my methodology isn't the same as others'.
reply
i have bad news
reply
Lay it on. I love to collect others' anecdotes and see where they align (or disagree).
reply
I've found the opposite.

If you want your code to actually work, LLMs are far worse at coding in Python than in something like Rust.

Sure, if you just want your code to pass the one test they wrote and work in the one case they coded for, Python is fine.

A lot of people think this is fine, until they actually do something with what they've built besides just... build it.

reply
Have you tried writing Rust? I often hear this from people who haven’t tried it. I’ve found absolutely no issues compared to Python, and everything works 100x better.
reply
I figure a big part of it is that SWE-Bench is the target benchmark for programming and it's all python.
reply
Python being the language LLMs are best at predates SWE-Bench by years.
reply