by __mharrison__ 14 hours ago

by dakiol 14 hours ago

The problem with Python and other dynamically typed languages is that if you let an LLM write some code, you cannot truly be confident that nothing has broken, even if all your tests pass. The LLM could have broken some path that only runs in production in a very specific case. At least with statically typed languages you get a compiler error. In big codebases that is non-negotiable.

by mjr00 11 hours ago

Python has had type hinting for quite a while, and adding validation with mypy/pyright/ty as a step in CLAUDE.md (as well as having it as part of your CI pipeline) can emulate static type checking pretty well.
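As a rough illustration of what such a check can catch, here is a minimal sketch. The `User` class and `email_domain` function are hypothetical names invented for this example. Assume a refactor changed `email` from `str` to `Optional[str]`: without the `None` guard below, a checker such as mypy or pyright would typically flag the `.partition` call before the code ever runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    # Hypothetical refactor: this field used to be a plain `str`.
    email: Optional[str]


def email_domain(user: User) -> str:
    """Return the part of the email after '@', or '' if there is none."""
    # Removing this guard makes a static checker complain that
    # `user.email` may be None at the `.partition` call below.
    if user.email is None:
        return ""
    return user.email.partition("@")[2]
```

The point is only that the annotation turns a production-only `AttributeError` into a finding at check time; it does not replace tests for the logic itself.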

by hasley 4 hours ago

Agree.

I am using type hints in Python as much as possible for my hand-coding. And it catches a lot of bugs (especially during code refactoring) that I would not have noticed so easily.

by zahlman 2 hours ago

> And it catches a lot of bugs (especially during code refactoring) that I would not have noticed so easily.

Can you give me an example of a recent experience with this? I've been working without type annotations for many, many years, and I keep finding that every time I find a bug I just don't feel like type annotations would have helped catch it, at least not to an extent that justifies the effort to put them in in the first place.

by bee_rider 11 hours ago

Dynamically typed languages just add one more type of bug that can’t be caught at compile time. That’s not helpful, sure, but it’s one type of bug among many.

The issue you mention, execution paths not hit by test cases, is made worse by having more complicated code. Duck-typing can help reduce the number of paths.

Static vs dynamic… I don’t see an obvious winner here.

by fyredge 11 hours ago

My take is that I can never be confident that anything an LLM produces will not be broken. Since I will have to check everything it produces anyway, why not write it in a human-friendly language, i.e. Python? C and Rust may have better strictness, but the amount of boilerplate code needed to set up that system takes up a lot of mental space that could be better used to architect the problem at hand.

by ttflee 9 hours ago

Perhaps we could do it in Python in the first pass for validation purposes, and then vibe-rewrite it in Haskell.

by serf 14 hours ago

so it just boils down to strictness even when we're talking LLMs?

I agree with you about fast failure being a nice feature, but I also think that if you're TDDing a bunch of stuff and it fails in some categorical way, well, then the test suite was lazy.

by plqbfbv 13 hours ago

> so it just boils down to strictness even when we're talking LLMs?

The article describes what I've been doing for the past few months. I did small Python projects in the past because of the ecosystem: I couldn't possibly write a ton of the stuff required for the things I wanted to do, so I leaned into Python because someone had already written it for me. Quality of deps was mostly OK for the happy paths, but it was always a chore to patch the broken ones.

Nowadays I tell Claude what I want to build and I always ask it whether Rust is a good choice for it. It'll pick the right crates or decide whether it should DIY, do all the plumbing, nail all the logic, and in ~30 minutes I'll have something very solid that would have taken me 3+ weeks of part-time evening coding in Python. I think the article is right and Rust is the closest to the "best language" we have for LLM coding at the moment: the strict typing and the tooling dramatically reduce the output space for LLMs, 99% of errors have a clear, precise, actionable explanation, and the compiler helps you a lot there too.

I think it also boils down to the fact that you cannot reliably and quickly answer "why is this arg None?" in languages like Python without figuring out the call graph and evaluating possible states and inputs/outputs. Rust makes all that explicit and forces you to handle it, which I feel dramatically cuts the time an LLM needs to spend figuring out why it's broken or what to do next. EDIT: The fact that you get memory safety on top of all this, and that it's handled by the compiler, is yet another advantage for LLMs: the logic that gets written is simpler to reason about, because if you try to mutably access the same variable in two different places, the compiler will feed this back to the LLM at build time. In other languages that would be a "code smell" or would require static analysis.
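For comparison, the "why is this arg None?" question can be made partly explicit in Python too, if the code is annotated and run through a type checker. This is only a sketch under that assumption: `find_port` and `connect_url` are hypothetical helpers, and nothing here is enforced at runtime the way Rust's `Option` is enforced by the compiler.

```python
from typing import Mapping, Optional


def find_port(config: Mapping[str, str], key: str) -> Optional[int]:
    """Return the configured port, or None when the key is absent."""
    raw = config.get(key)
    return int(raw) if raw is not None else None


def connect_url(host: str, port: Optional[int]) -> str:
    """Build a URL, handling the missing-port case explicitly."""
    # The Optional in the signature documents that None can arrive here,
    # so the reader (or an LLM) does not have to trace the call graph.
    if port is None:
        return f"https://{host}"
    return f"https://{host}:{port}"
```

The difference from Rust is that skipping the `None` branch here still produces a runnable program; only an external checker would object.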

Strictness is a quality for software and a chore for humans, and of course the stricter you are at representing your logic and your state machine, the fewer ways a program can break. LLMs writing in Rust give you the strictness without the chore part, and it's a very good deal from my point of view.

by __mharrison__ 14 hours ago

If you are using TDD with any recent model and even local models (qwen3.5+), you alleviate most of the issues mentioned.

Note that:

Writing code, then tests

Is not equivalent to:

Writing tests, then code
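A minimal sketch of the tests-first order, assuming a hypothetical `slugify` function: the tests are written first and pin down the behavior, and the implementation is then written with just enough logic to make them pass.

```python
# Step 1: the tests, written before any implementation exists.
def test_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


def test_collapses_extra_whitespace():
    assert slugify("  Hello   World ") == "hello-world"


def test_empty_title_gives_empty_slug():
    assert slugify("") == ""


# Step 2: the implementation, written afterwards to satisfy the tests.
def slugify(title: str) -> str:
    """Lowercase the title and join its whitespace-separated words with '-'."""
    return "-".join(title.lower().split())
```

Written in the other order, the tests tend to describe whatever the code already does, including its bugs.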

by faangguyindia 10 hours ago

This is why you should use Haskell.

by black_knight 9 hours ago

Haskell is a good language for LLMs! Claude knows it really well, and the type system catches so many mistakes. Just make sure to tell it to model the domain in the types from the start.

Also, Haskell can be really performant and low level, while still keeping the benefits of typing. With the C foreign function interface you can really do anything in Haskell!

by __mharrison__ 14 hours ago

My anecdotal (sample size 1) experience is not consistent with this. I code fast. Refactor fast. My stuff doesn't break. But my methodology isn't the same as others'.

by QuadmasterXLII 14 hours ago

i have bad news

by __mharrison__ 14 hours ago

Lay it on. I love to collect others' anecdotes and see where they align (or disagree).

by onlyrealcuzzo 9 hours ago

I've found the opposite.

If you want your code to actually work, LLMs are far worse at coding in Python than in something like Rust.

Sure, if you just want your code to pass the one test they wrote and work in the one case they coded for, Python is fine.

A lot of people think this is fine, until they actually do something with what they've built besides just... build it.

by mountainriver 9 hours ago

Have you tried writing Rust? I often hear this from people who haven’t tried it. I’ve found absolutely no downsides compared to Python, and everything works 100x better.

by hamdingers 11 hours ago

I figure a big part of it is that SWE-Bench is the target benchmark for programming and it's all Python.

by solidasparagus 11 hours ago

Python being the language LLMs are best at predates SWE-Bench by years.
