Because the language gives you many different tools, an LLM-generated codebase can get inconsistent and overly complicated quickly. The flexibility of Python is a downside when you’re having an LLM generate the code. If you’re working in an existing codebase, it’s great: those choices were already made and it can match your style.
Things devolve into a jumbled mess when an LLM has to derive its own style.
What language do you feel is easier to reason about in the large?
I don't think I have ever seen Haskell software made with LLMs, but then, aside from university, I have not seen Haskell code at all. (Also, I would associate Haskell purists with people who avoid LLMs.)
I would rather go with Rust given these choices.
But I have had good results with TypeScript (or JavaScript for simpler things). There's a really large set of examples, the tools are optimized for it, and agents debugging in the browser works almost out of the box. And, well, an elaborate type system.
Compared to most languages, including Java, C# will have a hard time letting you compile incoherent code.
You barely need any dependencies other than aspnetcore and efcore for most applications and your AI knows them well.
It’s easy to do TDD with it, so it’s easy to keep your AI from hallucinating.
> There are not that much different ways to get somewhere
This is far from true. C# is a language where you can operate on raw pointers through the unsafe keyword. On the other end of the spectrum, you can have duck typing in dynamic blocks.
For operating on collections you can use old-style loops, a chain of lambdas, or SQL-like syntax.
I have been coding in C# the old-school way for most of my life at this point, and I feel like I'm in a foreign land reading code from some other C# projects.
You'd have to steer the LLM to use the style you want and not massively overarchitect things, but that's going to be an issue regardless.
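The same spread of styles exists in Python, the language being discussed upthread. A minimal sketch, assuming a made-up orders list and hypothetical function names: the same filter-and-transform written as an explicit loop, as a chain of lambdas, and as a comprehension (the closest thing Python has to query syntax).

```python
# Hypothetical sample data, invented for illustration.
orders = [
    {"customer": "ada", "total": 120.0},
    {"customer": "bob", "total": 35.5},
    {"customer": "cy", "total": 240.0},
]


def big_spenders_loop(rows, threshold):
    """Old-style loop: build the result list step by step."""
    result = []
    for row in rows:
        if row["total"] > threshold:
            result.append(row["customer"].upper())
    return result


def big_spenders_lambdas(rows, threshold):
    """Chain of lambdas: filter first, then map."""
    kept = filter(lambda r: r["total"] > threshold, rows)
    return list(map(lambda r: r["customer"].upper(), kept))


def big_spenders_comprehension(rows, threshold):
    """Comprehension: reads a bit like a select/where query."""
    return [r["customer"].upper() for r in rows if r["total"] > threshold]
```

All three return the same list for the same input, which is exactly why a codebase with no agreed style can end up containing all of them.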
Do you have any recommendations for systems where reasoning about large systems is easier than in Python?
As a rule, I avoid implementation inheritance. Occasionally I need to facade a library that assumes implementation inheritance to avoid it spreading into my codebase.
When the codebase hits a certain size, I hand-roll some decorators to create functionality like Java interfaces. With that done, and a suite of acceptance tests, I find it scales up well.
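For anyone wondering what such a decorator could look like, here is a rough sketch, not the commenter's actual code. The names implements, Storage and MemoryStorage are all hypothetical. The decorator checks at class-definition time that every public method of the interface exists with the same parameter names, and it does so without any inheritance.

```python
import inspect


def implements(*interfaces):
    """Class decorator: raise TypeError at definition time if the class
    lacks a public method declared on one of the given interfaces."""
    def check(cls):
        for interface in interfaces:
            for name, member in vars(interface).items():
                # Only plain public methods count as part of the contract.
                if name.startswith("_") or not inspect.isfunction(member):
                    continue
                impl = getattr(cls, name, None)
                if not callable(impl):
                    raise TypeError(
                        f"{cls.__name__} does not implement "
                        f"{interface.__name__}.{name}"
                    )
                wanted = list(inspect.signature(member).parameters)
                actual = list(inspect.signature(impl).parameters)
                if wanted != actual:
                    raise TypeError(
                        f"{cls.__name__}.{name} takes {actual}, "
                        f"expected {wanted}"
                    )
        return cls
    return check


class Storage:
    """Hypothetical interface: method bodies are intentionally empty."""
    def load(self, key): ...
    def save(self, key, value): ...


@implements(Storage)
class MemoryStorage:
    def __init__(self):
        self._data = {}

    def load(self, key):
        return self._data[key]

    def save(self, key, value):
        self._data[key] = value
```

A class decorated with @implements(Storage) that forgets save fails as soon as the module is imported, which is roughly the feedback a Java compiler would give.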
Python is terrible for writing big systems.
Projects whose V1 is written in Go/Rust/C++ don't normally go out and re-write V2 in Python.
The reverse is really common.
Even many famous Python packages are now Python wrappers.
That's because you would usually rewrite your Python program in something like C++ if you realise that it's too slow and you need the speed of a compiled language, despite the enormous extra complexity to create and maintain it that way.
You wouldn't go back the other way because it's very rare to go to all that extra effort writing in a more efficient language only to realise that the slower performance of Python would've been adequate after all. And, thanks to the sunk cost fallacy, even someone who does realise it is unlikely to make the switch back.
There's no way you could convince me that C++ is easier to code in than Python, even for a very large system. C# maybe.
> Even many famous Python packages are now Python wrappers.
Of course! That's precisely because Python is much simpler to code in. If your Python libraries are wrappers around native code then you get the speed benefit without having to drop into those languages. (Plus they can release the GIL, allowing true multithreaded Python.)
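A small illustration of the GIL point, using only the standard library. CPython's hashlib is a wrapper around native code and its documentation says the GIL is released while hashing data larger than 2047 bytes, so plain threads can overlap that work. The digest helper and the blob sizes below are made up for the sketch, and no speedup figure is claimed.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(blob: bytes) -> str:
    """Hash one blob; the native update call can run without the GIL."""
    h = hashlib.sha256()
    h.update(blob)
    return h.hexdigest()


# Eight hypothetical 1 MB payloads, large enough for the GIL to be released.
blobs = [bytes([i]) * 1_000_000 for i in range(8)]

# Ordinary Python threads; the heavy lifting happens inside native code.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, blobs))

# Same work done one blob at a time, for comparison of results.
sequential = [digest(b) for b in blobs]
```

The calling code stays ordinary Python either way; whether the threads actually overlap is decided inside the wrapped library.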
If native coding languages were good enough then there would be no need for Python wrappers - you'd just call into the native library directly.
True but that's the problem. Once you have a big enough team, it becomes an uphill battle to maintain that.
The "faster to write" advantage becomes less relevant if most code is going to be auto-generated.
The "harder to maintain" might still remain more relevant.
Sure there's less ceremony, and yes, you can have your project going with just a single file, but other than that...?
In Java, bad OOP conventions were commonplace, like everything using getters/setters, deeply nested class hierarchies and insane patterns like AbstractSingletonProxyFactoryBean. It became impossible to figure out what was going on.
C++ just got every possible feature, many of which interact badly with each other, in an amount that could never fit in a single person's context window. That basically led to a situation where every programmer or company had its own dialect of the language; dialects other than your own were mostly incomprehensible.
Python has its own share of bad features, and for a long time it had a really bad ecosystem around the language: Python 2 vs Python 3; eggs vs wheels; easy_install vs pip; 123489 ways of installing Python, each of them bad. But once it started to get better, in the mid-to-late 2010s, around Python 3.5 or 3.6, it exploded in popularity.
Less ceremony and boilerplate means more readable code.
I think a lot of the readability of Python is in the fact that you don't need to be recently familiar with it to pick up what it's doing most of the time.
Over my career I've dipped in and out of Rust, TypeScript, Perl, Swift, etc. codebases. I'm no expert in any of these, but every single time I have to look something up to understand what this set of arcane symbols or syntax means.
When I dip into Python I just ... read it.
(None of this is to say I prefer Python, just that I really do get the readable thing)
Someone who is equally expert at Java and Python will probably consider Java to be more readable.
Often when I am reading a medium or advanced Python codebase, I need to look into the function definitions and operator documentation to understand what is supposed to be returned, whereas with C-like languages I feel it is easier to build that context because more of it is written down and there is less tricky syntactic sugar.
Sure, but this is the case for any language.
So .. you were already trained in reading abstract.
A beginner, on the other hand, sees lots of intimidating {} everywhere in C-family languages. Python does not need them, and less is usually better in design.
Misplaced brackets seem like a thing from the past to me when we didn't have IDEs. I don't remember ever having a bug due to that.
I can't imagine how. Whitespace physically lays out the block structure on the screen; braces expect you to count and balance matching symbols, and possibly scan for them within other line noise.
Any reasonable language with braces has a standard formatter that will just put each brace level on a different whitespace level.
Brackets would allow the editor to autoindent the pasted code.
No choice is perfect.
I know that is mainly a beginner coding issue, but never having to deal with it was always one of the biggest advantages of Python.
That said, I believe a lot of the stuff that was added in 3 and beyond (to make it more typesafe, accounting for unicode, etc) has made it a lot less readable over time. You can argue that it has made Python a better and safer language, but the pseudocode aspect has gotten worse. I kinda miss that.
And today, with autoformatters, I think only Python is still vulnerable.
Go is a simple target for LLMs, as the language has changed very little, and with the JetBrains go-modern-guidelines[0] skill the LLM can use the handful of recent additions effectively.
And with Python there are things like ruff and pydantic that can enforce contracts in the code.
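To make "enforce contracts" concrete without pulling in a dependency, here is a standard-library stand-in, explicitly not pydantic's API: a frozen dataclass that checks its own fields at construction time. The Order class and its rules are hypothetical. pydantic does this kind of validation (and much more) declaratively.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Order:
    """Hypothetical record with a runtime-checked contract."""
    customer: str
    quantity: int

    def __post_init__(self):
        # Check each field against its annotation. This simple version only
        # works when annotations are plain classes such as str or int.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )
        # A value-level rule on top of the type checks.
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")
```

With a guard like this, Order("ada", "2") fails loudly at the boundary instead of letting a string travel deeper into generated code.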
I seriously doubt this is really the case. In my experience, coding agents just love writing bad Python code. For example, they always need explicit instructions to use uv instead of raw-dogging pip. There is a lot of Python code out there because it is taught as a beginner language, and because of that there is necessarily a lot of Python code written by beginners. That's my explanation, at least, for bad LLM-generated Python code.
1) It's a very consistent language, even compared to other popular languages, namely Python, Rust, C++ and Go. Try to implement a doubly linked list in each of them and compare [1].
2) It's probably the most "Pythonic" among the compiled languages, according to Walter.
3) It uses GC by default, but you can also manage your own memory, or mix the two.
4) It compiles fast and runs fast; heck, it even has a built-in REPL ecosystem.
5) Regarding the small training set: with the recent self-distillation fine-tuning approaches it should be good enough, since D (actually the D2 version) has been around for more than a decade [2].
[1] Looking for a Simple Doubly Linked List Implementation:
https://forum.dlang.org/thread/osmecwfnpqahoytdqpkr@forum.dl...
[2] Awesome D:
But it's LLMs that read it, not humans. At least that's the trend.
> Use a language that has a large training set so the LLM can be most efficient.
It's pretty efficient with Rust.
Because I get reliable generation out of "niche" languages already
Is it code with lots of SQL injections used in a different domain to your own?
It's maybe not good to conflate quantity with quality
I'm more of a C++/TS/etc. user, so I miss braces a lot. Sure, a basic Python script is easy to read through, but a large project starts to get quite ugh.
I am very jealous of Python's numerous built-ins though. I was looking for a JS sum function the other day and was surprised to see that Node.js still doesn't have one built in, plus you still cannot reference operator functions.
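For reference, this is the Python side of that comparison: sum is a built-in, and the standard operator module exposes operators as ordinary functions that can be passed around. The sample lists are made up.

```python
import operator
from functools import reduce

prices = [3, 4, 5]  # made-up sample values

# Built-in sum, no import needed.
total = sum(prices)

# operator.mul is the * operator as a plain function.
product = reduce(operator.mul, prices, 1)

# operator.add passed straight to map to add two lists element-wise.
pairwise = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
```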
You people should grow up. Programming languages are tools, not pets.