The main benefit of Python to me is that, while slow, it's predictable. I do think they're going to get a lot more resistance to adding JITs, moving GCs, etc.; it will become Java with a million knobs to tune. If people want a JIT'd Python, they can just use PyPy, right?
reply
Java lost almost all those knobs a while ago (I mean they're there, but you're better off relying on the defaults). The modern GCs have one or at most two knobs remaining, and even that will become unnecessary next year. As to predictability, you get a maximal pause time of well under 1ms for heaps up to 16TB.
reply
The max pause time thing is a meme :) I have gotten multi-second pause times with ZGC. It depends on what hardware you run it on.
reply
The new generational ZGC? I'm sceptical.
reply
So if I run it on a Pentium 3 you're telling me I can't get a long pause time? "1ms" means nothing
reply
Yes, on any hardware that ZGC supports and that your program otherwise runs with acceptable performance you won't get a long pause time, including (hypothetically) Pentium 3. The reason is that there is no work done inside the pause (no marking, no compacting, not even root scanning). It's just used to signal all threads that a new "epoch" is starting. On small hardware you have a small number of active threads, and so the pause will also be very short.
reply
Have a reproducer?
reply
As far as I know, java has 7 GC implementations, none of which are perfect, all of which have drawbacks

Lately, they seem to work with CRIU, various heuristics, multi-stage in-process bytecode compilation...

Java is a mess; they are working hard to avoid fixing their issue (that nobody else has, so fixes are available)

reply
>As far as I know, java has 7 GC implementations, none of which are perfect, all of which have drawbacks

Compared to Python's, all of them are beyond perfect. And 99.9% of the time you don't even need to use anything but the default.

reply
> Compared to Python's, all of them are beyond perfect.

I somehow understand the situation less after reading this.

Is Python's GC bad, or are there cyclic reference issues? Is it possible to detect cyclic references perfectly? What does beyond perfect mean? If we have 7 and 0.1% of the time you need one of the 6 that is non-default, how do we choose? Is the understated version of "Compared to Python's, all of them are beyond perfect" "I think Java's are great"? If not, what about Python's impl makes it so lackluster to any of 7 of Java's?

reply
> Is it possible to detect cyclic references perfectly?

Yes. The GCs in Java, .NET, V8, and Go do it.

> If we have 7 and 0.1% of the time you need one of the 6 that is non-default, how do we choose?

Java's GCs are optimised for different workloads and environments, and when the choice matters, they're easy to choose among:

1. Parallel GC: Maximal throughput when latency doesn't matter (batch processing).

2. Serial GC: Very small machines.

3. ZGC: low latency (<<1ms maximal pause, i.e. effectively pauseless)

4. G1 (the current default): A balanced mix of throughput and latency.

These are all the standard GCs (the seven you mentioned include a GC similar to Go's that was removed years ago, a "no-op" GC for benchmarking hidden behind a development flag, and alternative implementations by different companies of some of the ones above).

It's possible that either Serial or Parallel will be removed when G1 is able to fully replace them.

Now, why do users need options? Because Java runs most of the world's finance, manufacturing, shipping and logistics, telecommunication, travel, healthcare, retail, defence, and government. We're talking large, complex software that handles huge workloads, and the needs vary. What works well enough for a CLI dev tool or a simple website is often not good enough to handle the world's credit card transaction processing or mobile phone networks.

> If not, what about Python's impl makes it so lackluster to any of 7 of Java's?

Java's GCs are moving collectors, which offer advantages not just compared to Python's GC but to all memory management strategies. Memory management (even in C) imposes a CPU/RAM tradeoff. Moving collectors (used in Java, .NET, and V8) give you a knob for controlling the tradeoff, i.e. they're able to convert RAM to CPU (i.e. use RAM chips as a hardware accelerator) and vice-versa.

reply
> Is Python's GC bad, or are there cyclic reference issues?

Unless you're being pedantic and including reference counting without cycle detection as GC, if your GC has cyclic reference issues, your GC is bad.
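
To make the distinction concrete, here is a small sketch of what happens in CPython specifically (the `Node` class and `make_cycle` helper are made up for illustration). It assumes CPython's behaviour of freeing objects by reference count and running a separate cycle detector through the `gc` module; other Python implementations may behave differently.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Build two objects that reference each other, then drop our own
    # references by returning. Only a weak reference escapes.
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


gc.disable()                 # keep the cycle detector from running on its own
probe = make_cycle()

# Reference counting alone cannot reclaim the pair: each object still
# holds a reference to the other, so neither count drops to zero.
alive_after_refcounting = probe() is not None

gc.collect()                 # run the cycle detector explicitly
alive_after_cycle_gc = probe() is not None
gc.enable()

print(alive_after_refcounting, alive_after_cycle_gc)  # True False
```

So CPython does handle cycles, but only because a cycle detector is layered on top of the refcounting.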

> Is it possible to detect cyclic references perfectly?

Yes? That's the entire point of tracing GC. You have some set of root objects that you start with (globals, objects on thread stacks, etc.) and then you mark every object that's reachable from them. Anything that's not reachable is garbage, even if there are cycles within them.
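
A minimal sketch of that mark step, using a dictionary as a stand-in for the heap (the object names and the `reachable` function are invented for the example; real collectors walk actual object graphs and are far more involved):

```python
def reachable(roots, edges):
    """Mark phase: return every object reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue          # already visited, which also stops cycles looping
        marked.add(obj)
        stack.extend(edges.get(obj, ()))
    return marked


# Hypothetical heap: keys are objects, values are the objects they reference.
heap = {
    "global_config": ["cache"],
    "cache": ["entry"],
    "entry": [],
    "a": ["b"],   # a and b reference each other...
    "b": ["a"],   # ...but nothing reachable from a root points at them
}

live = reachable(["global_config"], heap)
garbage = set(heap) - live

print(sorted(live))     # ['cache', 'entry', 'global_config']
print(sorted(garbage))  # ['a', 'b']
```

The cycle between `a` and `b` needs no special handling: it is never visited, so it is garbage by default.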

reply
>Is Python's GC bad, or are there cyclic reference issues?

Both can be true. The first can even be wholly or partly due to the second.

In addition, the way it does it via RC causes fragmentation, poor cache locality, and general slowness for mass allocations. And it's one-size-fits-all.

Java has a much larger selection to pick from to fine-tune for specific use cases, with each being far better for its use case. And the default no-need-to-think one (G1 iirc) is already faster and better than Python's.

reply
Are you not confusing GC (freeing memory) with the memory allocator ?

Memory allocator: tcmalloc, jemalloc, they are concerned with fetching (and releasing) pages of memory from the OS and allocating objects for the program

GC is only responsible for saying to the memory allocator "this object is no longer used"

(please stay focused on java)

reply
> GC is only responsible for saying to the memory allocator "this object is no longer used"

This is simply not true. Not only are Java's GCs responsible for allocation (which they do simply by bumping a pointer, similar to stack allocation), unlike Python's refcounting collector or Go's nonmoving tracing collector, they have no free operation of any kind and never free objects. Moving collectors don't even know when an object is freed. The way they work is that they compact the live objects, and because the dead objects are invisible to them, they happen to write over them when compacting the live ones. Refcounting collectors and nonmoving tracing collectors do use a free-list-based allocator that they use for allocation and deallocation, but moving collectors work completely differently.

Moving collectors can be so efficient that in the eighties, there was a famous paper about them called "Garbage Collection Can Be Faster Than Stack Allocation" [1] showing that the cost of managing an object's lifetime with a moving collector can, in principle, be less than a single machine instruction. That's why the cost of heap memory management in Java (and other runtimes that employ moving collectors) cannot be compared to the cost of memory management in languages using free lists, be it C or Python. Their operation is just too different (e.g. in Java, assigning a value to an object field or setting an array cell could sometimes be more expensive than allocating a brand new object/array).

[1]: https://www.cs.princeton.edu/~appel/papers/45.pdf

reply
That's not true, Shenandoah does decommit and will continue to in the new generational version.
reply
What exactly of what I wrote is untrue, and while I'm not familiar with Shenandoah, what does decommitting have to do with any of it?
reply
You said "they have no free operation of any kind and never free objects" but some JVM moving GCs do.
reply
No, they do not. Uncommitting pages and freeing objects are different things. Even in malloc/free they are separate. The JDK's GCs do know when some memory region is unused and can choose to return it to the OS, but they still do not free any objects. What moving collectors do is compact the live objects or, if you will, evacuate the live objects out of some area of memory. Once all the live objects are moved out of a certain area of memory, it can be uncommitted, but no objects were freed and the GCs don't even know when objects become unreachable. Unreachable objects are simply invisible to the GC, so when they move the live ones, they happen to overwrite the dead ones.

While the JDK's GCs don't work quite like this, a simple way to picture it is as a contiguous memory buffer, say, 100MB large. The GC compacts the live objects to the bottom of that buffer. At that point they do know where the end of the used memory is. Say that after compaction, the live objects occupy the bottom 20MB of the buffer (overwriting any dead objects that may have been there). At that point, the GC can choose to uncommit, say, the top 30MB of the buffer and return it to the OS.
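
Here is a toy model of that picture, with a Python list standing in for the buffer and list indices for addresses. `ToyHeap` and its methods are invented for illustration and are not how any JDK collector is implemented; in particular, the set of live addresses is simply handed in rather than discovered by tracing.

```python
class ToyHeap:
    """Toy heap: a list of slots, with allocation by bumping an index."""

    def __init__(self, size):
        self.slots = [None] * size
        self.top = 0  # bump pointer: index of the next free slot

    def allocate(self, value):
        if self.top >= len(self.slots):
            raise MemoryError("heap full; a real collector would run here")
        addr = self.top
        self.slots[addr] = value
        self.top += 1
        return addr

    def compact(self, live_addrs):
        """Slide live objects to the bottom; return old -> new addresses.

        Dead objects are never visited. Their slots are either overwritten
        by a live object or left above the new bump pointer.
        """
        forwarding = {}
        new_top = 0
        for old in sorted(live_addrs):
            self.slots[new_top] = self.slots[old]
            forwarding[old] = new_top
            new_top += 1
        self.top = new_top
        return forwarding

    def uncommit(self, keep):
        """Shrink the buffer, standing in for returning pages to the OS."""
        keep = max(keep, self.top)  # never give back memory still in use
        del self.slots[keep:]


heap = ToyHeap(10)
addrs = [heap.allocate(f"obj{i}") for i in range(10)]
moved = heap.compact([addrs[1], addrs[4], addrs[7]])
heap.uncommit(keep=5)

print(heap.top, moved, len(heap.slots))  # 3 {1: 0, 4: 1, 7: 2} 5
```

Note that there is no `free` method anywhere: the seven dead objects were never touched individually, and the uncommit step only looks at where the bump pointer ended up.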

With malloc/free there's also this separation. A free operation marks the memory of a freed object in a data structure called a free list. If all the objects in a certain page have been freed, the allocator can choose to uncommit it and return it to the OS (although many modern allocators choose not to do this promptly).

The operation of moving collectors, and where their cost lies, is completely different from free-list approaches. With free-list approaches there's some work done to allocate an object and some work done to free it. With moving collectors, there's very little work to allocate an object (typically, just a pointer bump) and no work to free it; there is, however, work to keep the object alive. That's why, especially with generational collectors, moving collectors do very little work to manage the memory of objects that don't live for long.

This is why, when working in Java, it's important not to think about the heap as we do in C. For short-lived objects, the cost of allocating and "deallocating" them is often not significantly higher than the cost of allocating and deallocating an object on the stack in C. On the other hand, mutating an existing long-lived object could sometimes require some bookkeeping work by the GC, and could be much more costly than allocating (and "deallocating") a new object. That's why Java programmers are discouraged from pooling objects to "help the GC", something that Go developers often do when they run into issues with their non-moving, non-generational collector.

reply
>Are you not confusing GC (freeing memory) with the memory allocator ?

No, you're missing the fact that the allocation of memory and the GC go hand in hand, because you need it so for optimizations. They are designed together to cooperate in modern runtimes.

reply
Please read up some more about Java and GCs. Memory allocation and GC are heavily intertwined.
reply
> Lately, they seem to work with CRIU, various heuristics, multi-stage in-process bytecode compilation...

Not sure what you mean by this, as this has nothing to do with GC, and Java has had a multi-tier optimising compiler for 15 years now.

> that nobody else has, so fixes are available

Go has much worse problems with GC than Java does these days, and nobody else is able to achieve similar performance in large programs with heavy workloads. So everyone else lives with less sophisticated compilers and memory management simply by accepting worse performance.

reply
https://openjdk.org/projects/crac/ (based on CRIU: https://criu.org/Main_Page)

Your last sentence is probably a joke? If not, please share reproducible numbers: some things per watt, some things per MB of memory.

reply
As an SRE who uses Python and supports Python Flask apps, I think most of us would love a JIT in Python, assuming it's pretty much a drop-in replacement.

PyPy doesn't have the support it needs and is stuck on 3.11.

reply
Why not just use Go? It has a proper concurrent, non-moving GC that, AIUI, has not been associated with sudden memory spikes.
reply
For a new project, teams can decide whether to use Go, but there are many millions of lines of existing Python servers out there.

Not to mention that there are differences in ecosystem, familiarity, and ergonomics that may make a team want to stick with Python.

“Just use Go” is not really actionable advice in most cases.

reply
Libraries. I use both languages, and a survey of what libraries are available is part of picking an implementation language when starting a greenfield project.
reply
It's a tradeoff. Go programs are extremely slow at starting up for example.
reply
A do-nothing C program (int main() { return 0; }):

    $ time ./a.out

    real    0m0.002s
    user    0m0.000s
    sys     0m0.002s
A do-nothing Go program:

    $ time ./tmp

    real    0m0.002s
    user    0m0.000s
    sys     0m0.003s
I don't believe Go has any optimizations to not start its runtime if it isn't necessary, but when I added spawning a goroutine that immediately blocks on a channel read that will never come the numbers didn't change. That doesn't really time the runtime. Probably the program terminated before the goroutine was scheduled to run anything. It just makes it so there definitely wasn't an early exit because the compiler or the runtime "realized" it didn't need to start the runtime.

I'm sure the Go program is somewhat slower to start and end than C, and that we're running into the limits of how quickly processes can be spawned and other timing overhead which is obscuring the difference. However for practical purposes, "it starts up in less than the overhead for starting a process in the shell" is the same speed for most purposes.

Not even a "do nothing" Python program, no Python program at all:

    $ time python3 -c 1

    real    0m0.012s
    user    0m0.008s
    sys     0m0.004s
If you had a Go program that was slow to start up, it was your program, not Go. By contrast, Python, and the dynamic scripting languages in general, can be quite slow to start up, just in the reading and compiling of the code. (Even .pyc files, IIRC, take processing, just less processing than Python source code... it's still nowhere near "memory map it in and go" as it is for statically-compiled languages.)
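
If you want to reproduce the interpreter number without relying on the shell's `time`, a rough sketch like this works (the `startup_seconds` helper is made up for the example; `-S` is the standard CPython flag that skips importing the `site` module, included only to show how much of startup is import work). Numbers will vary by machine, so none are claimed here.

```python
import subprocess
import sys
import time


def startup_seconds(argv, runs=20):
    """Best observed wall-clock time to spawn argv and wait for it to exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Same "no program at all" case as above, using the running interpreter.
default_start = startup_seconds([sys.executable, "-c", "1"])

# Skipping the site import removes part of the startup work.
no_site_start = startup_seconds([sys.executable, "-S", "-c", "1"])

print(f"python -c 1:    {default_start * 1000:.1f} ms")
print(f"python -S -c 1: {no_site_start * 1000:.1f} ms")
```

This includes the cost of spawning a process from Python, so treat it as an upper bound on interpreter startup rather than a precise measurement.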
reply
That doesn't matter for anything other than CLIs.
reply
Some people are writing CLIs
reply
Yes, of course, but I took the conversation to be centered around backend use cases. What CLI is experiencing garbage collection issues like those in this discussion?
reply
What? Compared to Python they're like lightning. Typically milliseconds to the start of main() - admittedly they can be slowed down by init() nonsense and terrible generated protobuf code nonsense in deep dependency trees - but with a non-trivial Python program you can look forward to an order of magnitude more. There are techniques to help address that but (1) they're not idiomatic and (2) it still only mitigates it.

I suppose Go programs are slower than the equivalent thing in C or C++, but I'm not sure that's a very relevant comparison in most cases today (how many new things being written would choose those languages).

reply
So are Python and Java programs.
reply
Like saying a snail and an airplane are both slow compared to the speed of light.
reply
PyPy is not looking healthy right now - it's several versions behind in support and, while it's not dead, it looks like it might be settling down for a rest.

Obviously it's not easy to move the whole language of a big codebase, but I feel a lot of this stuff (fiddling with GC, JITing, type hints, and I'm dubious about the free-threading stuff) tries to take Python somewhere it isn't really good at, and if that's what you want, you really want a different language.

reply
It is the same for me. Predictability is better than any optimization.
reply
In what way do you feel Python is predictable, especially in comparison to other languages one would build a backend system in?

It's predictable vs Rust, C#, F#, Elixir, Go, etc.?

reply
And if people want python with java, there's always Jython.
reply
jython has been basically unmaintained for quite some time
reply
Well, they never made the jump to Python 3. But shipping 2.7 interpreters in 2024 was quite an achievement on its own. So their users already know this pain. And from my experience in academia, python 2.7 and java 8 will probably be used for another 20 years before the last machine running that stuff burns out.
reply
GraalVM has support for Python 3; unfortunately, it's funded by Oracle.
reply
If it makes you feel any better (it probably doesn't), the development of OpenJDK and the Java language itself is also mostly funded by Oracle
reply
Java is funded by Oracle, all of it.

People parrot to use OpenJDK without understanding it is mostly Oracle employees working on it.

And if you dislike Oracle, the other minor contributors are Red Hat, IBM, SAP, Microsoft, Alibaba, Azul,... which for many HNers are the same.

reply
Jython is unmaintained, I'd recommend Clojure. Use python libraries and code while seamlessly targeting the JVM.
reply
jpype and graalpy are life.

Jython went EOL with Python 2 going EOL.

reply
Resistance from anyone who matters to the developers?
reply
Why are people still building systems on top of a language that continually undergoes fundamental changes nearly 40 years after release? Is this not the strongest indication that this language is not well designed, it is unstable, and encounters many issues that flat out don't exist in other high level languages?
reply
What language that is actually used 40 years after release isn't undergoing big, fundamental changes?

Java? Nope, you're getting a fundamental change in Valhalla.

C++? Nope, new language edition every few years with fundamental changes.

C? C23 has a number of fairly fundamental changes; expect more in the next language revision.

I think your sense of causality is backwards here. These languages are getting fundamental changes because they're being widely used. That is what motivates and drives the change. Languages with no users don't need to change.

reply
As you say, any widely used language gets fundamental changes from time to time.

But most such languages handle compatibility with legacy applications much better.

Python is the main culprit in most cases when I see conflicts between various software packages that insist on using only a specific version of their dependencies. This is why I have to keep many versions of Python installed, and the Linux distribution that I use must take care to prevent interference between those Python versions.

reply
> Languages with no users don't need to change.

That's fine, but that's clearly not what I'm talking about.

Languages like F#, Elixir, etc. don't undergo fundamental changes. Yes, every language evolves. But for Python, we're talking about grafting literally fundamental stuff on top of a language not designed for any of these things.

For example, if someone went and redesigned Python to solve its warts, you'd basically end up with F#.

reply