Surely this is the sort of thing compiler / language design nerds dream about?
It doesn’t have to guarantee efficiency or provide cutting edge performance in any context … it should just exist!
My understanding is that we can make such a language … but it hasn’t caught the fancy of someone who could do it
Currently supports CPU and GPU on macOS, and CPU on Linux.
https://github.com/kiwi-array-lang/kiwi
Kiwi runs computations on small dense arrays in its own runtime; when they are larger it lowers to MLX on the CPU, and eventually to MLX on the GPU when that is worth it.
As a user you don't have to change any code; you just write k.
I'm sure there are other languages designed to take advantage of modern GPUs.
But even with just SIMD you can get quite far with array-oriented code, and many array language implementations make use of it (BQN, ngn/growler/k, goal; ktye's k has a version with SIMD support, …)
I’ve yet to find a language that does SIMD / multithreading / GPU with minimal tweaks, let alone multiprocessing.
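As a rough illustration of the array-oriented point, with NumPy standing in for an array language implementation (the data and variable names here are invented for the example): expressing the computation as whole-array primitives lets the implementation back each primitive with a SIMD kernel, with no change to user code.

```python
import numpy as np

# Array-oriented style: whole-array primitives instead of scalar loops.
# An implementation (NumPy here; BQN or a k would be similar) can dispatch
# each primitive to a SIMD kernel without the user doing anything.
x = np.arange(1_000_000, dtype=np.float64)

# One vectorized comparison plus a reduction:
ascents = np.count_nonzero(x[1:] > x[:-1])

# Elementwise multiply, a reduction, then a scalar op:
rms = np.sqrt(np.mean(x * x))
```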
Yes, it can, but only by eliminating the features that make it Turing complete. It’s relatively easy to vectorize a map with a closure that can’t mutate anything, but once you have nontrivial control flow, the compiler can’t make those kinds of assumptions.
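A minimal Python sketch of that distinction, with NumPy standing in for any auto-vectorizing implementation (the loop is deliberately contrived):

```python
import numpy as np

xs = np.arange(8)

# A pure map with no mutation is trivially vectorizable: every element
# is independent, so the implementation is free to use SIMD lanes.
squares = xs * xs

# Nontrivial control flow over mutable state is not: each iteration
# depends on what the previous ones did, so the independence assumption
# the vectorizer needs no longer holds.
total = 0
running = []
for x in xs:
    if total > 10:   # data-dependent branch...
        total = 0    # ...that mutates shared state
    total += x
    running.append(total)
```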
Definitely the closest thing so far (it doesn’t do multiprocessing), but it does seem to do SIMD / multithreading and GPU auto-parallelization!
Any idea why it’s so little known?
That’s what compilers and high level languages are supposed to be for!
Something like this already (partly) exists: the D language. By default it's garbage collected (GC), but it can also be programmed without GC or in a hybrid mode. It's modern, backward compatible with C, and included in GCC.
The linear algebra system in D, Mir GLAS, is a standalone BLAS implementation written directly in D. It was already shown to be faster than widely used conventional BLAS libraries like OpenBLAS back in 2016, about ten years ago [2]!
The popular OpenBLAS includes the Fortran-based LAPACK (yes, you read that right: Fortran), and it is used by almost all data-processing languages today: Matlab, Julia, Rust, and also Mojo [1].
Interestingly, there is a very early-stage standalone BLAS implementation written directly in Mojo, namely mojoBLAS, similar to Mir GLAS; it started very recently [3].
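For context on what these libraries actually compute: the workhorse BLAS routine is GEMM, C <- alpha*A@B + beta*C. Here is a deliberately naive Python sketch of its semantics; real implementations like OpenBLAS and Mir GLAS earn their speed from cache blocking, SIMD, and threading, none of which is shown here.

```python
import numpy as np

def naive_gemm(alpha, A, B, beta, C):
    """Semantics of a BLAS-style GEMM: return alpha * A @ B + beta * C.
    Deliberately naive; shown only to illustrate what the routine does."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    out = beta * C  # fresh array; C itself is not mutated
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += A[i, p] * B[p, j]
            out[i, j] += alpha * acc
    return out
```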
>Surely this is the sort of thing compiler / language design nerds dream about?
You can say that again.
Especially on the GC side of the programming language, since SIMD / multithreading / multiprocessing / GPU can be abstracted away.
Actually, someone recently proposed VGC, a virtualized garbage collector for Python in C++ for heterogeneous GC [4][5]. However, the current evaluation excludes JIT compilation, AOT optimization, SIMD acceleration, and GPU offloading.
[1] OpenBLAS:
https://en.wikipedia.org/wiki/OpenBLAS
[2] Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen:
http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...
[3] mojoBLAS:
https://github.com/shivasankarka/mojoBLAS
[4] Virtual Garbage Collector (VGC): A Zone-Based Garbage Collection Architecture for Python's Parallel Runtime:
https://arxiv.org/abs/2512.23768
[5] VGC-for-arxiv:
All the flaws I can think of in Kotlin are due to the Java compatibility. They could've made it work here by being more explicit but the way it currently works seems doomed.
All the use of Kotlin in industry is due to Java compatibility. Otherwise Kotlin would have ~0% market share.
> Supporting more of Python's dynamic features like classes, inheritance, and untyped variables to maximize compatibility with Python code.
What's more, note how it says "to maximize compatibility" not "to achieve full compatibility."
Yes, the underlying platform they based their compatibility on is the reason they got some design flaws, some more than others.
However, that compatibility is the reason they won wide adoption in the first place.
In reality I think they've dropped that pretty hard. Literally, you can't even get the length of a string with `len(s)` in the latest release. They also removed negative indexing, which I find baffling and frustrating. The roadmap does say they don't intend to have any "syntax sugar" until later in the implementation, but negative indexing is such a core part of what makes Python so much nicer to work with compared to, say, C++...
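For readers who haven't leaned on these features, a quick sketch of the plain-Python ergonomics being discussed (nothing here is Mojo-specific):

```python
s = "mojo"

# len() works uniformly on strings, lists, dicts, ... -- no method call.
n = len(s)      # 4

# Negative indices count from the end: -1 is the last element.
last = s[-1]    # "o"
tail = s[-3:]   # "ojo" -- slicing relative to the end, no length math

# The closest C++ equivalent needs explicit length arithmetic:
#   s[s.size() - 1]  and  s.substr(s.size() - 3)
```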
Unless it's open-sourced, it's a moot point, as most Python devs won't come anyway.
> We're committed to open-sourcing all of Mojo, but the language is still very young and we believe a tight-knit group of engineers with a common vision moves faster than a community-driven effort. So we will continue to plan and prioritize the Mojo roadmap within Modular until more of its internal architecture is fleshed out.
I hope they stick to their original promise. And the 1.0 release would be a great time to deliver this.
This is a false dichotomy.
For years Golang was developed in the open but moved strictly according to the vision of its creators rather than being "community-driven". Many other venerable open source projects don't involve the community in serious strategy discussions; the community mainly acts as a bug finder/fixer. Mojo could do the same: be open source but choose its own priorities internally.
I'm guessing that Mojo is still looking for a monetization strategy. Keeping important parts of Mojo proprietary at this stage surely helps with that (nothing wrong with that).
But I feel the era of the proprietary programming language play is over. Unless you also make hardware (which the Mojo folks don't), it's going to be tough.
Release the source, but don't take code from external contributors; take issues and discussion instead.
Translated from corporatese, it means "it will never happen".
It didn't matter that it was closed, when the alternatives were much worse.
I can also recite the whole story: the missteps of OpenCL 2.x, OpenCL C++, the OpenCL 3.0 reboot, how SYCL came to be, CodePlay having the only properly available implementation, Intel's acquisition of CodePlay, and everything else.
(Among other reasons, but that's easily the main one.)