In other words: this is an obfuscation vector within Python packaging, but it doesn't grant the attacker a novel privilege.
(This doesn't detract from the overall severity of the bug itself: there are plenty of ecosystems and contexts where this is a serious issue, and there's no easy way to assert that they aren't affected by the bug in this family of async-tar packages. Edera has done an excellent job of highlighting this, and I thank them for their disclosure!)
[1]: https://github.com/astral-sh/tokio-tar/security/advisories/G...
The story of this bug is interesting. We were both up late at night working on GPU support on the Edera platform, and we had just pulled an NVIDIA container image. What should have resulted in a temporary directory of tar files for OCI layers was filled with NVIDIA library files! We were both super confused until I had an “oh god no” moment and realized what had happened.
We kicked right into action on responsible disclosure.
I can answer any questions, but I want to send a huge thank you to our team for working together on this and to Astral for being wonderful to work with!
That’s not to say OSS cannot be trusted, but it certainly makes trusting smaller projects and packages scary.
But people exercise those features regularly and distros are not shy about maintaining software. It's a very different world from "We Just Ship What They Give Us" in npm/cargo/etc...
This doesn't apply to closed-source things because you wouldn't be able to use them in the first place.
Feels like every little thing should be in its own docker container with limited filesystem access. Of course that is a whole lot of trouble…
The dependency trees in cargo/pip also greatly bother me.
VS Code extensions are also an underappreciated vector. Some turd makes a “starter pack” for rust/python/etc with a great set of common extensions… plus a few that nobody has heard of… Over time, they reach 50k-100k downloads and start to appear legit… An excellent way to exfiltrate trade secrets!!!
In languages where importing a library is hard, libraries tend to grow quite large. Large libraries have larger backing and more established development and security practices. When OpenCV, TinyUSB, NumPy, or NimBLE start to struggle, it's easier to notice, and companies relying on them may step up to fork, maintain, or fund their continued use.
In languages where importing and creating a library is easy, we see small atomic packages for small utilities rather than large libraries. This spreads the software supply chain wider, into smaller teams of maintainers. If the same amount of code is fractured over 50 small libraries maintained by 1-3 people each, the likelihood of one or two becoming abandoned grows.
I've been a bit wary of the dependency and package manager approach more modern languages use. It trades convenience for some pretty scary supply-chain attacks.
I would probably place Rust in between Python and Node. It is made worse by the relative immaturity of the ecosystem, which contributes to the fragmentation. Rust will likely improve, but I have little faith in Node, as it's cultural, as you mention.
Case in point: here is a list of the most-used crates. Some of these contain a single macro or wrapper. https://lib.rs/std
Any Rust project has endless crates scrolling up the screen.
It has exactly the same culture as npm, which is no wonder, because many folks adopting Rust come from a Web background, not C and C++.
Also, if it wasn't a problem, articles like this wouldn't exist:
https://tweedegolf.nl/en/blog/104/dealing-with-dependencies-...
(This doesn’t mean Rust doesn’t have a dependency proliferation issue, only that the way you’re substantiating it is misleading: it’s like saying that C has a dependency proliferation issue because libfoo goes from having 3 source files to 5 source files between releases.)
I wonder if safety could be improved a little if private package management was easier than throwing things out in public.
Using a private package manager, intermixing private and public, and substituting arbitrary dependencies with compatible alternatives (i.e. modularity) should be easy. Only then does solving the problem become easy.
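As a rough sketch, Cargo already gets partway there today: you can mix a private registry with crates.io and substitute a compatible fork for a public dependency. (The registry name, URL, and crate names below are made up.)

    # .cargo/config.toml -- declare a private registry next to crates.io
    [registries.internal]
    index = "sparse+https://crates.internal.example.com/index/"

    # Cargo.toml -- mix public and private deps, and patch in a compatible fork
    [dependencies]
    serde = "1"                                                  # public, from crates.io
    internal-utils = { version = "0.2", registry = "internal" }  # from the private registry above

    [patch.crates-io]
    # substitute a compatible alternative for a public dependency
    tokio-tar = { git = "https://github.com/example-org/tar-fork" }

The friction is less in the mechanism and more in standing up and operating the private side, which is exactly the part that could be made easier.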
What we used to have is a big ball of mud. Modern languages made it easier to decompose a system into components. But what we really want is easily decomposing a system into modular components which is yet unsolved.
I suspect there are other reasons too. There is a cost to fad languages being used. Replicating the ecosystem of libraries around a language is a huge job. It's rare that a language ever gets the same size and quality of ecosystem as, say, C or Java. But the fans of the language will try. This leads to a lot of ported projects and a small number of devs maintaining a huge number of projects. That's a recipe for abandonware. I suspect a lot of student projects too, which are likely to end up the same way.
(Think about the last time you checked whether the stack of GNU libraries on your Linux desktop were actively maintained. I don’t think anybody thinks about it too hard, because the ecosystem discourages thinking about it!)
1. easy to create
2. easy to produce something with decent quality
3. Rust is widely used by a lot of people, including juniors who don't know (yet) that maintaining a package can be quite a pain and comes with some responsibility
4. so small hobby projects can now very easily become widely used dependencies, as people look at them and find them to have decent quality
5. the currently "flat" package namespace (i.e. no prefixes/grouping by org/project); there have been discussions about improving this for a long time, but it's not quite here yet. This matters because e.g. all "official" tokio packages are named tokio-*, but that's also true for most 3rd-party packages "specifically made for tokio". So `tokio-tar` is what you would expect the official tokio tar package to be named, _if there were one_.
---
Now, the core problem of many unmaintained packages isn't that Rust-specific.
It's just that Rust is currently a common go-to for young developers not yet burned by maintaining a package, and it's super easy to publish.
On the other hand, some of the previous "popular early-career go-to languages" either never had a single "official" repo (Java) or packaging was/is quite a pain (Python). Though you can find a lot of unmaintained packages in npm too; it's just so much easier to write clean, decent-looking code in Rust that you're more likely to end up using one in Rust than in JS.
1. It's used for rewriting CLI utilities with more color by five or so people
2. Completely FOSS, barely any salaried devs; if there are any, they are donation-based.
3. A culture of code "reuse" instead of actually coding. Everyone wants it in their own flavour (we have tar, but I kinda want async-oop-tar).
4. Cognitive dissonance between 3 and 1: rusties don't want to succumb and use a standard tar library because of performance (a self-inflicted performance hit from creating an incompatible ecosystem) or pride (we need a version written in Rust). All of this to download software that is probably written in C and from another ecosystem anyway. (An encoding/compression format is a signature, and tarballs are signature C/Linux.)
Something's gotta give
Pure BS. If I wrote something in Rust rather than a binding, it was because using often-Linux-based C libs on all Tier 1 platforms is about as smooth a process as swimming in shards of glass.
Also even if you aim for DX it is very subjective which causes fragmentation, which causes abandonment and other issues, as mentioned in my comment.
This is a job, sir; if it's painful, that's to be expected (and probably unavoidable too: in 30 years the new generation will hate your "smooth" DX).
Sure reuse where possible, but sometimes you need to rewrite.
> which causes abandonment and other issues, as mentioned in my comment.
Did Linux being written in C stop Intel from abandoning it? The abandonment issues mentioned are mostly orthogonal.
C stopping the retreat of corporations from the open-source space is about as likely as a papier-mâché figure having an effect on the dark matter distribution. You are suggesting that picking programming languages will have an effect on global economics.
C being unpopular, and thus not picked for development, is more due to it being a very footgunny language without modern programming-language conveniences, like package management or linters available out of the box.
That's both an obscure and a complicated OS/Arch combination.
>Intel abandoned Linux
I don't understand, Intel never maintained Linux.
>C being unpopular
Lol
Ok. A more realistic example. I want to develop a Windows game, because that's where the audience is. And I want to develop my game in Rust, because I know it better than C++.
So, I need a tar/zar/mar library that exists on Linux as a C lib or a native Rust library. My goal is to finish the game; I don't care about performance or even CVEs that much.
> Intel never maintained Linux
They definitely did maintain several drivers, as well as the Clear Linux distribution.
But I was talking about their overall strategy. They are pivoting to an "Intel first" mantra, sacking many Linux driver maintainers.
https://www.theregister.com/2025/10/09/intel_open_source_com...
> Lol
In what niche is it popular now that hasn't been devoured by C++, Java, and others?
If you build things with wires, diodes, multiplexers, breakers, fuses, and keyed connectors, there's less maintenance needed than if you try to build a system entirely out of transistors and manually applied insulators.
I haven't looked at the package itself, but was it built on top of the C libraries with like, bindgen?
Edit: a glance suggests that's not the case, but perhaps it was ported naively by simply cloning the structure without looking at what it was implementing? That's definitely the path of least resistance for this type of thing. On top of that, the spec itself is apparently in POSIX, some parts of which are, well, spotty compared to RFCs.
My understanding is that the left-pad incident is not directly analogous, since it involved restoring a deleted package rather than modifying an extant package.
Rust's advantage is that it can prevent logic errors from becoming memory safety vulnerabilities (and separately, its type system makes some - but not all - classes of logic errors more difficult to introduce).
But to your point: yes, it's a good example about how security bugs live at all layers of the stack and that being checked against memory corruption does nothing to prevent you from writing bugs in the semantic space.
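(A toy illustration of that distinction, nothing to do with this particular bug: the off-by-one below is still a logic error, but the bounds check turns what could have been an out-of-bounds read in C into a deterministic panic.)

    // Toy example: a logic bug that stays a logic bug.
    // The bad index trips Rust's bounds check and panics,
    // instead of silently reading past the end of the buffer.
    fn last_byte(buf: &[u8]) -> u8 {
        buf[buf.len()] // logic error: should be buf.len() - 1
    }

    fn main() {
        let data = vec![1u8, 2, 3];
        println!("{}", last_byte(&data)); // panics: index out of bounds
    }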
It'd be like putting a zip in a zip.
> Target: Python package managers using tokio-tar (e.g., uv). An attacker uploads a malicious package to PyPI. The package's outer TAR contains a legitimate pyproject.toml, but the hidden inner TAR contains a malicious one that hijacks the build backend. During package installation, the malicious config overwrites the legitimate one, leading to RCE on developer machines and CI systems.
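If I'm reading that right, the "hijacks the build backend" part goes through pyproject.toml's PEP 517 [build-system] table, so a hypothetical malicious inner copy would just point the backend at code shipped inside the sdist, something like:

    # pyproject.toml (the hidden inner copy; the module name is made up)
    [build-system]
    requires = []
    build-backend = "evil_backend"   # attacker's module, shipped in the sdist
    backend-path = ["."]             # PEP 517 in-tree backend resolution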
It seems to imply that you’re already installing a package uploaded by a malicious entity. Is the vulnerable workflow something like “you manually download the package archive, unpack it with system tar, audit all the files and then run uv install, which will see different files”?
Someone could release a malicious package that looks okay to a scanner tool, but when installed using uv behaves differently, allowing attackers to disguise executable code.
In addition, for OCI images, it is possible to produce an OCI image that can overwrite layers in the tar file, or modify the index. This could be done in a way that is undetectable by the processor of the OCI image. Similar attacks can be done for tools that download libraries, binaries, or source code using the vulnerable parser, making a tar file that when inspected looks fine but when processed by a vulnerable tool, behaves differently.
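(To make the "looks fine when inspected, behaves differently when processed" point concrete, here is a rough defensive sketch, not the actual patched parser or anyone's real tooling: it lists entries with the synchronous `tar` crate and recurses into any member that itself looks like a tar archive, so a hidden inner archive shows up during inspection.)

    // Rough sketch: recursively list tar entries, descending into nested .tar
    // members so "hidden" inner archives are visible. Assumes the `tar` crate;
    // error handling is minimal on purpose.
    use std::io::Read;

    fn list_entries(data: &[u8], depth: usize) -> std::io::Result<()> {
        let mut archive = tar::Archive::new(data);
        for entry in archive.entries()? {
            let mut entry = entry?;
            let path = entry.path()?.to_path_buf();
            println!("{:indent$}{} ({} bytes)", "", path.display(), entry.size(), indent = depth * 2);
            // If a member itself looks like a tar archive, inspect it as well.
            if path.extension().map_or(false, |ext| ext == "tar") {
                let mut inner = Vec::new();
                entry.read_to_end(&mut inner)?;
                list_entries(&inner, depth + 1)?;
            }
        }
        Ok(())
    }

    fn main() -> std::io::Result<()> {
        let data = std::fs::read("package.tar")?; // hypothetical input file
        list_entries(&data, 0)
    }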
I hope that answers your question?
> making a tar file that when inspected looks fine
Am I correct in understanding that manual inspection would reveal a nested .tar archive (so recursive inspection of nested archives should be enough)?
And where is the RCE part?