In other words: this is an obfuscation vector within Python packaging, but it doesn't grant the attacker a novel privilege.
(This doesn't detract from the overall severity of the bug itself: there are plenty of ecosystems and contexts where this is a serious issue, and there's no easy way to assert that they aren't affected by the bug in this family of async-tar packages. Edera has done an excellent job of highlighting this, and I thank them for their disclosure!)
[1]: https://github.com/astral-sh/tokio-tar/security/advisories/G...
The story of this bug is interesting. We were both up late at night working on GPU support on the Edera platform, and we had just pulled an NVIDIA container image. What should have resulted in a temporary directory of tar files for OCI layers was filled with NVIDIA library files! We were both super confused until I had an “oh god no” moment and realized what had happened.
We kicked right into action on responsible disclosure.
I can answer any questions, but I want to send a huge thank you to our team for working together on this and to Astral for being wonderful to work with!
That’s not to say OSS cannot be trusted, but it certainly makes trusting smaller projects and packages scary.
But people exercise those features regularly and distros are not shy about maintaining software. It's a very different world from "We Just Ship What They Give Us" in npm/cargo/etc...
This doesn't apply to closed-source things because you wouldn't be able to use them in the first place.
Feels like every little thing should be in its own docker container with limited filesystem access. Of course that is a whole lot of trouble…
The dependency trees in cargo/pip also greatly bother me.
VS Code extensions are also an underappreciated vector. Some turd makes a “starter pack” for rust/python/etc with a great set of common extensions… plus a few that nobody has heard of… Over time, they reach 50k-100k downloads and start to appear legit… An excellent way to exfiltrate trade secrets!!!
In languages where importing a library is hard, libraries tend to grow quite large. Large libraries have larger backing and more established development and security practices. When OpenCV, TinyUSB, NumPy, or NimBLE start to struggle, it's easier to notice, and companies relying on them may step up to fork, maintain, or fund their continued use.
In languages where importing and creating a library is easy, we see small atomic packages for small utilities rather than large libraries. This spreads the software supply chain wider, into smaller teams of maintainers. If the same amount of code is fractured over 50 small libraries maintained by 1-3 people each, the likelihood of one or two becoming abandoned grows.
I've been a bit wary of the dependency and package manager approach more modern languages use. It trades convenience for some pretty scary supply-chain attacks.
I would probably place Rust in between Python and Node. It is made worse by the relative immaturity of the ecosystem, which contributes to the fragmentation. Rust will likely improve, but I have little faith in Node, as it's cultural, as you mention.
Case in point: here is a list of the most-used crates. Some of these contain a single macro or wrapper. https://lib.rs/std
Any Rust project has endless crates scrolling up the screen.
It has exactly the same culture as npm, which is no wonder, because many folks adopting Rust come from a Web background, not C and C++.
Also, if it wasn't a problem, articles like this wouldn't exist:
https://tweedegolf.nl/en/blog/104/dealing-with-dependencies-...
(This doesn’t mean Rust doesn’t have a dependency proliferation issue, only that the way you’re substantiating it is misleading: it’s like saying that C has a dependency proliferation issue because libfoo goes from having 3 source files to 5 source files between releases.)
I wonder if safety could be improved a little if private package management was easier than throwing things out in public.
Using a private package manager, intermixing private and public, and substituting arbitrary dependencies with compatible alternatives (i.e. modularity) should be easy. Only then does solving the problem become easy.
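As a rough sketch, Cargo already gets partway there today: you can mix a private registry with crates.io and substitute a compatible fork for a public dependency. (The registry name, URL, and crate names below are made up.)

    # .cargo/config.toml -- declare a private registry next to crates.io
    [registries.internal]
    index = "sparse+https://crates.internal.example.com/index/"

    # Cargo.toml -- mix public and private deps, and patch in a compatible fork
    [dependencies]
    serde = "1"                                                  # public, from crates.io
    internal-utils = { version = "0.2", registry = "internal" }  # from the private registry above

    [patch.crates-io]
    # substitute a compatible alternative for a public dependency
    tokio-tar = { git = "https://github.com/example-org/tar-fork" }

The friction is less in the mechanism and more in standing up and operating the private side, which is exactly the part that could be made easier.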
What we used to have is a big ball of mud. Modern languages made it easier to decompose a system into components. But what we really want is easily decomposing a system into modular components which is yet unsolved.
I suspect there are other reasons too. There is a cost to fad languages being used. Replicating the ecosystem of libraries around a language is a huge job. It's rare that a language ever gets the same size and quality of ecosystem as, say, C or Java. But the fans of the language will try. This leads to a lot of ported projects and a small number of devs maintaining a huge number of projects. That's a recipe for abandonware. I suspect a lot of student projects too, which are likely to end up the same way.
(Think about the last time you checked whether the stack of GNU libraries on your Linux desktop were actively maintained. I don’t think anybody thinks about it too hard, because the ecosystem discourages thinking about it!)
1. easy to create
2. easy to produce something with decent quality
3. Rust is widely used by a lot of people, including juniors who don't know (yet) that maintaining a package can be quite a pain and comes with some responsibility
4. so small hobby projects can now very easily become widely used dependencies, as people look at them and find them to have decent quality
5. the currently "flat" package namespace (i.e. no prefixes/grouping by org/project); there have been discussions about improving this for a long time, but it's not quite here yet. This matters because e.g. all "official" tokio packages are named tokio-*, but that's also true for most 3rd-party packages "specifically made for tokio". So `tokio-tar` is what you would expect the official tokio tar package to be named, _if there were one_.
---
Now, the core problem of many unmaintained packages isn't that Rust-specific.
It's just that Rust is currently a common go-to for young developers not yet burned by maintaining a package, and it's super easy to publish.
On the other hand, some of the previous "popular early-career go-to languages" either never had a single "official" repo (Java) or packaging was/is quite a pain (Python). Though you can find a lot of unmaintained packages in npm too; it's just so much easier to write clean, decent-looking code in Rust that you're more likely to end up using one in Rust than in JS.
1. It's used for rewriting CLI utilities with more color by five or so people
2. Completely FOSS, barely any salaried devs; if there are any, they are donation-based.
3. A culture of code "reuse" instead of actually coding. Everyone wants it in their own flavour (we have tar, but I kinda want async-oop-tar).
4. Cognitive dissonance between 3 and 1: rusties don't want to succumb and use a standard tar library because of performance (a self-inflicted performance hit from creating an incompatible ecosystem) or pride (we need a version written in Rust). All of this to download software that is probably written in C and from another ecosystem anyway. (An encoding/compression format is a signature, and tarballs are signature C/Linux.)
Something's gotta give
Pure BS. If I wrote something in Rust rather than a binding, it was because using often-Linux-based C libs on all Tier 1 platforms is about as smooth a process as swimming in shards of glass.
Also even if you aim for DX it is very subjective which causes fragmentation, which causes abandonment and other issues, as mentioned in my comment.
This is a job, sir; if it's painful, that's to be expected (and probably unavoidable too: in 30 years the new generation will hate your "smooth" DX).
Sure reuse where possible, but sometimes you need to rewrite.
> which causes abandonment and other issues, as mentioned in my comment.
Did Linux being written in C stop Intel from abandoning it? The abandonment issues mentioned are mostly orthogonal.
C stopping the retreat of corporations from the open-source space is about as likely as a papier-mâché figure having an effect on the dark matter distribution. You are suggesting that picking programming languages will have an effect on global economics.
C being unpopular, and thus not picked for development, is more due to it being a very footgunny language without modern programming-language conveniences, like package management or linters available out of the box.
That's both an obscure and a complicated OS/Arch combination.
>Intel abandoned Linux
I don't understand, Intel never maintained Linux.
>C being unpopular
Lol
Ok. A more realistic example. I want to develop a Windows game, because that's where the audience is. And I want to develop my game in Rust, because I know it better than C++.
So, I need a tar/zar/mar library that exists on Linux as a C lib or a native Rust library. My goal is to finish the game; I don't care about performance or even CVEs that much.
> Intel never maintained Linux
They definitely did maintain several drivers, as well as the Clear Linux distribution.
But I was talking about their overall strategy. They are pivoting to an "Intel first" mantra, sacking many Linux driver maintainers.
https://www.theregister.com/2025/10/09/intel_open_source_com...
> Lol
In what niche is it popular now that hasn't been devoured by C++, Java, and others?
If you build things with wires, diodes, multiplexers, breakers, fuses, and keyed connectors, there's less maintenance needed than if you try to build a system entirely out of transistors and manually applied insulators.
I haven't looked at the package itself, but was it built on top of the C libraries with like, bindgen?
Edit: a glance suggests that's not the case, but perhaps it was ported naively by simply cloning the structure without looking at what it was implementing? That's definitely the path of least resistance for this type of thing. On top of that, the spec itself is apparently in POSIX, some parts of which are, well, spotty compared to RFCs.
My understanding is that the left-pad incident is not directly analogous, since it involved restoring a deleted package rather than modifying an extant package.
Rust's advantage is that it can prevent logic errors from becoming memory safety vulnerabilities (and separately, its type system makes some - but not all - classes of logic errors more difficult to introduce).
But to your point: yes, it's a good example about how security bugs live at all layers of the stack and that being checked against memory corruption does nothing to prevent you from writing bugs in the semantic space.
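(A toy illustration of that distinction, nothing to do with this particular bug: the off-by-one below is still a logic error, but the bounds check turns what could have been an out-of-bounds read in C into a deterministic panic.)

    // Toy example: a logic bug that stays a logic bug.
    // The bad index trips Rust's bounds check and panics,
    // instead of silently reading past the end of the buffer.
    fn last_byte(buf: &[u8]) -> u8 {
        buf[buf.len()] // logic error: should be buf.len() - 1
    }

    fn main() {
        let data = vec![1u8, 2, 3];
        println!("{}", last_byte(&data)); // panics: index out of bounds
    }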
It'd be like putting a zip in a zip.
> Target: Python package managers using tokio-tar (e.g., uv). An attacker uploads a malicious package to PyPI. The package's outer TAR contains a legitimate pyproject.toml, but the hidden inner TAR contains a malicious one that hijacks the build backend. During package installation, the malicious config overwrites the legitimate one, leading to RCE on developer machines and CI systems.
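If I'm reading that right, the "hijacks the build backend" part goes through pyproject.toml's PEP 517 [build-system] table, so a hypothetical malicious inner copy would just point the backend at code shipped inside the sdist, something like:

    # pyproject.toml (the hidden inner copy; the module name is made up)
    [build-system]
    requires = []
    build-backend = "evil_backend"   # attacker's module, shipped in the sdist
    backend-path = ["."]             # PEP 517 in-tree backend resolution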
It seems to imply that you’re already installing a package uploaded by a malicious entity. Is the vulnerable workflow something like “you manually download the package archive, unpack it with system tar, audit all the files and then run uv install, which will see different files”?
Someone could release a malicious package that looks okay to a scanner tool, but when installed using uv behaves differently, allowing attackers to disguise executable code.
In addition, for OCI images, it is possible to produce an OCI image that can overwrite layers in the tar file, or modify the index. This could be done in a way that is undetectable by the processor of the OCI image. Similar attacks can be done for tools that download libraries, binaries, or source code using the vulnerable parser, making a tar file that when inspected looks fine but when processed by a vulnerable tool, behaves differently.
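(To make the "looks fine when inspected, behaves differently when processed" point concrete, here is a rough defensive sketch, not the actual patched parser or anyone's real tooling: it lists entries with the synchronous `tar` crate and recurses into any member that itself looks like a tar archive, so a hidden inner archive shows up during inspection.)

    // Rough sketch: recursively list tar entries, descending into nested .tar
    // members so "hidden" inner archives are visible. Assumes the `tar` crate;
    // error handling is minimal on purpose.
    use std::io::Read;

    fn list_entries(data: &[u8], depth: usize) -> std::io::Result<()> {
        let mut archive = tar::Archive::new(data);
        for entry in archive.entries()? {
            let mut entry = entry?;
            let path = entry.path()?.to_path_buf();
            println!("{:indent$}{} ({} bytes)", "", path.display(), entry.size(), indent = depth * 2);
            // If a member itself looks like a tar archive, inspect it as well.
            if path.extension().map_or(false, |ext| ext == "tar") {
                let mut inner = Vec::new();
                entry.read_to_end(&mut inner)?;
                list_entries(&inner, depth + 1)?;
            }
        }
        Ok(())
    }

    fn main() -> std::io::Result<()> {
        let data = std::fs::read("package.tar")?; // hypothetical input file
        list_entries(&data, 0)
    }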
I hope that answers your question?
> making a tar file that when inspected looks fine
Am I correct in understanding that manual inspection would reveal a nested .tar archive (so recursive inspection of nested archives should be enough)?
And where is the RCE part?