>The website claims it can escape "Kubernetes / container clusters" and "CI runners & build farms" but I don't see anything supporting the claim it can escape a container

they state that the write-up is forthcoming. presumably there are some additional steps or modifications that will be detailed in 'part 2'.

"Next: "From Pod to Host," how Copy Fail escapes every major cloud Kubernetes platform."

reply
This is correct. The container escape exploit and writeup are not yet released.
reply
Opus 4.7 it if you can't wait
reply
It overwrites bytes in memory of any file you can read. It's not hard to imagine how it could escape a lot of things.
reply
The 2017 claim is based on the vulnerability having been introduced in this commit in the second half of 2017: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

The details will depend on whether the kernel is a newer release or a maintenance version of an older release.

reply
> They also claim their script "roots every Linux distribution shipped since 2017.", but only tested four; and it doesn't work on Alpine

They've done themselves no favours at all with their write-up.

It does seem legitimate (I was able to use the PoC on a 24.04 instance), and seems like it should be a big deal, but the actual number of affected distributions seems way lower, and nowhere near their claim of every distribution since 2017.

For example, with Ubuntu, if I'm reading it right, there's some impact in 16.04 (EOL), but otherwise, at least as per their analysis, it's only the vendor-specific 6.17 kernels they ship that have it (e.g. linux-gcp, linux-oracle-6.17, etc.). That's a relatively new kernel version which they only started shipping recently, after it was released upstream last September.

reply
i mean, it doesn't work on any SELinux, but it's still quite severe anyhow
reply
Have you got any info about this? 'seinfo -c' shows there is an alg_socket class. I presume the create permission on it is required to be able to create an AF_ALG socket:

    $ sesearch -A -c alg_socket -p create
    allow bluetooth_t bluetooth_t:alg_socket { accept append bind connect create getattr getopt ioctl listen lock read setattr setopt shutdown write };
    allow container_device_plugin_init_t container_device_plugin_init_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_device_plugin_t container_device_plugin_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_device_t container_device_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_engine_t container_engine_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_init_t container_init_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_kvm_t container_kvm_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_logreader_t container_logreader_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_logwriter_t container_logwriter_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_t container_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow container_userns_t container_userns_t:alg_socket { accept append bind connect create getattr getopt ioctl lock map read setattr setopt shutdown write };
    allow openshift_app_t openshift_app_t:alg_socket { append bind connect create getattr getopt ioctl lock read setattr setopt shutdown write };
    allow openshift_t openshift_t:alg_socket { append bind connect create getattr getopt ioctl lock read setattr setopt shutdown write };
    allow spc_t unlabeled_t:alg_socket { append bind connect create getattr getopt ioctl lock read setattr setopt shutdown write };
    allow staff_t staff_t:alg_socket { append bind connect create getopt ioctl lock read setattr setopt shutdown write };
    allow sysadm_t sysadm_t:alg_socket { accept append bind connect create getopt ioctl listen lock read setattr setopt shutdown write };
    allow unconfined_domain_type domain:alg_socket { accept append bind connect create getattr getopt ioctl listen lock map name_bind read recv_msg recvfrom relabelfrom relabelto send_msg sendto setattr setopt shutdown write };
    allow user_t user_t:alg_socket { append bind connect create getopt ioctl lock read setattr setopt shutdown write };
... that's a lot of domains, including container_t and user_t; and obviously anything unconfined_t can't be expected to be restricted.
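
For anyone who wants to check their own policy, here is a rough sketch that pulls the source domains out of `sesearch -A` output like the listing above. It assumes the plain `allow source target:class { perms };` layout shown there; `domains_with_permission` and `example_t` are names made up for illustration, not part of any SELinux tool.

```python
import re

# Matches one allow rule as printed by `sesearch -A`, e.g.
# allow user_t user_t:alg_socket { bind create write };
RULE_RE = re.compile(
    r"allow\s+(?P<source>\S+)\s+(?P<target>\S+):(?P<tclass>\S+)\s+"
    r"(?:\{(?P<perms>[^}]*)\}|(?P<single>\S+))\s*;"
)


def domains_with_permission(sesearch_output, tclass="alg_socket", perm="create"):
    """Return the sorted source types that are granted `perm` on `tclass`."""
    domains = set()
    for match in RULE_RE.finditer(sesearch_output):
        if match.group("tclass") != tclass:
            continue
        perms = (match.group("perms") or match.group("single")).split()
        if perm in perms:
            domains.add(match.group("source"))
    return sorted(domains)


# Shortened sample; example_t is a hypothetical domain without `create`.
sample = """
allow container_t container_t:alg_socket { accept append bind connect create getattr };
allow user_t user_t:alg_socket { append bind connect create getopt };
allow example_t example_t:alg_socket { read write };
"""
print(domains_with_permission(sample))
```

In practice you would feed it the real output, for example via `subprocess.run(["sesearch", "-A", "-c", "alg_socket", "-p", "create"], capture_output=True, text=True).stdout`.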

(Maybe you & others are specifically thinking of Android's policy?)
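
A quicker way to answer the question for one specific context is to just try creating the socket from inside it. This is a minimal sketch, assuming a Python build that exposes `socket.AF_ALG` (Linux only); `probe_af_alg` is a made-up helper name. It only exercises socket creation, so it says nothing about the later bind/accept steps or about whether the kernel is patched.

```python
import socket


def probe_af_alg():
    """Try to create an AF_ALG socket; return 'allowed', 'denied' or 'unavailable'."""
    family = getattr(socket, "AF_ALG", None)  # missing on non-Linux builds
    if family is None:
        return "unavailable"
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except PermissionError:
        # EACCES/EPERM, e.g. an SELinux or seccomp policy refusing the call.
        return "denied"
    except OSError:
        # Typically the kernel was built without AF_ALG support.
        return "unavailable"
    sock.close()
    return "allowed"


print(probe_af_alg())
```

Run it inside the container or confined domain you care about, since the answer depends on the label of the calling process.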

reply
If you can get to real UID 0 from a rootless container, you can escape it, but you do need to take extra steps. Same with it working on Alpine: the underlying vulnerability probably still exists, but the script might need some adjusting. It's a PoC, not a full exploit for every situation.
reply
It's worth pointing out that you cannot, definitionally, get "real UID 0" in a "rootless" container, because then it wouldn't be a rootless container. This is relevant because this exploit doesn't claim to be able to bypass user namespaces, and getting "real UID 0" would require a different exploit.
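
To make the distinction concrete: the mapping between UIDs inside and outside a user namespace is visible in `/proc/self/uid_map`, where each line is `inside_start outside_start length`. The sketch below translates an inside UID to the outside one. `host_uid_for` is a made-up helper, and the sample mapping (user 1000 plus a subuid range starting at 100000) is only an assumed typical rootless setup. With nested namespaces the "outside" value is relative to the parent namespace, not necessarily the host.

```python
def host_uid_for(inside_uid, uid_map_text):
    """Translate a UID inside a user namespace using uid_map contents.

    Returns None when the UID is not mapped at all.
    """
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        inside_start, outside_start, length = map(int, fields)
        if inside_start <= inside_uid < inside_start + length:
            return outside_start + (inside_uid - inside_start)
    return None


# Assumed example of a rootless container mapping.
rootless_map = "0 1000 1\n1 100000 65536\n"
print(host_uid_for(0, rootless_map))

try:
    with open("/proc/self/uid_map") as handle:
        print(host_uid_for(0, handle.read()))
except OSError:
    pass  # not on Linux, or /proc is not mounted
```

If UID 0 translates to 0, root in that context is unmapped root; anything else means "root" is an unprivileged user outside the namespace.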
reply
The underlying exploit allows writing arbitrary values to the page cache, independent of any namespacing, so it should be assumed to allow container escapes even if the given PoC code doesn't do that.
reply
That's fair (although it doesn't have anything to do with getting "real root" in a userns in that case). I guess one approach would be something like modifying the host's logrotate binary and waiting for it to trigger. That would escape the container to root on the host directly. I imagine it wouldn't be a sure thing to pull off, either, but definitely straightforward enough that any APT should be asking Claude to develop it.
reply
Kubernetes 1.33 enables user namespace support by default, which I imagine is the same underlying mechanism that rootless Podman uses. Setting `hostUsers: false` on a pod is the way to ensure that root in the pod is not root on the host. It's trivial for a real (unmapped) root to escape a Kubernetes pod.
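
For reference, as I understand the Kubernetes docs, `hostUsers` is a field on the pod spec, and setting it to `false` asks for the pod to run in its own user namespace so that UID 0 in the pod is not UID 0 on the node. A minimal sketch that builds such a manifest as JSON (which `kubectl apply -f` accepts); the pod name, image and `pod_manifest` helper are placeholders chosen for illustration:

```python
import json


def pod_manifest(name, image, host_users=False):
    """Build a minimal Pod manifest; hostUsers=False requests a user namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostUsers": host_users,
            "containers": [
                {"name": name, "image": image, "command": ["sleep", "infinity"]}
            ],
        },
    }


# Placeholder name and image.
manifest = pod_manifest("userns-demo", "debian:stable")
print(json.dumps(manifest, indent=2))
```

Whether the request is honoured still depends on the node's runtime and kernel supporting user namespaces for pods.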
reply
Their PoC does as you say, but it is built upon arbitrary modification of the page cache, which could be abused for other things.
reply
Ah indeed, it can be used to overwrite the page cache for files on read-only volumes.
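
One rough way to look for a cached copy that no longer matches the disk is to hash the file, ask the kernel to drop its cached pages, and hash it again. This is only a sketch under a big assumption that nothing in this thread establishes: that the tampered pages are not marked dirty and so get re-read from disk after being dropped. The drop is advisory, and pages that are currently mapped by a running process may stay cached. `cached_copy_differs` is a made-up helper name.

```python
import hashlib
import os
import sys


def sha256_of(path):
    """Hash a file through the normal (page cache backed) read path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_copy_differs(path):
    """Return True if the file hashes differently after a cache drop request."""
    before = sha256_of(path)
    if hasattr(os, "posix_fadvise"):
        fd = os.open(path, os.O_RDONLY)
        try:
            # Length 0 means "to end of file"; the kernel may ignore the hint.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        finally:
            os.close(fd)
    after = sha256_of(path)
    return before != after


print(cached_copy_differs(sys.executable))
```

A `False` result proves nothing on its own; comparing against the distribution's package checksums from a known-clean boot is the more reliable check.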
reply
Did you try it on systems that don't have the patch already? Seems many distributions already shipped kernels with the patch ~a month ago.
reply
Yes. Alpine in rootless Podman doesn't work (after replacing "/usr/bin/su" with "/bin/su" in the .py, running the .py just doesn't do anything) while it does in Debian in rootless Podman on the same host.
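
Not an explanation of why it fails on Alpine, just a diagnostic that might help compare the two containers: where `su` actually lives, whether the path is a symlink to something else, and whether the resolved file is setuid. `describe_binary` is a made-up helper name.

```python
import os
import shutil
import stat
import sys


def describe_binary(name_or_path):
    """Resolve a command and report basic facts about the file behind it."""
    path = shutil.which(name_or_path) or name_or_path
    real = os.path.realpath(path)
    info = os.stat(real)
    return {
        "path": path,
        "real_path": real,
        "is_symlink": os.path.islink(path),
        "setuid": bool(info.st_mode & stat.S_ISUID),
        "owner_uid": info.st_uid,
        "size": info.st_size,
    }


# The interpreter is included only so the demo prints something everywhere.
for candidate in ("su", sys.executable):
    try:
        print(describe_binary(candidate))
    except OSError as exc:
        print(f"{candidate}: {exc}")
```

If `path` and `real_path` differ between the Debian and Alpine containers, that is at least one concrete difference to look at before blaming the kernel.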
reply
It also doesn't work on Raspberry Pi, though presumably it could easily be made to; it does replace the su binary, but the replacement is not executable.
reply
It's patching the binary in memory, so the binary patch would be architecture dependent. The existing one is only x86_64, but with an updated payload, it would work on arm.
reply
this is because the `su` binary is replaced with x86 shellcode; replace it with aarch64 shellcode and it will work just the same.
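
If you want to confirm which architecture a given binary was built for, the ELF header says so directly: `e_machine` is the 16-bit field at offset 18. A small sketch, with `elf_machine` as a made-up helper name and only a handful of machine codes listed:

```python
import struct
import sys

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64", 243: "riscv"}


def elf_machine(header):
    """Return the architecture name encoded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[EI_DATA] is 1 for little-endian, 2 for big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return EM_NAMES.get(machine, f"unknown ({machine})")


try:
    with open(sys.executable, "rb") as handle:
        print(elf_machine(handle.read(20)))
except (OSError, ValueError) as exc:
    print(f"could not inspect {sys.executable}: {exc}")
```

On a Raspberry Pi this would report `arm` or `aarch64` depending on the OS image, which is consistent with an x86_64 payload producing a file that will not execute there.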
reply
there is a PoC floating around for Alpine.
reply