by jasonjayr 2 hours ago

by weinzierl 1 hour ago

WASM has strong tried and proven sandboxing. We basically can build on nearly 30 years of experience. The decoders don't need a lot of access, they can basically be pure functions.

If this will pan out security-wise I don't know. I'm more worried that it will be so slow that no one will use it. Interesting idea, though, and I can see applications outside of the "big data" realm this apparently targets.

by ok123456 1 hour ago

How do you prevent compression bomb attacks when files can define their own compression functions?

You could have some kind of OOM killer, but that will be a "footgun" that people who are actually doing "big data" will constantly shoot.

This pretty much kills any ingestion pipeline where the source is untrusted.

by computomatic 1 hour ago

It seems like the WASM is simply a fallback if no other decoder is available. If the data source is untrusted, simply don’t run the WASM decoders.

“Some code is untrusted” does not mean code should never be executed. There are more use cases with trusted sources than untrusted.
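
The fallback rule described in this comment can be sketched roughly as follows. This is only an illustration: `NATIVE_DECODERS`, `pick_decoder`, and the `trust_source` flag are hypothetical names, not part of the format or any real reader.

```python
class NoDecoderAvailable(Exception):
    """Raised when no acceptable decoder exists for an encoding."""


# Hypothetical registry of decoders that ship with the reader itself.
NATIVE_DECODERS = {
    "plain": lambda raw: raw,
}


def pick_decoder(encoding, embedded_wasm, trust_source):
    """Prefer a built-in decoder; use embedded WASM only for trusted sources."""
    if encoding in NATIVE_DECODERS:
        return ("native", NATIVE_DECODERS[encoding])
    if embedded_wasm is not None and trust_source:
        # Assumption: the caller would hand these bytes to a sandboxed runtime.
        return ("wasm", embedded_wasm)
    raise NoDecoderAvailable(f"no trusted decoder for encoding {encoding!r}")
```

Under this policy, a file from an untrusted source that invents its own encoding name and ships a decoder for it is rejected rather than executed.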

by ok123456 1 hour ago

So I define the data type to be "asdklfjaslkdfjiolsadfjoiusadfoiasfoikasjfdoisadf" and give you a decoder for it.

by johncolanduoni 1 hour ago

OOM killing in WebAssembly is trivial, since it’s all in a growable linear memory. All the runtimes I’m aware of have a simple maximum memory setting, and they’ll trap any allocation requests after that point.
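
A toy model of that memory cap, assuming nothing about any particular runtime: `LinearMemory` and `decode_bomb` are made-up names, and `grow` returns -1 on refusal the way the `memory.grow` instruction does, leaving it to the caller to abort.

```python
PAGE_SIZE = 64 * 1024  # WebAssembly page size in bytes


class MemoryLimitExceeded(Exception):
    """Raised by the toy decoder when the memory cap refuses further growth."""


class LinearMemory:
    """Toy growable linear memory with a hard page limit (not a real runtime)."""

    def __init__(self, initial_pages, max_pages):
        if initial_pages > max_pages:
            raise ValueError("initial size exceeds the configured maximum")
        self.max_pages = max_pages
        self.data = bytearray(initial_pages * PAGE_SIZE)

    @property
    def pages(self):
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages):
        """Return the previous page count, or -1 if the cap would be exceeded."""
        old_pages = self.pages
        if old_pages + delta_pages > self.max_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return old_pages


def decode_bomb(memory):
    """Hypothetical hostile decoder that keeps asking for more memory."""
    while True:
        if memory.grow(1) == -1:
            raise MemoryLimitExceeded("decoder hit the configured memory cap")
```

The hostile decoder stops as soon as the cap is reached, and the host never allocates more than `max_pages` pages for it.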

by blmarket 1 hour ago

The attack is not just on the file format itself. Based on the function signature, a single decoder can generate an infinite byte stream, which causes a lot of headaches for reader implementations: implementing STRLEN is no longer a trivial question.

Either engines should enforce some limit (e.g. VARCHAR(2000) to cap the length at 2000, although some other engines support unlimited BLOBs), or the decoder should give a hint about the maximum length it will yield. Unfortunately, the current research-level project does not have such considerations implemented yet...
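
A minimal sketch of the engine-side limit, assuming the decoder is exposed to the reader as an iterator of byte chunks. `read_bounded` and `endless_decoder` are hypothetical names used only for illustration.

```python
import itertools


class OutputLimitExceeded(Exception):
    """Raised when a decoder yields more bytes than the reader allows."""


def read_bounded(chunks, max_bytes):
    """Collect decoder output, failing as soon as it would exceed max_bytes."""
    out = bytearray()
    for chunk in chunks:
        if len(out) + len(chunk) > max_bytes:
            raise OutputLimitExceeded(f"decoder exceeded {max_bytes} bytes")
        out.extend(chunk)
    return bytes(out)


def endless_decoder():
    """Hypothetical hostile decoder that never stops producing output."""
    return itertools.repeat(b"A" * 1024)
```

With a cap such as `read_bounded(endless_decoder(), 2000)`, the reader fails fast instead of buffering forever, which is roughly what a VARCHAR(2000) column would enforce.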

by ok123456 1 hour ago

For images, it makes sense: people dealing with 16k x 16k PNGs are uncommon. Give them an error message that tells them the setting to bump. But what should be the threshold for "big data"? I'm sure it will follow Zipf's Law, but the tail will be fatter.

by titzer 1 hour ago

And many of them have built-in gas metering, so you can time out the decode if it runs too many instructions.
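
The idea behind gas metering can be shown with a toy interpreter that charges one unit of fuel per instruction. This is a sketch of the general technique, not how any specific Wasm engine implements it; the instruction set and the `run` function are invented for the example.

```python
class OutOfFuel(Exception):
    """Raised when a program exhausts its instruction budget."""


def run(program, fuel):
    """Execute a tiny stack program, charging one unit of fuel per instruction."""
    stack = []
    pc = 0
    while pc < len(program):
        if fuel == 0:
            raise OutOfFuel("instruction budget exhausted")
        fuel -= 1
        op, *args = program[pc]
        if op == "push":
            stack.append(args[0])
            pc += 1
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == "jmp":
            pc = args[0]  # unconditional jump to an instruction index
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack
```

A well-behaved program finishes within its budget, while an infinite loop such as `[("jmp", 0)]` is cut off once the fuel runs out.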

by kibwen 1 hour ago

Denial-of-service is bad, but it's not in the same ballpark, the same sport, the same planet, or the same universe of bad as RCE.

by Retr0id 1 hour ago

WASM implementations are fairly mature now, but if there was e.g. an image file format with embedded WASM that needed to execute before you could view it, it would become the new low-hanging-fruit target for 0-click RCEs - whether it's exploiting the WASM engine itself or some other attack surface that's influenceable via it (See also, the FORCEDENTRY JBIG2 exploit).

by titzer 1 hour ago

That exploit targeted an integer overflow in a bespoke Apple sandboxing mechanism. Bespoke sandboxing mechanisms have weird bugs.

Not that Wasm engines don't have bugs, but the whole point is to have an extremely solid, well-specified and efficient implementation of a widely accepted bytecode format. We can scope down the capabilities given to any program to a minimal set.

by Retr0id 1 hour ago

Bugs are near-inevitable, and mitigations are the last line of defence. Scripting engines are excellent for bypassing mitigations (iiuc in the case of the FORCEDENTRY exploit, it was used for adjusting ASLR'd offsets).

As a random example from an area of personal interest to me, I know of three distinct methods of achieving userland ROP execution on the Nintendo Switch 2, and all three rely on the (ab)use of a scripting engine (even if they aren't a vulnerability in the scripting engine itself).

by titzer 1 hour ago

Well don't accept code from anyone ever then.

But seriously, if your format requires extensibility to the point that it embeds a bytecode, especially a Turing-complete bytecode, what format are you going to choose? Just design a new one? That's how you end up with a scripting engine with three ROP exploits.

by Kiboneu 1 hour ago

> WASM has strong tried and proven sandboxing. We basically can build on nearly 30 years of experience. The decoders don't need a lot of access, they can basically be pure functions.

I've heard that kind of sentiment many times before. It's not a good (thought-terminating) mindset to have for any secure software.

There are several WASM implementations, WASM is just a format. "Pure functions" are pure at a superficial level. Many people say that they don't mutate global state, but they do ... it's just hidden. The decoders "not needing a lot of access" doesn't matter if the WASM engine is pwned through arbitrary code execution inside the environment, or if it's contorted to bypass the access control you are mentioning through various side-effects.

by bilekas 1 hour ago

> The decoders don't need a lot of access, they can basically be pure functions

They don't currently either, do they? It's the tight coupling of the interface layer, no? I'm not sure this would be faster or more secure, so reliability might be the best use case?

by arcfour 2 hours ago

Yes...my first thought. No way in hell anyone actually trusts this.

(And as if we didn't trust the compiler enough already!)

by Omega359 1 hour ago

Meh, it's not that bad. Pretty simple to block inline wasm and to use well known external decoders.

by nine_k 2 hours ago

Does WASM have built-in I/O? If not, all that a decoder would be able to do is to decode into a buffer.

by 0x457 1 hour ago

All WASM can do is transfer a bag of bytes between the module runtime and the host. So yes, it can just decode into a buffer. Even if you use wasm components to give it I/O, you can still make these go to a buffer.

by doctorpangloss 2 hours ago

But the WASM runs in the sandbox! It only has access to some files, your display, inputs, ... nothing insecure at all!

by gavinray 1 hour ago

WASM runs in a confined memory space allocated for the program. There is no I/O or host address space access.

You need to run a WASI environment for that.
