It seems like the WASM is simply a fallback if no other decoder is available. If the data source is untrusted, simply don’t run the WASM decoders.

“Some code is untrusted” does not mean code should never be executed. There are more use cases with trusted sources than untrusted.

reply
So I define the data type to be "asdklfjaslkdfjiolsadfjoiusadfoiasfoikasjfdoisadf" and give you a decoder for it.
reply
OOM killing in WebAssembly is trivial, since it’s all in a growable linear memory. All the runtimes I’m aware of have a simple maximum memory setting, and they’ll trap any allocation requests after that point.
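To make that concrete, here is a rough sketch of what a memory cap looks like. It is a toy Python model, not any real runtime's API: `LinearMemory`, `guest_alloc`, and `OutOfMemoryTrap` are names I made up. The only parts taken from WebAssembly itself are the 64 KiB page size and the fact that `memory.grow` returns the old size in pages, or -1 when the request cannot be satisfied.

```python
PAGE_SIZE = 64 * 1024  # WebAssembly page size in bytes


class OutOfMemoryTrap(Exception):
    """Hypothetical stand-in for the trap a guest hits when allocation fails."""


class LinearMemory:
    """Toy model of a Wasm linear memory with a hard page cap."""

    def __init__(self, initial_pages: int, max_pages: int) -> None:
        if initial_pages > max_pages:
            raise ValueError("initial size exceeds maximum")
        self.max_pages = max_pages
        self.data = bytearray(initial_pages * PAGE_SIZE)

    @property
    def pages(self) -> int:
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages: int) -> int:
        """Mirror memory.grow: return the old size in pages, or -1 on failure."""
        old = self.pages
        if old + delta_pages > self.max_pages:
            return -1  # cap reached, memory is left unchanged
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return old


def guest_alloc(mem: LinearMemory, n_pages: int) -> int:
    """Assumed guest allocator behaviour: abort (trap) when grow fails."""
    old = mem.grow(n_pages)
    if old == -1:
        raise OutOfMemoryTrap(f"cannot grow past {mem.max_pages} pages")
    return old * PAGE_SIZE  # byte offset of the newly added region
```

With `LinearMemory(1, 4)`, growing by 2 pages succeeds and a further 2 fails, so a decoder that keeps allocating gets stopped at the cap instead of eating the host's memory.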
reply
The attack surface is not just the file format itself. Based on the function signature, a single decoder can generate an infinite byte stream. That creates a lot of headaches for the reader implementation: even implementing STRLEN is no longer a trivial question.

Either engines should impose some limit (e.g. VARCHAR(2000) caps the length at 2000, though some engines support unlimited BLOBs), or the decoder should give a hint about the maximum length it will yield. Unfortunately, the current research-level project does not implement such considerations yet...
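A minimal sketch of the engine-side limit, assuming the decoder's output reaches the reader as an iterator of byte chunks (that shape, and the names `read_bounded`, `endless_decoder`, and `OutputLimitExceeded`, are my own for illustration, not the project's actual interface):

```python
from typing import Iterable, Iterator


class OutputLimitExceeded(Exception):
    """Raised when a decoder yields more bytes than the engine allows."""


def read_bounded(chunks: Iterable[bytes], max_bytes: int) -> bytes:
    """Collect decoder output, refusing to buffer more than max_bytes."""
    out = bytearray()
    for chunk in chunks:
        if len(out) + len(chunk) > max_bytes:
            raise OutputLimitExceeded(
                f"decoder produced more than {max_bytes} bytes"
            )
        out += chunk
    return bytes(out)


def endless_decoder() -> Iterator[bytes]:
    """Hypothetical hostile decoder that never stops yielding."""
    while True:
        yield b"A" * 512
```

`read_bounded(endless_decoder(), 2000)` raises once the fourth 512-byte chunk would push the total past 2000, so STRLEN over a capped column stays a bounded operation even with a hostile decoder.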

reply
For images, it makes sense: people dealing with 16k x 16k PNGs are uncommon. Give them an error message that tells them the setting to bump. But what should be the threshold for "big data"? I'm sure it will follow Zipf's Law, but the tail will be fatter.
reply
And many of them have built-in gas metering, so you can time out the decode if it runs too many instructions.
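For anyone who hasn't seen gas metering, the idea fits in a few lines. This is a toy stack machine in Python, not a real Wasm runtime: the instruction set, `run_metered`, and `OutOfFuel` are invented for the sketch, and charging exactly one unit per instruction is an assumption (real runtimes pick their own cost model).

```python
from typing import List, Optional, Tuple

Instr = Tuple[str, Optional[int]]


class OutOfFuel(Exception):
    """Raised when the program uses up its instruction budget."""


def run_metered(program: List[Instr], fuel: int) -> Tuple[List[int], int]:
    """Run a toy program, charging one unit of fuel per instruction.

    Returns the final stack and the fuel left over.
    """
    stack: List[int] = []
    pc = 0
    while pc < len(program):
        if fuel <= 0:
            raise OutOfFuel("instruction budget exhausted")
        fuel -= 1
        op, arg = program[pc]
        if op == "push":
            stack.append(arg)
            pc += 1
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == "jump":
            pc = arg  # unconditional jump to an instruction index
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack, fuel
```

A well-behaved program like `[("push", 2), ("push", 3), ("add", None)]` finishes with fuel to spare, while `[("jump", 0)]` loops until the budget runs out and raises `OutOfFuel`, which is the "time out the decode" behaviour.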
reply
Denial-of-service is bad, but it's not in the same ballpark, the same sport, the same planet, or the same universe of bad as RCE.
reply