undefined

points

by stavros3 hours ago |

comments

by mswphd2 hours ago|

[-]

`unsafe` isn't viral. I can write

fn safe_function(...) -> (...) {

    // do unsafe things here

}

then `safe_function` can be called from safe code, and still trigger UB. This wouldn't be a soundness issue in the rust compiler, but instead a bug in safe_function.

There are many reasons you might want to do that. In particular, it's very common in rust to have a library define some data structure that uses unsafe under-the-hood, but checks whatever invariants it needs to, and provides solely safe methods to external callers. Rust's `String` type is like this: it's (roughly) a `Vec<u8>`, e.g. heap-allocated bytes. It has the additional invariant that these bytes correspond to valid UTF8 though. See for example `push_str_slice`, which (roughly) concatenates 2 strings.

https://doc.rust-lang.org/src/alloc/string.rs.html#1107

It does the following thing

1. reserve enough space for the concatenated string within the source string 2. does some pointer arithmetic and a call to Rust's equivalent to `memcpy` (unsafe) 3. re-casts this pointer to a string object without checking that it's valid utf8 (unsafe).

While these individual calls are unsafe, `push_str_slice` checks that in this particular situation they are safe, so the stdlib authors do not mark `push_str_slice` as unsafe. It has no invariants that must be maintained by external callers.

by ndiddy3 hours ago|

prev|

[-]

If code in an unsafe block triggers undefined behavior, then the assumptions the compiler makes regarding safety will no longer be true, and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe. This is what's happening in the example the person on Github wrote in the issue.

by weinzierl2 hours ago|

parent|

[-]

Exactly and "[...]and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe" hits the nail on the head.

I take issue with the phrasing of OP's title: "allows for UB in safe rust". AFAIK there are compiler bugs that allow UB in safe Rust, but this is not what is happening here. We have UB in an unsafe block (which is to be expected) which enables an issue outside in safe code. What is your opinion? Is calling this "UB in safe Rust" justified?

by mswphd2 hours ago|

parent|

[-]

it is, but it's a little confusing here because the library/consumer of the library are the same person.

This is a bug in the library, namely in Bun's PathString implementation. The bug is a soundness issue, precisely because usage of Bun's PathString implementation allows for UB in safe rust. Now this buggy library isn't that big of a concern for the community, because Bun is the only consumer. It's not also an indication of a compiler bug, because Bun's library is implemented using unsafe rust. But the fundamental issue is that usage of Bun's PathString implementation allows for UB in safe rust, and is therefore (clearly) unsound.

by fc417fc80258 minutes ago|

parent|

prev|

[-]

Suppose I initialize something in an unsafe block. I promise the compiler that it's properly initialized, but in reality it isn't. Importantly I never make use of the garbage values in the unsafe block so no UB has occurred - yet.

Later, the garage enters otherwise safe machinery and triggers UB. UB has now happened in safe rust as a result of my earlier contractual violation.

You can extend this example to other scenarios where UB in unsafe begets further UB in safe later on.

by rcxdude3 hours ago|

prev|

[-]

Unsafe blocks are you saying to the compiler 'trust me bro, I know this is safe'. But often that relies on some property of the code being true in order for it to actually be safe. Generally speaking, the expectation in rust is that you either encapsulate the code that enforces whatever property you are relying on behind a safe interface, so that it's not possible for other code to use it unsafely, or that you mark the interface itself unsafe so that it's obvious that the code using that interface needs to maintain that property itself. Rust code that doesn't do this will generally be considered buggy by most rust programmers (e.g. if you find a use of safe interfaces in the stdlib that causes a memory safety violation, then you should file a ticket with the rust team), but this is essentially only a social convention of where the blame lies for a bug, not something that compiler itself can enforce (and, for example, you can violate memory safety in rust with only safe std interface by abusing OS interfaces like /proc/self/mem but this is something that most people don't think can be reasonably fixed). The main reason that rust as a language is better in this regard is that it gives much better tools for being able to express that safe interface without giving up performance and that it has the means to mark and encapsulate this safe/unsafe distinction.

Here's some links on this topic which have some examples:

https://doc.rust-lang.org/nomicon/working-with-unsafe.html https://www.ralfj.de/blog/2016/01/09/the-scope-of-unsafe.htm...