Maybe its better to think about this in the reverse, where C and C++ has 'defined behavior', but unsafe rust intentionally does not, its just whatever the complier and platform lets you get away with. Ultimately its still just a computer which stores values in memory and jumps to subroutines.
Undefined behavior is everything else. C and C++ are relatively unique in that their standards explicitly say "combining these constructs in this way is undefined", and we call those cases explicit UB. There's also a larger universe of implicit UB that standards omit. Most (all?) languages have implicit UB, even if they lack the explicit stuff. What happens when you get ENOMEM is a common one.
Rust does something similar to C/C++ and lists a bunch of UB that's only possible with incorrect code in unsafe blocks. Correct code placed in an unsafe block remains defined, as does code without unsafe (up to compiler/language bugs).
The goal of a library is to provide the encapsulation such that the unsafety doesn't spread.
If undefined behavior occurs, the fault lies with whoever wrote `unsafe { ... }` in the body of a function. If I write "unsafe" in order to call an unsafe library function, and I don't meet the library function's pre-requisites, then it's my fault. If the library internally writes "unsafe" in order while providing a safe wrapper, and I never actually wrote `unsafe { ... }`. If neither I nor the library wrote `unsafe { ... }`, then it is the fault of the compiler.
Using "in safe Rust" means that `unsafe` doesn't occur either in the user code nor in the library. In this context, since we've heard how many uses of `unsafe { ... }` exist in the Bun rewrite, I'd read "in safe Rust" to mean "without calling any functions marked as unsafe".
In this context "UB" means something different than how you're using it. The UB being mentioned here is the "nasal demons" form, i.e., programs which contain undefined behavior have no defined meaning according to the language semantics.
What you're talking about is probably better described in this context as "unspecified behavior", which is behavior that the language standard does not mandate but does not render programs meaningless. For example, IIRC in C++ the order in which g(), h(), and i() are evaluated in f(g(), h(), and i()) is unspecified - an implementation can pick any order, and the order doesn't have to be consistent, but no matter the order the program is valid (approximately speaking).
So this "unspecified behavior" might turn into the more nasal demon type when g(), h() and i() share mutable state and assume some particular sequential order of execution. No?
But sure, if you're writing C++ and (for example) g is depended on for initialization of pointed memory that the other two consume you could end up with UB. But if you're writing Java then no, you will not end up with UB just buggy code.
Unsafe Rust allows you to tell the compiler “hold my beer”. It’s a concession to the reality that the normal restrictions of Rust disallow some semantically valid programs that you might otherwise want to write. The safeguards work great in most cases, but in some they’re overly restrictive.
In practice, the overwhelming majority of code is able to be written in safe Rust and the compiler can have your back. The majority of the rest is for performance reasons, interacting with external functions like C libraries over FFI, or expressing semantics that safe Rust struggles with (e.g., circular references).
fn safe_function(...) -> (...) {
// do unsafe things here
}then `safe_function` can be called from safe code, and still trigger UB. This wouldn't be a soundness issue in the rust compiler, but instead a bug in safe_function.
There are many reasons you might want to do that. In particular, it's very common in rust to have a library define some data structure that uses unsafe under-the-hood, but checks whatever invariants it needs to, and provides solely safe methods to external callers. Rust's `String` type is like this: it's (roughly) a `Vec<u8>`, e.g. heap-allocated bytes. It has the additional invariant that these bytes correspond to valid UTF8 though. See for example `push_str_slice`, which (roughly) concatenates 2 strings.
https://doc.rust-lang.org/src/alloc/string.rs.html#1107
It does the following thing
1. reserve enough space for the concatenated string within the source string 2. does some pointer arithmetic and a call to Rust's equivalent to `memcpy` (unsafe) 3. re-casts this pointer to a string object without checking that it's valid utf8 (unsafe).
While these individual calls are unsafe, `push_str_slice` checks that in this particular situation they are safe, so the stdlib authors do not mark `push_str_slice` as unsafe. It has no invariants that must be maintained by external callers.
I take issue with the phrasing of OP's title: "allows for UB in safe rust". AFAIK there are compiler bugs that allow UB in safe Rust, but this is not what is happening here. We have UB in an unsafe block (which is to be expected) which enables an issue outside in safe code. What is your opinion? Is calling this "UB in safe Rust" justified?
This is a bug in the library, namely in Bun's PathString implementation. The bug is a soundness issue, precisely because usage of Bun's PathString implementation allows for UB in safe rust. Now this buggy library isn't that big of a concern for the community, because Bun is the only consumer. It's not also an indication of a compiler bug, because Bun's library is implemented using unsafe rust. But the fundamental issue is that usage of Bun's PathString implementation allows for UB in safe rust, and is therefore (clearly) unsound.
Later, the garage enters otherwise safe machinery and triggers UB. UB has now happened in safe rust as a result of my earlier contractual violation.
You can extend this example to other scenarios where UB in unsafe begets further UB in safe later on.
Here's some links on this topic which have some examples:
https://doc.rust-lang.org/nomicon/working-with-unsafe.html https://www.ralfj.de/blog/2016/01/09/the-scope-of-unsafe.htm...
And it's not like Bun when written in Zig has been a beacon of stability either. It has been segfaults all over the place.