I'm not sure Rust's `async fn` desugaring (which involves a data structure for the state machine) is inlineable. (To be precise: maybe the desugared function can be inlined, but LLVM isn't allowed to change the data structure, so there may be extra setup costs, duplicate `Waker`s, etc.) It's probably true that there is a performance cost. But I agree with the article's point that it's generally insignificant.
For non-async fns, the article already made this point:
> In release mode, with optimizations enabled, the compiler will often inline small extracted functions automatically. The two versions — inline and extracted — can produce identical assembly.