I have seen a pure C/C++ implementation of coroutines (it used setjmp/longjmp for the control transfer and memcpy to copy stacks in and out of the native arena). Not the most portable of constructions, but it worked absurdly well.
Being able to write "async" code essentially in-line is a superpower.
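
For anyone who wants to play with the idea, here is a minimal sketch of a stackful coroutine in C. To be clear, this is not the setjmp/longjmp plus memcpy trick I described, since that relies on behavior the C standard leaves undefined and I would not trust myself to reproduce it correctly from memory. Instead it swaps in the ucontext calls (getcontext, makecontext, swapcontext), which were removed from POSIX.1-2008 but are still shipped by glibc, and it gives each coroutine its own fixed stack rather than copying stacks around. The names coro, coro_init, coro_resume, coro_yield and counter_body are made up for the example.

```c
#include <assert.h>
#include <ucontext.h>

enum { STACK_SIZE = 64 * 1024 };   /* assumed to be plenty for this demo */

typedef struct {
    ucontext_t caller;   /* context to return to on yield or finish */
    ucontext_t self;     /* the coroutine's own saved context */
    int value;           /* last value passed to coro_yield */
    int done;            /* set by the body just before it returns */
    char stack[STACK_SIZE];
} coro;

/* The coroutine currently running. A global keeps the sketch short;
 * it also makes the sketch single-threaded and non-reentrant. */
static coro *current;

/* Suspend the running coroutine and hand v back to whoever resumed it. */
static void coro_yield(int v) {
    coro *c = current;
    c->value = v;
    swapcontext(&c->self, &c->caller);
}

/* Example body: plain in-line code that "pauses" three times. */
static void counter_body(void) {
    for (int i = 1; i <= 3; i++)
        coro_yield(i * 10);
    current->done = 1;
}

/* Prepare c so that the first resume starts executing body on c->stack. */
static void coro_init(coro *c, void (*body)(void)) {
    int rc = getcontext(&c->self);
    assert(rc == 0);
    c->self.uc_stack.ss_sp = c->stack;
    c->self.uc_stack.ss_size = sizeof c->stack;
    c->self.uc_link = &c->caller;   /* where to go when body returns */
    c->value = 0;
    c->done = 0;
    makecontext(&c->self, body, 0);
}

/* Run c until its next yield. Returns 1 if it yielded a value
 * (read it from c->value), 0 if it has finished. */
static int coro_resume(coro *c) {
    if (c->done)
        return 0;
    current = c;
    swapcontext(&c->caller, &c->self);
    return !c->done;
}
```

After coro_init(&c, counter_body), successive calls to coro_resume(&c) leave 10, 20 and 30 in c.value, and the fourth call returns 0. The point is that counter_body is an ordinary loop: the suspension points live inside it, which is exactly the "async code in-line" feel.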