related paragraph from Wikipedia:
Modern backpropagation was first published by Seppo Linnainmaa as "reverse mode of automatic differentiation" (1970)[26] for discrete connected networks of nested differentiable functions.[27][28][29]
In 1982, Paul Werbos applied backpropagation to MLPs in the way that has become standard.
Both papers are direct applications of the chain rule applied to estimate the gradient of a multivariate function.
Name a single aspect of something modern like the Transformer architecture or how it is trained, that is even indirectly attributable to Schmidhuber.
No doubt he'd be jumping up and down wanting to take credit for residual connections, but where was Schmidhuber in the ImageNet era when everyone else was discovering how to build deep neural nets? Why didn't Schmidhuber invent ResNets, but instead waited until someone else (Kaiming He) did, then claim credit for it?
I'll bet Schmidhuber isn't done with yet ... when someone eventually comes up with an architecture for AGI, Schmidhuber will come out of the woodwork and point to a note he made on a napkin in 1800 that predicted it all.