RL post-training alters the parameters of the transformer itself, whereas your f(manifold) idea seems to suggest that a new layer on top would suffice, with no need to alter the transformer at all.
It would be extremely handy if that were so, but I'm guessing it isn't, or it would be the prevailing approach.
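
To make the distinction concrete, here is a toy sketch (not a real transformer, and not anyone's actual method): a single "backbone" weight `w` followed by a single "head" weight `v`. All names, data, and hyperparameters here are made up for illustration. Training with `update_backbone=False` corresponds to the "new layer on top" idea, and `update_backbone=True` corresponds to what post-training does, i.e. changing the model's own parameters.

```python
import math

def forward(w, v, x):
    # Toy stand-in for "transformer + head": backbone weight w, head weight v.
    h = math.tanh(w * x)   # backbone representation
    return v * h, h        # head output, plus the hidden value for gradients

def grads(w, v, x, target):
    # Gradients of the squared error (y - target)^2 with respect to w and v.
    y, h = forward(w, v, x)
    dy = 2.0 * (y - target)
    dv = dy * h
    dw = dy * v * (1.0 - h * h) * x
    return dw, dv

def train(w, v, data, lr=0.1, steps=200, update_backbone=False):
    # Plain per-sample gradient descent; the backbone is touched only if asked.
    for _ in range(steps):
        for x, target in data:
            dw, dv = grads(w, v, x, target)
            v -= lr * dv
            if update_backbone:
                w -= lr * dw
    return w, v

def loss(w, v, data):
    return sum((forward(w, v, x)[0] - t) ** 2 for x, t in data)

# Hypothetical toy data and starting weights.
data = [(-1.0, -0.9), (-0.5, -0.6), (0.5, 0.6), (1.0, 0.9)]
w0, v0 = 0.3, 0.1

w_head, v_head = train(w0, v0, data, update_backbone=False)  # head only
w_full, v_full = train(w0, v0, data, update_backbone=True)   # everything
```

In the head-only run `w_head` is still exactly `w0`, so the head can only rescale whatever representation the backbone already produces. This toy says nothing about whether that is enough for a real model; it only shows which parameters each approach is allowed to move.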
It's also worth noting that a different manifold "exists" after each transformation (e.g. after each layer). You only sample from the same manifold when you apply the same transformation(s) to the same inputs.
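
A rough illustration of that last point, with two made-up layers (`layer1` and `layer2` are hypothetical, with arbitrary weights): points sampled from the unit circle land on a different curve after each layer, and you only get the same points back when you apply the same layer again.

```python
import math

def layer1(p):
    # Hypothetical first transformation: a linear map followed by tanh.
    x, y = p
    return (math.tanh(2.0 * x + 0.5 * y), math.tanh(-0.5 * x + 1.0 * y))

def layer2(p):
    # Hypothetical second transformation with different weights.
    x, y = p
    return (math.tanh(1.0 * x - 1.5 * y), math.tanh(0.3 * x + 0.8 * y))

def radius(p):
    return math.hypot(p[0], p[1])

# Inputs sampled from a 1-D manifold: the unit circle in 2-D.
inputs = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
          for k in range(8)]

after_layer1 = [layer1(p) for p in inputs]
after_layer2 = [layer2(p) for p in after_layer1]

# The inputs all sit at radius 1; their images after each layer do not,
# so each layer's outputs trace out a different curve.
input_radii = [radius(p) for p in inputs]
layer1_radii = [radius(p) for p in after_layer1]
layer2_radii = [radius(p) for p in after_layer2]

# Re-applying the same transformation to the same inputs gives the same points.
same_again = [layer1(p) for p in inputs]
```

So a function fitted to the outputs of one layer is fitted to that layer's curve; it has no particular reason to make sense on the outputs of another.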