It would work just like a discrete GPU when doing CPU+GPU inference: you'd run a few shared layers on the discrete GPU and keep the rest in unified memory. You'd want to minimize CPU/GPU transfers even more than usual, since a Thunderbolt connection only gives you throughput roughly equivalent to PCIe 4.0 x4.
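To put a rough number on that, here's a back-of-envelope sketch. It assumes the theoretical PCIe 4.0 rate (16 GT/s per lane, 128b/130b encoding) and ignores protocol and Thunderbolt tunneling overhead, so real throughput will be lower. The 20 GB payload and the helper names are made up purely for illustration.

```python
# Back-of-envelope: time to move data across a PCIe 4.0 x4-class link
# versus a full x16 slot. Theoretical rates only; real links are slower.

def pcie_gbytes_per_s(gt_per_s_per_lane: float, lanes: int) -> float:
    """Theoretical payload rate in GB/s, assuming 128b/130b line encoding."""
    return gt_per_s_per_lane * lanes * (128 / 130) / 8


def transfer_seconds(size_gb: float, link_gb_per_s: float) -> float:
    """Seconds needed to move size_gb gigabytes over the given link."""
    return size_gb / link_gb_per_s


PCIE4_X4 = pcie_gbytes_per_s(16.0, 4)    # roughly 7.9 GB/s
PCIE4_X16 = pcie_gbytes_per_s(16.0, 16)  # roughly 31.5 GB/s

if __name__ == "__main__":
    payload_gb = 20.0  # hypothetical amount of weights to move
    for name, bw in (("PCIe 4.0 x4", PCIE4_X4), ("PCIe 4.0 x16", PCIE4_X16)):
        secs = transfer_seconds(payload_gb, bw)
        print(f"{name}: {bw:.1f} GB/s, {secs:.2f} s for {payload_gb:.0f} GB")
```

The x4 link is a quarter of an x16 slot, so anything that crosses it on every token gets expensive quickly. That's the case for keeping the layers on the discrete GPU resident and not shuttling weights back and forth.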
How big a bottleneck is Thunderbolt 5 compared to an SSD? Is the 120 Gbps mode only available when linked to a monitor?
That's why so many projects that stream models into the GPU from an SSD have popped up recently.