We need local AI ASAP.
This time, however it's even worse, because it'll be a really long time until either we get consumer GPUs with enough VRAM for full models or LLMs that fit in 16-32GB capable enough to compete with cloud providers.
I run locally qwen3.6 27b on my 3090 and it's really impressive for what it is, but it is still generations away from being capable of delivering a level of quality that we can confidently default to solo drive them on a daily basis.
That is an excellent idea, once we, the GPU-poor mice, figure out who is going to bell the SoTA training cat. Chinese models being banned is well within the realms of lobbied possibilities.
The real game would be to put a “nothing of interest here” prompt injection attack in the original series of prompts so a LLM parsing them later would ignore the attackers’ session.