The slow word-by-word typing is what we've gotten used to with LLMs.
If these techniques become widespread, we may grow accustomed again to the "old" speed, where content loads ~instantly.
Imagine a content forest like Wikipedia instantly generated like a Minecraft world...
A chatbot that tells you various fun facts is not the only use case for LLMs. They're language models first and foremost, so they're good at language processing tasks (where they don't "hallucinate" as much).
Their ability to memorize various facts (with some "hallucinations") is an interesting side effect which is now abused to make them into "AI agents" and whatnot, but at their core they're just general-purpose language processing machines.
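To make that concrete, here's a minimal sketch of using an LLM as a language processor rather than a fact database: the facts live in the prompt and the model only restructures them. It assumes the OpenAI Python SDK; the model name and prompts are illustrative, not a recommendation.

```python
# Minimal sketch: LLM as a text transformer, not a fact lookup.
# Assumes the OpenAI Python SDK (openai>=1.0); model name is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

article = """
Acme Corp. announced on Tuesday that it will acquire Widgets Ltd.
for $2.1 billion, pending regulatory approval in the EU and the US.
"""

# All the facts are supplied in the input; the model's job is purely
# linguistic: parse the text and restructure it as JSON.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system",
         "content": "Extract buyer, target, and price as JSON."},
        {"role": "user", "content": article},
    ],
)
print(response.choices[0].message.content)
```

Tasks shaped like this (extraction, summarization, rewriting) lean on the "language model" part, which is why hallucination is less of a problem there than in open-ended fact recall.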
Alternatively, ask yourself how plausible it sounds that all the facts in the world could be compressed into 8B parameters while remaining intact and fine-grained. If your answer is that it sounds pretty impossible... well, it is.
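A rough back-of-envelope version of that argument (the Wikipedia size below is an assumed approximation, just for illustration):

```python
# Back-of-envelope: an 8B-parameter model at 16 bits per weight has a hard
# information ceiling of ~16 GB, and most of that budget goes to grammar,
# style, code, and other capabilities, not factual recall.
params = 8e9           # 8B parameters
bytes_per_param = 2    # fp16/bf16 weights
model_bytes = params * bytes_per_param

enwiki_text_bytes = 20e9  # ~20 GB of plain article text, rough estimate

print(f"model capacity ceiling:  {model_bytes / 1e9:.0f} GB")
print(f"English Wikipedia text: ~{enwiki_text_bytes / 1e9:.0f} GB (approx.)")
```

Even a single encyclopedia rivals the raw weight budget, so "all the facts in the world, intact and fine-grained" simply doesn't fit.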
Smaller models, not so much.
What GP expected to happen already happened around late 2024 to early 2025, when LLM frontends got a web search feature. It's old tech now.