It's not completely impossible, depending on what your expectations are. That language model that was built out of redstone in minecraft had... looks like 5 million parameters. And it could do mostly coherent sentences.
Which is a lot more than 888kb... Supposing your ESP32 could use qint8 (LOL) that's still 1 byte per parameter and the k in kb stands for thousand, not million.