I supervised a student's project whose goal was exactly that: implementing compression with LLMs using arithmetic coding (AC).
Since AC is essentially optimal (its overhead is only a few bits per message), if your LLM has an average cross-entropy of x nats per token on some dataset, you can expect the compressed output to use about x nats per token on average, i.e. x / ln 2 bits per token!
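To make the unit conversion concrete, here is a minimal sketch of the arithmetic, not the student's actual code. The function names `expected_compressed_bytes` and `ideal_code_length_bits` are hypothetical, and the sketch assumes the arithmetic coder's constant overhead is negligible compared to the message length.

```python
import math


def expected_compressed_bytes(cross_entropy_nats: float, num_tokens: int) -> float:
    """Estimate compressed size from an average cross-entropy given in nats per token.

    Assumption: the arithmetic coder's few bits of overhead are ignored.
    """
    # Convert nats to bits: 1 nat = 1 / ln(2) bits.
    bits_per_token = cross_entropy_nats / math.log(2)
    return bits_per_token * num_tokens / 8


def ideal_code_length_bits(token_probs) -> float:
    """Sum of -log2(p) over the probabilities the model assigned to the actual tokens.

    This is the length an ideal entropy coder would approach for that sequence.
    """
    return sum(-math.log2(p) for p in token_probs)
```

For example, a model with a cross-entropy of 2.0 nats per token would need roughly 2.0 / ln 2 ≈ 2.89 bits per token, so `expected_compressed_bytes(2.0, 1_000_000)` gives an estimate of the output size in bytes for a million tokens.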