Chinese omits articles, doesn't conjugate verbs, and packs more meaning into each character than English does into a letter, but beyond those differences I don't have the impression that Chinese communication is inherently more concise. Some forms of official speech are wordy. Writing is denser, but the amount of information conveyed through speech is about the same. Both Chinese and English have jokes that play on ambiguous words or phrases. So I was surprised by your take, though I have no objection to your points above. Ancient Chinese, on the other hand, is extremely concise, but so are other ancient languages like Hebrew, although in a different way. So it seems that ancient languages are compressed but challenging, while modern languages have unpacked that compression for ease of understanding.
reply
That's a really interesting point about Ancient Chinese and other ancient scripts. I'd love to learn more about that.

I'm also more curious about tokenizers for LLMs than I've ever been before, both for Chinese and English. I think I'll need to look at some concrete examples to really understand them, since tokenization can be per word, per character, or in chunks somewhere in between.
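Here is a rough sketch of those three granularities, just to make the idea concrete. It is a toy, not how any particular LLM tokenizer works: real ones typically learn their vocabulary from data, while the `TOY_VOCAB` set, the sample sentences, and every function name below are made up for illustration.

```python
# Toy comparison of tokenization granularities (illustrative only).

def word_tokens(text):
    """Split on whitespace. Chinese has no spaces, so a sentence stays one 'word'."""
    return text.split()

def char_tokens(text):
    """One token per non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]

def byte_tokens(text):
    """One token per UTF-8 byte; common Chinese characters take 3 bytes each."""
    return list(text.encode("utf-8"))

def greedy_subword_tokens(text, vocab):
    """Longest-match-first segmentation against a fixed vocabulary.

    Falls back to a single character when nothing in the vocabulary matches.
    This is a simplification, not byte-pair encoding.
    """
    max_len = max((len(piece) for piece in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical vocabulary, hand-picked for this example.
TOY_VOCAB = {"token", "izers", "喜欢", "语言"}

english = "I like languages"
chinese = "我喜欢语言"  # "I like languages"

for label, text in (("English", english), ("Chinese", chinese)):
    print(label, "words:", len(word_tokens(text)),
          "chars:", len(char_tokens(text)),
          "bytes:", len(byte_tokens(text)))

print(greedy_subword_tokens("tokenizers", TOY_VOCAB))
print(greedy_subword_tokens(chinese, TOY_VOCAB))
```

With this made-up vocabulary, "tokenizers" splits into two in-between chunks, and the Chinese sentence splits into one single-character token plus two two-character tokens, which is the "somewhere in between" case.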