If the embedding isn’t smaller than the input, how is it compressing information? It might lose information in its mapping to the embedding space, but as I understand it, compression by definition has to use fewer bits than the original to hold the same information. As such, the embedding space must be smaller.
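To make the bit-counting concrete, here's a rough sketch of the comparison I have in mind. The sizes are made up purely for illustration (a 28x28 8-bit grayscale image as the input, a 32-dimensional float32 vector as the embedding), and `storage_bits` is just a hypothetical helper, not anything from a library.

```python
def storage_bits(num_values: int, bits_per_value: int) -> int:
    """Total bits needed to store num_values values at a fixed bit width."""
    return num_values * bits_per_value


# Hypothetical sizes, chosen only for illustration.
input_bits = storage_bits(28 * 28, 8)   # 28x28 grayscale image, 8 bits per pixel
embedding_bits = storage_bits(32, 32)   # 32-dimensional embedding stored as float32

# Under the "fewer bits" definition, the mapping only counts as compression
# if the embedding takes less storage than the input it was computed from.
is_compressed = embedding_bits < input_bits

print(input_bits, embedding_bits, is_compressed)
```

Note that this compares raw storage in bits, not the number of dimensions, so the answer depends on the numeric precision assumed for each side.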