But then there is a comment below talking about how DeepSeek was able to get a huge improvement in compression by using visual tokens, https://news.ycombinator.com/item?id=48777848. I don't fully understand all of the underlying technical details so I am still fundamentally baffled about how going the OCR route could actually result in overall electricity/computational savings.