undefined

upvote

points

by not_that_d12 hours ago |

upvote

by stingraycharles7 hours ago|

[-]

You do realize that today’s multi-modal LLMs are actually able to understand images? They’re tokenized very much like language is.

reply