How were you getting anything useful out of that? We found the (unquantized!) E2B model to be completely useless at even the simplest real-world classification tasks.
How do you know it swaps to RAM rather than staying on the TPU?

Would be interested in testing this on my Pixel.

Because the TPU has 2 GB, and the weights plus context need more than that.
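
For anyone who wants to sanity-check that reasoning, here is a rough back-of-the-envelope sketch. Only the 2 GB capacity comes from this thread; the parameter count, bytes per weight, and context size are placeholder assumptions for illustration, not measured values for any particular model, and `fits_on_accelerator` is a hypothetical helper.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def fits_on_accelerator(n_params, bytes_per_param, context_bytes, capacity_bytes):
    """Return (fits, required_bytes) for weights plus context."""
    required = n_params * bytes_per_param + context_bytes
    return required <= capacity_bytes, required


# Assumed inputs (hypothetical): 2 billion weights stored at 16 bits each,
# plus 256 MiB for context. Capacity of 2 GiB is the figure quoted above.
fits, required = fits_on_accelerator(
    n_params=2_000_000_000,
    bytes_per_param=2,
    context_bytes=256 * MIB,
    capacity_bytes=2 * GIB,
)
print(f"fits: {fits}, required: {required / GIB:.2f} GiB")
```

Swap in the real weight size and context budget for your model; if the total exceeds the accelerator's memory, something has to live in system RAM.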