by simonw 2 hours ago


by rahimnathwani 1 hour ago


Looking forward to next time, hoping you mention speculative decoding and MTP :)

It would support your point about the performance of 20GB local models.
