Inference from an LLM is O(tokens²): self-attention compares every token in the context against every other token.
Only in naive implementations of attention. With KV caching each decode step is O(n) rather than O(n²), and sub-quadratic variants (e.g. linear attention) change the asymptotics entirely, though optimized kernels like FlashAttention are still O(n²) in compute and only improve memory traffic.
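A minimal NumPy sketch of where the quadratic cost comes from in the naive formulation (function and shapes are illustrative, not from any particular library): the score matrix alone has n × n entries, so doubling the sequence length quadruples the work.

```python
import numpy as np

def naive_attention(q, k, v):
    """Naive single-head attention: materializes the full n x n score
    matrix, so compute and memory scale as O(n^2) in sequence length n."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # shape (n, n): the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v                               # shape (n, d)

n, d = 8, 4
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(q, k, v)
print(out.shape)  # (8, 4); the intermediate score matrix was (8, 8)
```

Caching the keys and values lets each new token attend to the existing context in O(n) instead of recomputing the whole matrix, which is why total generation cost, not per-step cost, is where the O(n²) shows up in practice.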