by colechristensen 2 hours ago

by delis-thumbs-7e 2 hours ago

Wouldn’t that be extremely computationally expensive, considering how resource-intensive training is?

by colechristensen 2 hours ago

No. Training a state-of-the-art model involves on the order of 10 trillion tokens.

We're talking about a step that updates weights based on, say, between 10k and 1M tokens.
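
A rough back-of-envelope sketch of that comparison, in Python. It uses only the figures quoted above (about 10 trillion pretraining tokens, 10k to 1M tokens per update) and assumes, as a simplification, that cost scales linearly with token count. The name `update_fraction` is made up for illustration.

```python
# Figures quoted in the comment; treat them as rough orders of magnitude.
PRETRAINING_TOKENS = 10 * 10**12   # ~10 trillion tokens
UPDATE_TOKENS_LOW = 10_000         # 10k tokens
UPDATE_TOKENS_HIGH = 1_000_000     # 1M tokens


def update_fraction(update_tokens, pretraining_tokens=PRETRAINING_TOKENS):
    """Fraction of the pretraining token count covered by one update step.

    Assumption: token count is used as a stand-in for compute cost.
    """
    return update_tokens / pretraining_tokens


low = update_fraction(UPDATE_TOKENS_LOW)
high = update_fraction(UPDATE_TOKENS_HIGH)
print(f"one update step covers {low:.0e} to {high:.0e} of the pretraining token count")
```

Under that assumption, a single update touches between roughly one billionth and one ten-millionth as many tokens as pretraining.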

by delis-thumbs-7e 2 hours ago

I learned something. Thank you!
