Hacker News
by
delis-thumbs-7e
2 hours ago
by
colechristensen
2 hours ago
No. Training a state-of-the-art model involves on the order of 10 trillion tokens.
We're talking about a single step that updates the weights based on, say, between 10k and 1M tokens.
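To put those two scales side by side, here is a back-of-the-envelope sketch. It assumes the round figures quoted above and a single pass over the data; the constant and function names are made up for illustration.

```python
# Rough arithmetic only: both figures are the orders of magnitude quoted
# in the comment above, not measurements of any particular model.
TOTAL_TRAINING_TOKENS = 10 * 10**12  # ~10 trillion tokens for a full run


def steps_for_full_run(tokens_per_step: int) -> int:
    """Weight updates needed to consume the whole corpus once (assumed single pass)."""
    return TOTAL_TRAINING_TOKENS // tokens_per_step


for tokens_per_step in (10_000, 1_000_000):
    steps = steps_for_full_run(tokens_per_step)
    print(f"{tokens_per_step:,} tokens/step -> {steps:,} steps")
```

Under those assumptions a full run is somewhere between ten million and a billion such steps, so a single step is a tiny fraction of the whole.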
by
delis-thumbs-7e
2 hours ago
I learned something. Thank you!