Hacker News
by dash2 15 hours ago
by Tarq0n 14 hours ago
If it works for predicting the next token in a very long stream of tokens, why not? The question is what architecture and training regimen it needs to generalize.