Hacker News
by gobdovan, 12 hours ago
swyx, 25 minutes ago
> If you can see that these models empirically get better with scale, why would you swap the main architecture? Those events will be pretty rare
cf. the hardware lottery:
https://arxiv.org/abs/2009.06489