Hacker News
by lumost, 1 hour ago
by barnas2, 8 minutes ago
A company called Taalas is working on something like that. It's not Opus 4.6 quality, but I'm sure they're targeting larger models. Currently they're using a Llama 8B model. It runs at ~17k tokens per second, and you can test it at https://chatjimmy.ai/.
by neals, 50 minutes ago
I'm curious how the hardware and power costs would stack up against the subscription cost.
by bigmadshoe, 1 hour ago
Can you give an example of such a problem?