Hacker News
by chrislattner 5 hours ago | comments
by nabakin 4 hours ago
Faster than TensorRT-LLM on Blackwell? Or do you not consider TensorRT-LLM open source because some dependencies are closed source?
by melodyogonna 3 hours ago
I reviewed the TensorRT-LLM commit history from the past few days and couldn't find any updates regarding Gemma 4 support. By contrast, here is the reference for MAX:
https://github.com/modular/modular/commit/57728b23befed8f3b4...
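For anyone who wants to repeat this kind of check, here is a minimal sketch of scanning a repo's recent commits for model support. A throwaway demo repo stands in for a real clone of TensorRT-LLM, and the commit messages and the "gemma" search term are illustrative assumptions:

```shell
# Build a disposable demo repo (in a real check you would cd into a
# clone of TensorRT-LLM instead).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Add Gemma 4 attention kernel"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Fix CI flake"
# Case-insensitive search of the past week's history for the model name
git log --since="7 days ago" --oneline -i --grep="gemma"
```

Only the first commit matches, so the last command prints a single line; an empty result on the real repo is what "no updates regarding Gemma 4 support" means here.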
by nabakin 2 hours ago
If OP meant they have the fastest implementation of Gemma 4 on Blackwell at the moment, I guess that is technically true. I doubt that will hold up when TensorRT-LLM finishes their implementation though.
by pama 2 hours ago
How is the sglang performance on Blackwell for this model?
by nabakin 1 hour ago
Dunno, but there's a PR for it. Probably also more performant than Modular's.
by jjcm 1 hour ago
What % speedup should I expect vs. just running this with the standard PyTorch approach?
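There is no single number; it depends on the model, batch size, and hardware, but it is easy to measure directly. A minimal, framework-agnostic timing sketch, where the `generate` callables are hypothetical stand-ins for real PyTorch and MAX generation loops (here simulated with `time.sleep`):

```python
import time

def throughput(generate, n_tokens):
    """Tokens/sec for a callable that generates n_tokens.
    The callable is the assumption: plug in a real decode loop."""
    start = time.perf_counter()
    generate(n_tokens)
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# Stand-in "backends": one sleeps 1 ms per token, the other 2 ms.
fast = throughput(lambda n: time.sleep(n * 0.001), 500)
slow = throughput(lambda n: time.sleep(n * 0.002), 500)
speedup = fast / slow
print(f"speedup: {speedup:.1f}x")  # roughly 2x for these stand-ins
```

Running both stacks through a harness like this on your own prompts and batch sizes gives a far more trustworthy number than any headline benchmark.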