> All of the optimizations Deepseek have done are in software and it goes down to the PTX assembly level

DeepSeek are still using NVIDIA (PTX) to train on, but for inference have already transitioned to Huawei Ascend chips, and inference speed is what this paper is addressing.

by vidarh 9 hours ago

> Compared to Anthropic who are celebrating in fixing a flickering issue in a terminal app which took months to fix.

It's funny, because if you ran Claude Code on a slow terminal, the cause of the flicker was obvious: in a number of situations they kept dumping the entire chat history back into the terminal, and relied on the terminal to then end up in the correct state.

by yorwba 9 hours ago

Anthropic almost certainly also has optimized software down to the assembly level, considering this take-home interview challenge they published: https://github.com/anthropics/original_performance_takehome/... which is all about instruction-level performance optimizations. That they don't prioritize UI fixes just means they consider other things more important.

by lelanthran 9 hours ago

Unlikely: that product is written entirely by AI, which they have no shortage of.

More likely, an AI-generated codebase is impossible for humans to fix, and SOTA was not able to figure it out until now.

by lionkor 9 hours ago

that's pretty silly to use as a measure of what they do internally

by saagarjha 7 hours ago

It's pretty representative of what they do internally

by saagarjha 7 hours ago

All frontier labs are working down to the PTX level (and lower)