
There's already KernelBench, which tests CUDA kernel optimizations.

On the other hand, all companies know that optimizing their own infrastructure and models is the critical path to "winning" against the competition, so you can bet they are serious about it.

by xtracto 14 hours ago

So, I'm working on some high-performance data processing in Rust. I had hit some performance walls and needed to improve by 100x or more.

I remembered the famous FizzBuzz Intel code golf optimizations and gave them to Gemini Pro, along with my code and instructions to "suggest optimizations similar to those, maybe not so low level, but clever", and its suggestions were very cool.

LLMs do not stop amazing me every day.

by amrrs 19 hours ago

Honestly, the problem with these is how anecdotal they are: how can someone reproduce this? I love when labs go beyond traditional benchies like MMLU and friends, but these kinds of statements don't help much either, unless it's a proper controlled study!

by minimaxir 19 hours ago

In a sense it's better than a benchmark: it's a practical, real-world, highly quantifiable improvement, assuming there are no quality regressions and it passes all test cases. I have been experimenting with this workflow across a variety of computational domains and have achieved consistent results with both Opus and GPT. My coworkers have independently used Opus for optimization suggestions on services in prod, and those have led to much better performance (3x in some cases).

A more empirical test would be good for everyone (i.e. on equal hardware, give each agent the goal to implement an algorithm and make it as fast as possible, then quantify relative speed improvements that pass all test cases).
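Roughly what I have in mind, as a minimal Python sketch. Everything here is a hypothetical stand-in: the task (sum of squares below n), the function names, and the test cases are made up for illustration and don't come from any real benchmark. The point is only the shape of the harness: a candidate gets a speedup number only if it passes every test case, and all timings come from the same machine and the same input.

```python
import timeit
from typing import Callable, Dict, List, Optional, Tuple


def baseline(n: int) -> int:
    """Reference implementation (hypothetical task): sum of i*i for i < n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def candidate_closed_form(n: int) -> int:
    """Correct and fast: closed form (n-1) * n * (2n-1) / 6."""
    return (n - 1) * n * (2 * n - 1) // 6


def candidate_buggy(n: int) -> int:
    """Fast but wrong (sums up to n inclusive); must be disqualified."""
    return n * (n + 1) * (2 * n + 1) // 6


# (input, expected output) pairs that every candidate must satisfy.
TEST_CASES: List[Tuple[int, int]] = [(0, 0), (1, 0), (2, 1), (5, 30), (10, 285)]


def passes_all(fn: Callable[[int], int], cases: List[Tuple[int, int]]) -> bool:
    return all(fn(arg) == expected for arg, expected in cases)


def best_time(fn: Callable[[int], int], arg: int,
              repeats: int = 5, number: int = 20) -> float:
    """Best per-call time in seconds; taking the minimum reduces noise."""
    runs = timeit.repeat(lambda: fn(arg), repeat=repeats, number=number)
    return min(runs) / number


def evaluate(baseline_fn: Callable[[int], int],
             candidates: Dict[str, Callable[[int], int]],
             cases: List[Tuple[int, int]],
             arg: int) -> Dict[str, Optional[float]]:
    """Map each candidate to its speedup over the baseline, or None if it fails a test."""
    base = best_time(baseline_fn, arg)
    results: Dict[str, Optional[float]] = {}
    for name, fn in candidates.items():
        if not passes_all(fn, cases):
            results[name] = None  # correctness gate: no speedup is reported
            continue
        results[name] = base / best_time(fn, arg)
    return results


results = evaluate(
    baseline,
    {"closed_form": candidate_closed_form, "buggy": candidate_buggy},
    TEST_CASES,
    arg=20_000,
)
for name, speedup in results.items():
    print(name, "FAILED TESTS" if speedup is None else f"{speedup:.1f}x")
```

The exact speedup will vary by machine, which is why the comparison only means something when every agent's output is timed on the same hardware.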

by squibonpig 17 hours ago

Yeah but like what if they're sorta embellishing it or just lying? That's the issue with not being reproducible.

by jstanley 19 hours ago

Oh, come on. If they do well on benchmarks, people question how applicable they are in reality. If they do well in reality, people complain that it's not a reproducible benchmark...

by girvo 16 hours ago

That's easily explained by those being two different people with two different opinions?

by 2goomba1stage 11 hours ago

And together they make one single community that's effectively NEVER happy.