upvote
So it’s just like, your opinion, man?

edit: It was a play on The Big Lebowski, folks.

reply
College SAT scores do not tell you how the dev applying for your open back end systems engineering job is going to do once they're in your workplace harness.

Nor do class standings, nor hackerrank and the like.

What will tell you is asking them to fix a thing in your codebase. Once you ask an LLM to do that, a dozen times, I'd argue it's no longer "just your opinion man", it's a context-engineered performance x applicability assessment.

And it is very predictive.

But it's also why someone doing well at job A isn't necessarily going to be great at B, or bad at A doesn't mean will necessarily be bad at B.

I've often felt we should normalize a sort of mutual try-buy period where job-change seeker and company can spend a series of days without harming one's existing employment, to derisk the mutual learning. ESPECIALLY to derisk the career change for the applicant who only gets one timeline to manage, opposed to company that considers the applicant fungible.

But back to the LLM, yeah, the only valid opinion on whether it works for you is not benchmark, it's an informed opinion from 'using it in anger'.

reply
the (dead) internet is full of opinions exactly like this
reply
you tried qwen3.6 and you think it is not good?
reply
I do not have high opinions of any ai model.
reply