Buddy, this tone may be why.

We genuinely don't understand what your post is about. What is this tool? What do these numbers represent? Why are things sorted in that order?

You haven't really communicated anything at all. I am interested, and I'd like to understand. Write a more complete post, please.

reply
Are you familiar with https://artificialanalysis.ai/leaderboards/models ?

The JSON on the page includes a coding index result that the table doesn't show.

That's what this exposes: a ranking by coding index, from the leading evals company, of basically every model that matters. It's presented in an easy-to-parse format that you can feed into model routing harnesses in real time so that, for instance, your agents can dynamically upgrade themselves to better models as they come out, or cost-optimize based on eval results.
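To make the routing idea concrete, here is a minimal sketch of how a harness might consume the output. It assumes each line is whitespace-separated as `score age model`; that layout, the `Row` type, the function names, and the `model-a`-style identifiers are all hypothetical, not taken from the actual script.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    score: float   # coding index, as printed in the table
    age_days: int  # days since release
    model: str     # model identifier (assumed third column)


def parse_rows(text: str) -> list[Row]:
    """Parse 'score age model' lines, skipping blank or malformed ones."""
    rows = []
    for line in text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) != 3:
            continue
        try:
            rows.append(Row(float(parts[0]), int(parts[1]), parts[2].strip()))
        except ValueError:
            continue  # header or otherwise non-numeric line
    return rows


def pick_model(rows: list[Row], allowed: Optional[set[str]] = None) -> Optional[str]:
    """Return the highest-scoring model, optionally limited to an allow-list."""
    candidates = [r for r in rows if allowed is None or r.model in allowed]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.score).model


# Made-up sample data in the assumed layout.
SAMPLE = "55.2 30 model-a\n61.0 5 model-b\n48.9 400 model-c\n"
best = pick_model(parse_rows(SAMPLE))
```

An allow-list is one simple way to express "cost optimize": restrict `allowed` to the models you are willing to pay for and take the best score among them.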

I do stuff like this and give it away for free, and it's either ignored or makes people angry...

I really wish I didn't piss people off with my sincerity, but somehow it always goes down that way.

I really appreciate your time. Thank you so much.

reply
I see no 'score' or 'age' mentioned in your script. What does age signify, and how are the two calculated?
reply
This isn't obvious?

    "\(
        10 * (.codingIndex // 0) | round / 10
    ) \(
      (
        now - (
        .releaseDate |
          try ( strptime("%Y-%m-%d") | mktime )
          catch (now + 86400)
      ) ) / 86400 | floor
Real question. I see 86400 and I know it's time (seconds in a day)... That might just be me.
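For anyone who doesn't read jq, here is a rough Python equivalent of that fragment, as a sketch rather than the script itself. The function names are made up; the field meanings (`codingIndex`, `releaseDate`) come from the excerpt. The score is the coding index rounded to one decimal, and the age is whole days since the release date.

```python
import math
import time
from datetime import datetime, timezone
from typing import Optional

SECONDS_PER_DAY = 86400


def score(coding_index: Optional[float]) -> float:
    """Coding index rounded to one decimal; a missing value counts as 0."""
    value = 10 * (coding_index or 0)
    # Round half away from zero, which is how jq's `round` behaves
    # (Python's built-in round() uses banker's rounding instead).
    return math.copysign(math.floor(abs(value) + 0.5), value) / 10


def age_days(release_date: Optional[str], now: Optional[float] = None) -> int:
    """Whole days between release_date (YYYY-MM-DD, read as UTC) and now."""
    now = time.time() if now is None else now
    try:
        released = (
            datetime.strptime(release_date, "%Y-%m-%d")
            .replace(tzinfo=timezone.utc)
            .timestamp()
        )
    except (TypeError, ValueError):
        # Missing or unparseable date: mirror the jq `catch (now + 86400)`,
        # i.e. pretend the release is one day in the future, so the age
        # comes out negative and is easy to spot.
        released = now + SECONDS_PER_DAY
    return math.floor((now - released) / SECONDS_PER_DAY)
```

So a negative age in the table would mean the release date was missing or malformed, not that the model is from the future.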

I'm not being an ass. I don't know how to talk to people, or how to tell when I think I'm being clear but am actually being cryptic.

reply
It is kind of noisy because the release recency, which is what your "age" column actually represents, is not important data for the comparison you are trying to make.

Also, the message we should take from that table is not really obvious.

reply
Okay, I think there's a familiarity delta. I constantly run into this.

I know Artificial Analysis quite well as the gold standard in LLM evals.

But I guess they're still obscure.

I didn't think they were.

The age is important because new techniques keep being developed and so it is a very rough indicator of the size/cost/efficiency trade-off.

A model's age is a major indicator of what you can expect from it.

I really need to develop a better sense for what people know. That's only one of my problems.

Thanks for engaging with me.

reply