It’s a good spot for hobbyists to fill in the gaps. Maybe it’s not interesting enough for academics to study, and corporate ML teams would probably just fine-tune something that exists rather than spend time on surgery. Even Chinese labs, which are more resource-constrained, don’t care as much about 4090-scale models.
reply
It's still non-trivial, as multi-digit numbers can be constructed from a huge number of valid token combinations.
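
To give a rough sense of scale, here is a minimal sketch (not the blog's code) that counts how many ways a digit string can be split into tokens. It assumes a hypothetical vocabulary in which every string of 1 to 3 digits is its own token; real tokenizers differ, and `count_tokenizations` and `max_token_len` are names made up for this illustration.

```python
from functools import lru_cache


def count_tokenizations(number: str, max_token_len: int = 3) -> int:
    """Count the ways to split `number` into chunks of 1..max_token_len digits.

    Assumption: every digit chunk up to max_token_len is a valid token.
    """

    @lru_cache(maxsize=None)
    def ways(i: int) -> int:
        # Reaching the end of the string means one complete segmentation.
        if i == len(number):
            return 1
        # Try every allowed token length starting at position i.
        return sum(
            ways(i + k)
            for k in range(1, max_token_len + 1)
            if i + k <= len(number)
        )

    return ways(0)


if __name__ == "__main__":
    for n in ["7", "123", "12345", "1234567890"]:
        print(n, count_tokenizations(n))
```

Under that assumption the count grows roughly exponentially with the number of digits, so checking only one canonical token sequence would miss most valid spellings of the same number.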

The code in the blog helps derive useful metrics from partial answers.
