Except Qwen already released their own fully baked interpretability SAE toolkit, tuned on their models, so they deserve credit here. Activation telescopes should be a standard part of every major release.

[1] https://qwen.ai/blog?id=qwen-scope
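For readers unfamiliar with the technique being discussed: a sparse autoencoder (SAE) learns a wide, sparse dictionary of features from a model's internal activations. Here is a minimal sketch in plain NumPy; the class name, dimensions, and hyperparameters are illustrative, not taken from Qwen's toolkit or Anthropic's code.

```python
import numpy as np

class SparseAutoencoder:
    """Toy SAE: encode activations into sparse features, then reconstruct."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # d_hidden is typically much larger than d_model (an overcomplete basis).
        self.W_enc = rng.normal(0.0, 0.1, (d_model, d_hidden))
        self.b_enc = np.zeros(d_hidden)
        self.W_dec = rng.normal(0.0, 0.1, (d_hidden, d_model))
        self.b_dec = np.zeros(d_model)

    def encode(self, x):
        # ReLU keeps feature activations non-negative; sparsity is
        # encouraged by the L1 penalty in the loss, not by ReLU alone.
        return np.maximum(0.0, x @ self.W_enc + self.b_enc)

    def decode(self, f):
        return f @ self.W_dec + self.b_dec

    def loss(self, x, l1=1e-3):
        # Reconstruction error plus an L1 sparsity penalty on the features.
        f = self.encode(x)
        x_hat = self.decode(f)
        return np.mean((x - x_hat) ** 2) + l1 * np.abs(f).mean()

# Example: run a batch of fake "activations" through the SAE.
sae = SparseAutoencoder(d_model=8, d_hidden=32)
x = np.random.default_rng(1).normal(size=(4, 8))
features = sae.encode(x)          # (4, 32), non-negative
reconstruction = sae.decode(features)  # (4, 8)
```

In a real toolkit the input `x` would be residual-stream activations captured from the model, and training would minimize `loss` with an optimizer; this sketch only shows the shape of the architecture.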

reply
We have already known for a while that Anthropic does open source, such as the "flawed" MCP spec and the "skills" spec.

This release only covers other open-weight LLMs that have already been released. Even though they will use this research on their own closed Claude models, they will never release an open-weight Claude model, even for research purposes.

So this does not count; it exists specifically for the sake of this research only.

reply
It's literally an open model that generates natural language text (or one that takes in text and turns it into activations). Why does engagement with the local models community "not count" if it isn't Claude? That makes very little sense to me.
reply
Because we know what Embrace, Extend, and Extinguish means, for example. They're leeching off open source, not contributing in any meaningful way.
reply
https://github.com/kitft/natural_language_autoencoders

Here’s the full source code for training your own NLA, provided by Anthropic.

reply
Sorry, what are they embracing and extending?
reply
Chinese open models? /s

To counter the grandparent you’re replying to: Embrace, Extend & Extinguish is a Microsoft strategy. So is FUD, and that’s all this is.

reply
Humanity!
reply
Those are generally used by someone who is behind. See: everything Meta does.
reply