undefined

points

[-]

If you think training a sparse autoencoder to extract concept vectors that are usable as steering injections into a modern LLM is pretty easy, you should probably go work for Anthropic's mech interp team ;)