Your method appears to be similar to LoRA but simply less expressive. Some kind of manipulation to layers 7, 14, and 21. Did you compare with other layers? This is obviously extremely specific to a particular backbone.

Also, your documents use a ton of nonstandard jargon, which only serves to confuse laypeople and annoy anyone who is familiar with ML. Saying your change adds “semiotic awareness” is meaningless when your experiments claim only marginal improvements. Clearly the model had most of the capability before.

More generally, who is it for? People who have expertise in ML are not going to take it seriously. People who don’t?

by spacebacon 8 hours ago

It is not LoRA. LoRA fine-tunes capabilities into the model. SRT Adapter is a small overlay on a frozen model whose purpose is to make internal reasoning observable. It surfaces what the model is activating at moments of high divergence.

Layers 7, 14, and 21 were chosen after probing. They showed the strongest regime signals. We did compare other layers. The term semiotic awareness is just shorthand for detecting and modulating higher-order meaning patterns. If the term is unhelpful I will drop it.

The capability gains are often marginal on standard benchmarks. The intended value is observability and steerability without retraining the backbone.
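
To make the "overlay on a frozen model" idea concrete, here is a minimal sketch of read-only taps on layers 7, 14, and 21. It is not taken from the SRT repo: `frozen_layer`, `run_with_taps`, `run_plain`, `TAP_LAYERS`, and the 24-layer depth are all hypothetical, and the toy layer is only a stand-in for a transformer block. It shows observation only, not the modulation the adapter is also described as doing.

```python
import math

TAP_LAYERS = (7, 14, 21)  # layer indices named in the thread; the rest is illustrative
NUM_LAYERS = 24           # assumed depth of the toy backbone


def frozen_layer(index, hidden):
    """Stand-in for one frozen backbone layer: a fixed transform with no trainable state."""
    return [math.tanh(h + 0.01 * index) for h in hidden]


def run_with_taps(hidden, tap_layers=TAP_LAYERS):
    """Run the frozen stack and record, without altering, the hidden state at tapped layers."""
    taps = {}
    for index in range(1, NUM_LAYERS + 1):
        hidden = frozen_layer(index, hidden)
        if index in tap_layers:
            taps[index] = list(hidden)  # copy so the record is independent of later layers
    return hidden, taps


def run_plain(hidden):
    """Same stack with no taps, used to check that observing does not change the output."""
    for index in range(1, NUM_LAYERS + 1):
        hidden = frozen_layer(index, hidden)
    return hidden
```

Calling `run_with_taps([0.5, -0.2, 0.1])` returns the final hidden state plus a dict keyed by the tapped layer indices, and the final state matches `run_plain` on the same input, which is the sense in which the backbone stays untouched.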

by anentropic 10 hours ago

Tip: neither the "30 second TL;DR" nor the intro paragraph above it really explains what it does to anyone unfamiliar with your (possibly novel?) jargon.

by janalsncm 9 hours ago

“Semiotic awareness” is not standard ML terminology. By its dictionary definition, semiotic simply means “relating to symbols”, so it’s a bit grandiose to say you gave Qwen “awareness of symbols” when in reality it’s a marginal improvement, if that is even true.

Also to say that a philosopher that died 100 years ago inspired a new attention head is another instance of GPT off his rocker again. You don’t need MAH to contextualize “freedom” in a sentence. Attention already does that.

by spacebacon 10 hours ago

Thank you. I would appreciate additional feedback on how I can improve that.

Edit: it’s not GPT, nor off its rocker. This repo empirically proved computational semiotics, with reference to C.S. Peirce, Paul Kockelman, and many other respected contemporary semioticians.

by anentropic 8 hours ago

Just try to explain why I should use it and why it's different or better than alternatives - in terms of some qualities of the results rather than how it's implemented

The technical implementation details are also useful to have, but they're a bit hard to parse into "what is this?"

by anentropic 8 hours ago

FWIW I'm sympathetic to vibe-coded docs as I'm doing it myself a bit lately, but the agents are bad at it by default because all their context is the how and why of technical decisions made while coding with you

they need specific coaching to get them to try to write for the perspective of a new user

by spacebacon 8 hours ago

The main reason to use it is the output quality. SRT steers the model toward a consistent target voice or discourse style more reliably than prompting or basic steering, while keeping the base model frozen. The results feel more coherent in tone and perspective across longer outputs, especially when the target style comes from a specific corpus or community. On the sympathetic point about vibe-coded docs: exactly.

by anentropic 2 hours ago

How is it different from or better than LoRA?

by spacebacon 8 hours ago

Thanks for the feedback … rough and precise equally appreciated. Computational semiotics was empirically proven with this repo. I will work hard to make the findings and content more accessible for everyone.

by janalsncm 9 hours ago

You should write your readmes by hand. You’ll learn a lot more that way, and it’ll help to ground the project.

by spacebacon 8 hours ago

It’s not as if they were one-shot. Five prior repos, two published preprints on SSRN, and thousands of hours back my research, which is right there for you to peer review and use freely.

by nextaccountic 9 hours ago

How does this help with making an LLM write in a particular style present in a large corpus? Is there a training step? Or can SRT use the raw data as is? (That seems infeasible.)

Also is SRT really suitable for style transfer?

I mean this seems to be another network overlaid on top of the LLM steering it, but it needs some target to determine whether the underlying LLM drifted away from it

by spacebacon 8 hours ago

SRT does involve a training step, but only on the small adapter and not on the base model. It learns to shift internal representations toward a target discourse regime or style.

It is an overlay, but it works by modulating meaning-level patterns called regimes rather than fixed steering vectors. Because it can read its own effect on the hidden states, it gives a way to observe whether output is staying in the target regime or drifting.

It is not raw data in and raw style out. The adapter needs examples that define the desired regime.
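
As a rough illustration of the points above (train only the adapter, keep the base frozen, then check whether hidden states stay near the target regime), here is a toy sketch. It is an assumption-laden stand-in, not SRT's method: the adapter is reduced to a single learned shift vector, the "regime" is just the centroid of the example target states, and every name (`FROZEN_W`, `train_adapter`, `regime_score`, `EXAMPLES`) is hypothetical.

```python
import math

FROZEN_W = ((0.9, 0.1), (-0.2, 0.8))  # toy frozen backbone weights; never updated


def base_hidden(x):
    """Hidden state from the frozen backbone (a fixed 2x2 linear map in this toy)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in FROZEN_W]


def adapted_hidden(x, shift):
    """Backbone hidden state plus the adapter's learned shift."""
    return [h + s for h, s in zip(base_hidden(x), shift)]


# Toy examples defining the target regime: frozen states displaced by a fixed
# offset that the adapter has to discover from the (input, target) pairs.
INPUTS = [(1.0, 0.0), (0.0, 1.0)]
EXAMPLES = [(x, [h + o for h, o in zip(base_hidden(x), (3.0, -2.0))]) for x in INPUTS]


def train_adapter(examples, steps=200, lr=0.05):
    """Gradient descent on mean squared error, updating only the adapter shift."""
    shift = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, target in examples:
            for i, (h, t) in enumerate(zip(adapted_hidden(x, shift), target)):
                grad[i] += 2.0 * (h - t) / len(examples)
        shift = [s - lr * g for s, g in zip(shift, grad)]
    return shift


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b)))


def regime_score(x, shift, centroid):
    """Similarity of the adapted state to the target centroid; lower suggests drift."""
    return cosine(adapted_hidden(x, shift), centroid)
```

With these toy examples, `train_adapter(EXAMPLES)` recovers a shift close to the offset used to build the targets, and `regime_score` is higher with the trained shift than with a zero shift. A real adapter would be a learned function of the hidden state rather than a constant, so treat this only as a picture of which parameters move and which do not.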