by cyanydeez 8 hours ago

by c7b 4 hours ago

Cool! Anything you want to share? I haven't looked much into my system prompt yet, do you have any tips?

by ACCount37 8 hours ago

We live in a non-deterministic world. Anything "deterministic" in it is a castle built on quicksand.

LLMs are, as far as the nastiness of the Real World goes, really fucking benign. Future models outperform past models, both in open weight land and at the big frontier labs. Performance per $ only ever goes up. That's just nice.

by windexh8er 6 hours ago

> We live in a non-deterministic world. Anything "deterministic" in it is a castle built on quicksand.

Except the Enterprise, and a lot of what people want compute for, is built on deterministic systems or processes. I'm not saying the non-deterministic nature of LLMs isn't useful. However, I've worked with a lot of organizations on SOAR projects, for example. When you can weave the deterministic and the non-deterministic together, you get a relatively efficient system: a workflow that stays on the rails and comes to a conclusion as expected. And the "as expected" part is critical in these types of systems. The reality, using SOAR as an example, is also that most enterprises would be much better served by fast SLMs. Parsing an email and validating whether it's spam or phishing, or reading a chunk of firewall logs and looking for outliers or indications for escalation: those things can get messy in a deterministic system because of potentially unstructured data.

I don't believe it's either/or. And I believe that LLMs just aren't efficient, fast, or reliable in the sense that deterministic systems are. It seems, at least to me, a better-together story.
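
A rough sketch of that kind of mixed workflow, in Python. Everything here is made up for illustration: the `Email` type, the `triage` function, the verdict labels, the allowlisted domain, and the `keyword_stub` that stands in for a real SLM call. The point is only the shape: the model handles the unstructured text, and deterministic code on either side decides what its answer is allowed to do.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical label set; a real deployment would define its own.
ALLOWED_VERDICTS = {"benign", "spam", "phishing"}


@dataclass
class Email:
    sender: str
    subject: str
    body: str


def triage(email: Email, classify: Callable[[Email], str]) -> str:
    """Return the next workflow step: 'deliver', 'quarantine', or 'escalate'."""
    # Deterministic pre-check (assumed allowlist): no model call needed.
    if email.sender.endswith("@example.com"):
        return "deliver"

    # Non-deterministic step: classify() stands in for an SLM/LLM call.
    try:
        verdict = classify(email).strip().lower()
    except Exception:
        return "escalate"  # a failed model call goes to a human

    # Deterministic guard: anything outside the expected labels is escalated,
    # so the workflow always ends in one of three known states.
    if verdict not in ALLOWED_VERDICTS:
        return "escalate"
    return {"benign": "deliver", "spam": "quarantine", "phishing": "escalate"}[verdict]


def keyword_stub(email: Email) -> str:
    """Placeholder classifier so the sketch runs without a model."""
    text = f"{email.subject} {email.body}".lower()
    if "password" in text:
        return "phishing"
    if "offer" in text:
        return "spam"
    return "benign"
```

Swapping `keyword_stub` for a real model call changes nothing in `triage`, and a model that rambles instead of returning a label just gets escalated rather than derailing the workflow.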

by SkyBelow 5 hours ago

I think it might be built on something more than deterministic systems: some property that picks out a subset of deterministic systems, so all your arguments still apply, but merely being deterministic is not good enough.

LLMs are what made me start considering this. Imagine a company using an LLM that was fully deterministic. All RNG was either removed or seeded in such a way that the same input (so maybe the seed counts as part of the input) gave the exact same output. Fully deterministic.

But such an LLM, with a slight drift in input, could still produce very different outputs. This isn't non-determinism; it's more that the size of the change in output does not naturally follow from the size of the change in input. I'm thinking of how two double pendulums can (but don't always) greatly diverge given a very small change in their initial conditions.

So in light of that, I've begun to call this new property non-chaotic. Enterprise depends on non-chaotic systems, which are a subset of deterministic systems, and on wrangling the chaotic elements it cannot remove as much as possible.
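
To make the deterministic-but-chaotic distinction concrete, here is a small Python sketch. It uses the logistic map rather than a double pendulum or an LLM, purely because it fits in a few lines; the function names and the parameter choices (`r = 4.0` for the chaotic regime, `r = 2.5` for a stable one, a nudge of `1e-10`) are picked for illustration and say nothing about how any real model behaves.

```python
def logistic_orbit(x0: float, r: float, steps: int = 100) -> list[float]:
    """Iterate x -> r * x * (1 - x). No randomness anywhere."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def max_gap(x0: float, nudge: float, r: float) -> float:
    """Largest distance between the orbit from x0 and the orbit from x0 + nudge."""
    a = logistic_orbit(x0, r)
    b = logistic_orbit(x0 + nudge, r)
    return max(abs(x - y) for x, y in zip(a, b))


# Deterministic: the same input always gives exactly the same output.
same_input_same_output = logistic_orbit(0.2, 4.0) == logistic_orbit(0.2, 4.0)

# Chaotic regime (r = 4.0): a tiny nudge grows until the orbits are unrelated.
chaotic_gap = max_gap(0.2, 1e-10, r=4.0)

# Stable regime (r = 2.5): both orbits settle on the same fixed point,
# so the nudge never grows.
stable_gap = max_gap(0.2, 1e-10, r=2.5)

print(same_input_same_output, chaotic_gap, stable_gap)
```

Both runs are equally deterministic; only the second one is "non-chaotic" in the sense above, since a small change in input stays a small change in output.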

The follow-up question I now have is whether all LLMs are inherently chaotic, or whether it is possible to have a non-chaotic LLM.

by cyanydeez 6 hours ago

YES, but you seem not to understand that stacking two non-deterministic layers doesn't work. #1 is fine: it has random issues and you build around those random issues; those issues don't change unless you change them.

#2 is not fine: that's non-determinism you do not control, have no insight into, etc.

I'm saying sure, give me #1 if it means I can build a harness around it and smooth over the edges. But I'm not taking #1 and #2. There's zero reasonable way to manage two non-deterministic systems.

by maykthewessen 8 hours ago

Qwen is Alibaba's model distilled from Anthropic's Claude.

So it's piracy of an AI model that was itself trained on piracy..

by cogman10 8 hours ago

Piracy? Lol.

Alibaba didn't steal Opus weights; they used Opus output to train their model.

If this is piracy, then so are the reverse-engineering efforts powering a bunch of Linux drivers.

by cyanydeez 6 hours ago

If that's piracy, I'm going to the library and arresting everyone there!

Also, yeah, they already stole their copyrighted works, so a thief from a thief is still...thieves?

by tommica 5 hours ago

Well, Anthropic got paid for it, unlike the sources that they used...

by c7b 4 hours ago

I'm not sure what you're trying to say. Is that a good or a bad thing? Model distillation is presumably part of the reason why Qwen is so good, yes. As a consumer, that's a good thing I would say. It's a natural counterbalance to the monopolistic tendencies of other tech segments.

If you have ethical concerns, model distillation feels like an arbitrary line to draw. Why is the first type of piracy ok, the second not? You should restrict yourself to ethical open source models. Which is btw where I genuinely hope the future of local models is going to lie. Open weights is not enough, we need fully open source models to be sustainable. Even for simple things like updating the knowledge cutoff. How we are going to distribute the training effort will be an interesting problem where I don't see an obvious solution yet. Maybe the blockchain/federated learning people can suggest something. Or university consortia, or some public sector solutions. Or something really boring - I for one would absolutely be willing to pay for DRM-free weights of an open source model (even if I could pirate them for free).

by zaphirplane 3 hours ago

Are you saying two wrongs make a right?

by c7b 3 hours ago

I'm saying, either you have a problem with the copyright issues related to AI training or you don't. If you do, neither Qwen nor Claude is acceptable; if not, then both are. They have similar moral standing to me.

Btw, ethically sourced, open source LLMs exist! Check out e.g. Olmo by Allen AI: https://allenai.org/olmo