by simonw 15 hours ago

by Grimblewald 11 hours ago

Corporate America never backs down. It simply rallies and tries again later until people are too fatigued to care. The only solution is to abandon ship, which I am doing. MS walked back in-OS ads the first few times, but ultimately we still ended up on the exact trajectory everyone was outraged at. OpenAI still ended up on its path to closed AI despite initial walk-backs. The story repeats itself over and over again, so once the bad behavior starts, you leave. Their apologies are as hollow as their moral posturing.

by n6242 9 hours ago

Same with VISA/Mastercard deciding what we can/cannot buy. The only solution is to stop using their credit cards at all.

by abustamam 5 hours ago

Easy to say, but every bank I've had the (dis)pleasure of doing business with only ever issued a Visa or Mastercard so it's not really feasible to just "stop using them"

by philipallstar 4 hours ago

The only solution to McDonald's and Burger King deciding what we can/cannot order on their menus is to stop eating there.

by lukifer 4 hours ago

"Taco Bell was the only restaurant to survive the Franchise Wars. Now all restaurants are Taco Bell."

by Cider9986 7 hours ago

Yes, Monero is a lot better than credit cards for privacy and freedom. I hope to see it accepted more.

by j16sdiz 7 hours ago

It's not only corporate America. Crypto scammers do the same: they simply rally and try again later until people are too fatigued to care.

by fatata123 7 hours ago

[dead]

by mettamage 10 hours ago

I hope this has some answers [1]. It’s on the front page right now, and your frustration clearly seems to raise some implicit questions that [1] is trying to answer.

[1] https://news.ycombinator.com/item?id=48477135

by aimanbenbaha 8 hours ago

This is more on brand for the evil shortcomings that come with letting effective altruism run unchecked, and honestly it is worse than average "Corporate America". And the tech/AI space has been warned many times. Getting paid for providing a compute/token-hungry model and still intentionally sabotaging your customers and poisoning their workflows is something that should be unforgivable and is frankly grounds for antitrust prosecution.

by inglor_cz 8 hours ago

"Corporate America never backs down. It simply rallies and tries again later until people are too fatigued to care. "

Frankly, that sounds exactly like Chat Control and similar recurring attempts to enact total surveillance here in the EU (now shifted to heavy-handed age verification and various politicians touting bans on VPNs). I don't want to abandon my continent of birth, though...

by red-iron-pine 6 hours ago

guess who is pushing for those anti-privacy laws?

hint: they're publicly traded

by inglor_cz 5 hours ago

I have encountered enough such people to know that the really heavy push is coming from the police and secret service circles. These are the workplaces that attract all the wannabe Stasi types.

by sixothree 5 hours ago

I am 100% convinced the reason laptops came with webcams as standard so early on, even when webcams were an expensive option, was because law enforcement needed to spy on people.

by h6d_100c 15 hours ago

Too late. I canceled my Max subscription. The idea that they would even do this destroyed any remaining trust. Why would I pay them 1000s of dollars in extra usage per month for something they could still be doing behind the scenes? Any errors previously chalked up to thinking effort or other backend changes? Maybe it was intentional prompt injection the entire time.

by musebox35 13 hours ago

I work on open source text-to-image finetuning of open source models like zimage/flux2 klein 4b and on inference-time latency optimization. The moment I read about the silent treatment, I went ahead and cancelled my subscription too, since I would never know whether the models they launch will silently corrupt my output. This is totally unacceptable. There is a big difference between silent and flagged degradation if you are doing ML research but not at frontier capability.

This goes to show that:

- All the interpretability / safety research they are doing can also be weaponized against customers (steering vectors, intent classification, ...) in the name of safety from malicious actors.
- If they deem it profitable, they might nerf the original model and its training data for ML research at bulk scale, and then they won't even have to announce it so long as the overall benchmark score stays high enough.

As the IPOs get closer, they can do whatever they want to assure the investors that they have a moat that can not be crossed over by their own products. Considering this affects all ML researchers/students at universities, smaller scale research labs, this is just "cutting the branch you are sitting on".

by Grimblewald 11 hours ago

I think all this started with post opus 4.5, that's when claude started wrecking my shit without extreme oversight. Codebases it was making positive contributions to before were slowly and constantly being eroded and wrecked. Give it tasks in isolation? still does well, but the moment it sees the bigger picture, it goes to shit. I chalked it up to a bad model but this makes it all seem like it may have been by design in retrospect.

by jiggawatts 8 hours ago

Constraint decay is an issue with all LLM-based agentic development, at least for now.

Humans can maintain a long- and medium- term memory of constraints that they consciously (or subconsciously!) apply to the code that they write. The current crop of AIs are all amnesiacs, like the protagonist in Memento, falling back onto general instead of institutional knowledge.

For now, we are safe. We can rent out our meat brains for money for a little while longer.

Next year? Who knows...

by close04 10 hours ago

> I would never know whether the models they launch will silently corrupt my output

You never knew to begin with, now you have an explicit reason to realize this. Any black box run entirely out of your control, where you can never verify the output, is subject to the same suspicion.

by musebox35 9 hours ago

True enough, but that is true for all the products I buy. I do not expect to control every product I own. For some I prefer to have more control, for others I just need something that works out of the box. There is always an initial bias for trust when you buy something otherwise you would not spend your hard earned money on it.

“Fool me once, shame on you. Fool me twice, shame on me. Fool me three times, shame on both of us.” -- S. King

by close04 9 hours ago

> but that is true for all the products I buy

Some things are more obscure than others. It's easier to trust and verify Office SaaS than AI SaaS. The determinism and obviousness of most other activities make them less susceptible to hidden interference. AI run by someone else is the next level of black box for users compared to most other objects or services we usually interact with.

by gck1 12 hours ago

OpenAI has a real opportunity to do some sort of "we don't maliciously alter your prompt and nerf the model" with some form of verification, when they release the next model.

But if Anthropic gets their way with regulatory capture, this could be the only future we'll see.

To think that they didn't expect the backlash speaks volumes about how many shady things they're doing that are not publicly known.

by silisili 11 hours ago

OpenAI has been the absolute worst about this, historically. I found myself having to change my queries because it refused to serve things it deemed insensitive.

by gck1 10 hours ago

Yes, that's true. Excluding Fable, OAI models are the most refusal-heavy. However, I'd rather get a refusal than a response with poisoned output.

Since currently there's no way to verify if poisoning happened or not, I don't trust Anthropic anymore, regardless of what they say.

But my trust towards OAI is also brittle - what if they also do it, or start doing it?

I want to have a verifiable way to know that the prompt I sent was the prompt the model received. I want to know if anything was injected as well - I understand they may not necessarily be able to reveal the exact steering, but at least give me the steering category and its hash or something.
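
To make the "category and its hash" idea concrete, here is a minimal sketch of what such a receipt could look like. Everything in it is hypothetical: no provider is known to offer this, and the names `make_receipt`, `verify_receipt`, and the receipt fields are invented for illustration. It also uses an HMAC with a shared key to stay within the standard library; a real scheme would need a public-key signature so that customers can verify without being able to forge, and even then a receipt only proves what the provider chose to attest.

```python
import hashlib
import hmac
import json
from typing import Optional


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_receipt(received_prompt: str, steering_category: str,
                 steering_text: Optional[str], key: bytes) -> dict:
    """Provider side (hypothetical): attest to the prompt the model received.

    The steering text itself is never revealed, only its category and hash.
    """
    body = {
        "prompt_sha256": sha256_hex(received_prompt),
        "steering_category": steering_category,
        "steering_sha256": sha256_hex(steering_text) if steering_text else None,
    }
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    body["mac"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return body


def verify_receipt(sent_prompt: str, receipt: dict, key: bytes) -> bool:
    """Customer side: check the receipt is intact and matches what was sent."""
    body = {k: v for k, v in receipt.items() if k != "mac"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    intact = hmac.compare_digest(expected, receipt.get("mac", ""))
    return intact and body["prompt_sha256"] == sha256_hex(sent_prompt)
```

A customer would keep the receipts alongside their own logs; a mismatch between the prompt hash and what they sent, or a steering category other than "none", would at least be visible after the fact.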

by dannyw 9 hours ago

What kind of work are you getting refusals on? Genuinely curious. The only refusal I’ve had in recent memory was declining to find doorbell camera footage matching a certain description, which is fair enough and I think EU laws heavily restrict such activities (even tho I’m not in the EU)

by VortexLain 6 hours ago

During the Iran shutdowns I've been researching how Iranians manage to get to the internet by mimicking whitelisted resources (such as hCaptcha). ChatGPT refused to look up information written in Farsi since "circumventing state regulation is a crime".

by Cider9986 7 hours ago

How would the AI be able to find the footage itself?

by dannyw 7 hours ago

I use Codex and wanted it to sort through the footage and use subagents to review. Codex limits are fairly generous, esp paired with mini models for this kind of task generally, but even GPT5.5 usage is still pretty generous.

Again, it’s the only refusal I’ve gotten for coding/agentic tasks, and it has a basis in law somewhere, so I don’t fault OpenAI for that.

by intended 11 hours ago

Eh, I expect OpenAI to follow suit.

I suspect this is surprising to folk because they aren’t the ones busy figuring out how to use LLMs for illegal acts.

In general, HN users focus on making stuff, and not the safety side of things, or the scale of harms being enabled via LLMs and generative AI.

If you are on the safety side of things the ratio of misuse to fair use is inverted and everything is at scale.

Transparency won for now, but OpenAI will also have to contend with the long tail of harms LLMs enable, and that’s going to conflict with letting customers have all the features of frontier models.

by dannyw 9 hours ago

Building distributed training pipelines or optimising your ML stack (examples called out in the model card) isn’t harmful.

by kmeisthax 1 hours ago

Yes, but there is a very specific subset of things AI companies will and won't cite safety for as a concern, and that subset intersects neatly with things the companies consider to be business risks. Like, the main reason why AI companies are so willing to poison the well is because there's no money in selling to the kinds of people who want to write malware[0].

The correlation between how bad an AI safety risk actually is and how much the companies in question will actually talk about it is almost perfectly negative. The poster child of this is AI superintelligence; companies love to talk about how dangerous the AI they are actively trying to build is. But superintelligence is also a really vague concept without a clear definition. If we naively define it as "an AI system that is better than a human in some aspect", then it already exists. These models already read and write at superhuman speed.

"That's not real superintelligence!" you say. But that's exactly the capability you need in order to flood every online forum with an unending tide of AI slop. And I don't remember, say, OpenAI saying they were shutting down Sora because it was destroying or defacing human culture[1]. They shut down Sora because it was way too expensive to run.

Meanwhile, Sam Altman went and bragged about how he wants ChatGPT to make erotica. Y'know, as if we don't already know that character.ai gooning is about as safe for your mental health as Action Park was for your physical health. But porn is also a huge market, so obviously he and all the other AI companies want in on it, even though the "sexy suicide coach" is already a well-documented harm of AI.

And the idea that distillation is an attack is laughable. Like, I get the logic - if someone can ask the AI to make another AI then they get to change the guardrails - but it's still ultimately just Anthropic objecting to their own conduct when it happens to them. All their models are trained on nonconsensually harvested data. There is no moral or legal principle where Anthropic gets to use my data without permission but I don't get to use theirs.

Furthermore, AI safetyism runs up against "Freedom Zero", a core tenet of the Free Software ethos: you should be allowed to use software in any way you choose. This is not a call for more people using AI for evil, but a call to recognize that people should be allowed to use their property as they wish. Making software disobey its owner is malicious behavior. And every single time safety considerations are brought up it is to justify further attacks on Freedom Zero. And these justifications are always self-serving. There is no context in the world where a frontier AI lab asking someone else's AI about AI research is intrinsically harmful; especially not to the point where we need to make Claude deliberately sabotage your work. That is malware. Anthropic shipped malware. This is inexcusable.

[0] Digital or biological.

[1] https://www.youtube.com/watch?v=YCPAIg7RUq8

by nmfisher 6 hours ago

I cancelled mine immediately too. Anyone who supports open models will sympathize.

by z3ratul163071 14 hours ago

that you still had max after all their deceptions is amazing

by h6d_100c 14 hours ago

Yeah; not my smartest decision given their ongoing “issues”

by trhway 13 hours ago

You've been Stuxnet-ed by Anthropic :)

by hedgehog 15 hours ago

The "tradeoff" warning implies they stand by their thinking and don't think there was anything qualitatively wrong with it which, if nothing else, is helpful so potential customers can know how they think. I think the core lesson is if you want reliable infrastructure to build into an application you should use a different provider. (edit: I'm not specifically an Anthropic hater, but having just spent some time adding complexity to an app to deal with the existing refusal behavior in Sonnet... I understand why they might want this in an end user chatbot but for an API it's really not acceptable)

by brookst 7 hours ago

Is it not a trade-off? I think they made the wrong choice, but it seems reductive to say there was no choice at all and that there should never have been any consideration of the trade-offs of silent versus not.

Even wide open, uncensored models are often the product of a deliberate choice. I have a hard time faulting people for intentionality (even when they get it wrong).

by hedgehog 2 hours ago

They have a lot of choices, why would that specifically be a tradeoff? It's common for people to construct a tradeoff under which their preferred action is the more virtuous option, and thus they can be "the good guys", but that doesn't mean their framing makes any sense at all. Silently downgrading requests to a weaker model and billing the customer at full price, then framing the debate as how much (not if) this behavior is correct, that's an expression of values. People make mistakes all the time, if they thought it was actually wrong they could well have said so and explained what corrective action they've taken. One of the most famous examples of doing this right was the Pentium FDIV bug. Intel stood behind the product by recalling the affected units at great expense, and that (rightly) earned a lot of trust for decades.

by consumer451 8 hours ago

The other major thing is almost as bad, and actually maybe even worse for trust of AI features in b2b apps:

> Anthropic requires 30 day data retention for Fable and Mythos

https://news.ycombinator.com/item?id=48464258

I used to be able to tell my enterprise customers something simple, that I really believe: "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models."

That simple blanket statement is no longer true. Also, most normal people/customers only read headlines, and this is a huge story. From my point of view, as someone deploying LLMs in my apps, trust comms with my clients just got set back two years.

by Spooky23 7 hours ago

I’m very cautious with using these tools with certain clients, as I’m often contractually obligated to do things that my downstream supplier can rug pull at any time.

You should never use any of the frontier models with operational workloads manipulating or interpreting customer data.

by consumer451 7 hours ago

I appreciate the reply. Could you please help me understand what you mean by "You should never use any of the frontier models?"

Does that mean the latest model, hosted by the lab, Bedrock, or Azure Foundry? Or, do you mean only use self-hosted models, or what did you mean by that? I would really love to learn what others are doing. I felt like my trust story was solid enough, prior to all this. I have been deploying and integrating Claude and Sonnet (latest 4.x-2), on Azure, as my client base has MS contract trust, for better or worse, and Anthropic models have been making my products amazing.

To see my other thoughts on this cluster f, please see: https://news.ycombinator.com/item?id=48488781

by Spooky23 4 hours ago

Sure. It's really about informed consent and acceptance of risk. I'm very conservative about that due to my background and business.

Say you have some flow that is processing/handling regulated, sensitive, or other customer data with the LLM as part of an operational process. An example that I'm thinking of is a customer who wants to more efficiently resolve or route IT incidents to the right place. The incident data may contain user-provided data that has strings attached from a compliance perspective.

If you're using a third-party API, your T&Cs are the only protection that you have. Microsoft/Google/Amazon are pretty decent by default. When I worked for the government, we had the leverage to extract much more favorable terms from the big vendors like Google, Amazon, and Microsoft as well. Anthropic and OpenAI are in the move-fast-and-break-things universe: you need to be bringing a lot of money to the table to get terms changed, and you can easily stumble into a situation where they are retaining data in a manner that your customer will not like. So unless the customer is informed and accepting of that risk, proceed with caution.

I've had some success using self-hosted inference for these scenarios.

For development of software, totally different story -- it's your IP and you make the risk call.

by consumer451 3 hours ago

Oh man, thanks for taking the time to reply. I feel a bit better now, lol.

If you read my rant linked previously, yeah... we are on the same page. As another user pointed out in that thread, the issue here is that even on Bedrock and Azure Foundry, now with Fable 5, Anthropic inserts themselves as an additional data subprocessor that we would have to consider and certainly disclose, correct?

That kind of destroys the whole point of using Bedrock/Azure for the model, doesn't it?

by Spooky23 2 hours ago

Yeah tbh I may have read past some of your previous post :) What you’re saying is what makes me nervous.

It was definitely sold as “anthropic IP, through your old pals at the hyperscaler”. And it’s turning into something else: I’m having lunch with AWS and this other guy showed up with them.

by consumer451 2 hours ago

No worries :) What this showed me is the power/velocity/inertia that Anthropic can hold over the 3rd party providers. Like, they should have pushed back on this, as it must have been clear to the 3rd parties that this change was a big deal to their customers... and yet, it went how Anthropic wanted it to go.

by Hizonner 3 hours ago

> I used to be able to tell my enterprise customers something simple, that I really believe: "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models."

They claim they're not using it for training, only for "safety", and in fact I believe them. If you think they're lying, then why didn't you think they were lying about zero retention before? And "don't throw this in the training bin" is a relatively easy policy for them to get right. Especially because, no matter what your "enterprise leaders" tell themselves, your queries probably have close to zero real training value.

What I don't believe is that they can guarantee it won't leak to non-training parts of Anthropic, leak to or be stolen by outside actors, or be coerced out of them. That risk comes from creating the record in the first place, and that is the problem.

by consumer451 46 minutes ago

I explained/ranted about why this new scenario is far more worrisome in this comment:

https://news.ycombinator.com/item?id=48488781

by pseudosavant 14 hours ago

They are still downgrading. They just aren't doing it silently. I don't know how big of a win that is? They still trained on everyone else's data without license or attribution but want to prevent someone else from doing the same thing to them.

Some pretty audacious hypocrisy from Anthropic this week.

by musebox35 10 hours ago

It is much more reasonable to do it in a visible / flagged way. At least you have visibility over the quality of service you get as a customer.

Silent treatment is a breach of trust: what you buy changes depending on the context, based on the goals of the producer. It is like your computer silently blocking ads from competitors at the hardware level, which is crazy. I think they erred on the wrong side of things due to IPO pressure.

At least there is competition from multiple companies. Still it is best to have personal benchmarks for the domain you are working on to have a real evaluation of the value you get for the money/time you spent on these products. Without trust, that might be the only way forward to keep the companies honest.
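
A personal benchmark does not need to be elaborate. As a rough sketch (the `pass_rate` helper, the case format, and the idea of passing the model in as a plain callable are all assumptions for illustration, not any vendor's API), a handful of domain prompts with pass/fail checks, re-run after every model or pricing change, already gives you a number to compare over time:

```python
from typing import Callable, List, Tuple

# A case is a prompt plus a check that decides whether the output is acceptable.
Case = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], cases: List[Case]) -> float:
    """Run every case through `model` and return the fraction that pass.

    `model` is any function mapping a prompt to a completion, for example a
    thin wrapper around whichever API client you actually use.
    """
    if not cases:
        raise ValueError("need at least one benchmark case")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)
```

Logging the result with a date and model name is enough to notice a regression that the provider never announced.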

This happens eventually in all sectors; a good magazine/website that does independent product evaluation is priceless. Sadly, the new ad-driven internet decimated those that worked great in the '90s/'00s. Still, there are independent blogs that do some evaluation, and that is better than nothing.

by KeplerBoy 14 hours ago

Imo that's a big win. The LLM just gaslighting you into suboptimal approaches was insane.

by pseudosavant 14 hours ago

I guess, but yesterday Anthropic had their version of Google removing the "Don't be evil" from their motto. They destroyed a metric ton of goodwill they'll never regain.

by cayley_graph 12 hours ago

Yeah, they showed their true colors there. This, compounded with the fact that they're the only frontier lab with no open models, tells you all you need to know. Tired of the insanely patronizing (+ conveniently and overwhelmingly self-serving) attitude out of them. My goal is to own my computing and be able to choose what to do with it.

by monegator 12 hours ago

And just a few days ago i was being called out because i considered anthropic "evil"

I mean, did nobody ever get the vibes, never see a pattern emerging? (well they don't or they wouldn't be so amazed by pattern recognition machines on steroids)

by selicos 12 hours ago

If any work is blocked/etc, refund all credits from that session/last X minutes. Minimum.

by bostik 14 hours ago

They need to walk back a lot more.

Unilaterally revoking zero-data retention, even for enterprise contracts that explicitly require that? Nope.

Fable is utterly unusable for any kind of security work. I tripped the safeguards yesterday - using Fable to dig into a complex (& annoying) security bug that has so far resisted both human and Opus 4.8 level investigation. "Sorry Dave, I can't let you do that."

For the time being we are requesting Anthropic disable Fable for our enterprise and turn ZDR back on. The two may be interlinked so that one will always get neither or both. ZDR is a contractual obligation. Fable in its current form is useless. Might as well flip the old behaviour on and avoid burning money for no reason while this mess is being sorted out.

by rmast 13 hours ago

I was using it to craft a CTF challenge for summer students involving a simulated mechanical dial safe, but with the fence replaced by an IR beam break sensor and a microcontroller handling the check + flag message display.

For generating the initial 3D simulated safe using three.js it worked well, but then modifications to print a flag tripped the safeguards. I eventually narrowed it down to the part of the prompt about it being a CTF for students, and the "thinking" for the model seems to drift to ideas of encryption/obfuscation of the safe combo so students can't just read out the answer... which makes sense logically to help force students into turning the simulated dial instead. But whatever detection Anthropic uses, I guess, just naively sees the model thinking about "encryption" and "obfuscation" without taking into account any of the context.

For writing the dummy firmware, it tripped the safeguards while thinking about how to track dial position in the firmware and output the message; however, when I left out talk about safes and just told it to write firmware for a microcontroller hooked up to an i2c display for showing a message, with a beam break sensor to determine the message and an unspecified i2c chip for getting an unspecified number (e.g. internal wheel positions), it worked fine.

For an unrelated software task, I asked it to write some code to translate CustomActions in a Windows MSI installer into human-readable form, which has (exclusively?) defensive security applications for recognizing malicious behavior in an MSI installer. Maybe I'm going crazy, but I'm guessing that as part of its research into MSI installer custom actions Fable found articles about analyzing malicious MSI installers, and that probably tripped the safeguards.

Overall my impression is that the safeguards are perhaps using an overzealous and naive implementation that just looks for a list of banned words in the prompt or the thinking -- which drives me crazy when the model says my prompt looks fine, and then 10 minutes in some part of the thinking trips the safeguard.
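
To illustrate why a pure word-list check would behave this way, here is a toy version. This is only a guess at the failure mode, not Anthropic's actual implementation, which is not public; the term list and the `naive_flag` function are invented for the example.

```python
import re
from typing import Set

# Hypothetical banned-term list, made up for this illustration.
BANNED_TERMS = {"encryption", "obfuscation", "malicious"}


def naive_flag(text: str) -> Set[str]:
    """Return the banned terms that appear as whole words in `text`.

    The check ignores context entirely, so benign teaching material and
    defensive tooling are flagged exactly like an attack would be.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    return words & BANNED_TERMS
```

With this kind of filter, "consider obfuscation of the safe combo so students must turn the dial" is flagged, while the same firmware request phrased without safes or flags passes, which matches the behavior described above.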

by dmurray 13 hours ago

The announcement I saw was that your enterprise would have to turn off ZDR to get Fable, not that users could accidentally opt out of ZDR by selecting the wrong model.

Unilaterally disabling ZDR seems like a step too far in the enterprise market, even for a company trying to figure out what its users will let it get away with.

by bostik 13 hours ago

I read the same announcement. Or more precisely, I read at least two slightly different revisions of the announcement (it was updated between my two passes).

Our org has ZDR, and has had it since the contract was signed. Yesterday two things held true at the same time:

    1. Fable was available if you had at least .170 CLI client; and
    2. ZDR was no longer on

By the time West Coast woke up, the admin panel apparently had an option to toggle ZDR again. It remained off by default.

by mastermage 12 hours ago

You mean off as in no data retention? Or as in "we turned off your ZDR policy, so we collect all your data now"?

by bostik 12 hours ago

ZDR had been turned off. We sent in a request to have it re-enabled (and to disable Fable access for the time being).

Somewhere along the line we also used the self-service toggle to turn ZDR back on. I am not 100% certain of the exact timeline of interleaving events, many of the actions were taken by our Western US folks. Sorry. It's been a bit hectic over the past ~36h...

by mastermage 11 hours ago

JFC, that's a terrible situation. That's literally a lawsuit or multiple waiting to happen. Godspeed; you seem to have had a few interesting days so far.

by rurban 13 hours ago

Not just security work. Normal bug finding was impossible, because the model suddenly called triaging and verifying a possible fix a cyber security threat.

by insanitybit 8 hours ago

I was just building a library to use file capabilities (ie: open_at) and it refused. This thing won't even help you write safe software.
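
For readers unfamiliar with the pattern, this is roughly what directory-relative opening looks like; it is a defensive technique, not an offensive one. The sketch below is not the commenter's library: `open_beneath` and `demo` are names made up here, and it uses Python's `os.open` with `dir_fd` (the `openat` family on Unix) purely as an illustration.

```python
import os
import tempfile


def open_beneath(dir_fd: int, relative_path: str, flags: int = os.O_RDONLY) -> int:
    """Open `relative_path` relative to an already-open directory descriptor.

    Holding `dir_fd` acts as the capability: callers cannot name files by
    absolute path or climb out with "..". O_NOFOLLOW only rejects a symlink
    in the final path component; on Linux, openat2(2) with RESOLVE_BENEATH
    is the stronger tool, but Python's os module does not expose it.
    """
    if os.path.isabs(relative_path) or ".." in relative_path.split("/"):
        raise ValueError("path must stay beneath the directory")
    return os.open(relative_path, flags | os.O_NOFOLLOW, dir_fd=dir_fd)


def demo() -> bytes:
    """Create a scratch directory, then read a file through its descriptor."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "a.txt"), "wb") as f:
            f.write(b"hello")
        dir_fd = os.open(root, os.O_RDONLY)
        try:
            fd = open_beneath(dir_fd, "a.txt")
            try:
                return os.read(fd, 5)
            finally:
                os.close(fd)
        finally:
            os.close(dir_fd)
```

The point of the design is that code handed only `dir_fd` can reach files under that directory and nothing else, which is about as far from an attack tool as file I/O gets.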

by rurban 3 hours ago

Wow, same for me. Insane context bugs in flake 5

by lII1lIlI11ll 9 hours ago

I think the main reason why they mandated data retention for Fable is to fight distillation, not to prevent black hats from using the model.

by gmerc 13 hours ago

They want to keep the logs so they can see what other companies do with AI in their area of frontier.

by Aperocky 13 hours ago

I don't think it's the widespread condemnation, I think it's some high paying customer and potential investor telling them to stick it.

by nl 12 hours ago

This is different to the cyber limitations though.

To be precise - it makes the "won't work on frontier machine learning" refusal the same as the "won't work on cyber security" refusal (instead of the way it previously would work on frontier machine learning problems but give sub-optimal answers without informing the user)

by dannyw 9 hours ago

Some anecdotal social reports seem to suggest it wasn’t just giving suboptimal answers, but rather mucking around and sabotaging your codebase and training (like editing hyperparameters in project files despite not being requested).

Of course, it’s impossible to know if that was deliberate sabotage, or model misbehaviour. Which is exactly the problem.

That may be considered malware / a criminal act tbh.

by rafram 15 hours ago

The mitigations against distillation are separate, and not what the OP is about at all.

by AussieWog93 12 hours ago

Non-paywalled: https://archive.md/yxYhU
