undefined

upvote

points

by CWuestefeld18 hours ago |

upvote

by demorro18 hours ago|

[-]

Humans should have more legal privileges than machines, just as individuals should have more legal privileges than corporations. It's really as simple as that. I don't want to gripe around making up justifications, that's how the law should be and if it turns out not to be that, I'm going to be nettled.

I live in the UK, and most US law is based upon English common law, it's not some immutable code given to us from above. It's based upon assumptions and capabilities of the entities participating in the system at the time the law was codified. It can and should change to make more sense if those assumptions and capabilities shift massively.

reply

upvote

by idle_zealot17 hours ago|

[-]

I get the individual/corporation distinction, but how is a machine another tier here? It's a tool, it can't have any rights at all. The wielder has rights, and curtailing their rights depending on what tool they're using to exercise them seems strange. Potentially justifiable, but it's a different axis from the nature of the actor.

reply

upvote

by demorro16 hours ago|

[-]

Our positions are completely compatible. People are anthropomorphizing LLMs, saying that because humans train on protected works, then it is fine for LLMs to do the same.

If they have only the rights that their human creators have, then access to them cannot be sold, in the exact same way that I cannot sell you a database that I have collected filled with copyrighted material. The "humans do training too" argument only holds if you imbue LLMs with similar rights to humans.

I am allowed to sell myself (in a very limited capacity) to others for them to exploit my training, even if that training was on protected material, which is a privilege humans should have, but machines should not.

reply

upvote

by p_l5 hours ago|

[-]

Thing is, LLMs level of compression of training set mean that effectively, under the same rules that say you cannot sell that database filled with copyright material, the LLM is fine to sell. Because you have to be able to meaningfully trace each claim to final output (weights). For example, for some older stable diffusion model, it was calculated that each individual work addition or removal resulted in about 1-2 bits of change, meaning the same rules would qualify it as not derivative work.

However, because it is an issue with (at least historical) goals of copyright law, the common pattern that is evolving is that AI is not granted copyright of any work it generates, making it a bit of poison pill for some of the egregious ideas of corporate abuse. Not sure if the weights will be considered copyrightable either.

reply

upvote

by missingcolours17 hours ago|

[-]

In many of those examples, there is payment to the creator of the works that others are learning from. Authors are paid for their books, when we listen to music on the radio the musician is paid royalties, etc. When you lead a team and mentor junior engineers you're being paid for your time.

The nature of the source material matters though. Training a model on open source software seems perfectly fair - it has explicitly been released to the public, and learning from the code has never been a contested use.

IMO the questions around coding models should be seen as less about LLMs and more as a subset of the conversation about large companies driving immense profits from the work of volunteers on open-source projects, i.e. it's more about open source than AI.

reply

upvote

by jacquesm18 hours ago|

[-]

Scale and the ability to generate a livelihood of your creations and/or the ability to control how what you have created is used, for instance, to demand attribution.

reply

upvote

by atleastoptimal18 hours ago|

[-]

The attitude is derived from a general animus many have towards AI companies. They resent the efficacy of AI because it devalues individual expertise.

I can't imagine it really justifiable to say that training off data is the same as "stealing", when that same claim, that learned information that a person could retain and reproduce constitutes copyright infringement is the subject of many dystopian narratives, like this one, where once your brain is uploaded to the cloud you have to pay royalties based on every media product you remember.

https://www.youtube.com/watch?v=IFe9wiDfb0E

reply

upvote

by RealityVoid15 hours ago|

[-]

This is the answer. People don't like having their livelihood threatened so they kick the thing that threatens it.

reply

upvote

by mattmanser18 hours ago|

[-]

Part of how AI works is that it's just really complicated compression, you can get AI to write out Harry Potter novels word for word with the right prompting.

When it picks out a rare bit of code, it will be simply copying that code, illegally, and presenting it without attribution or any licenses which is in fact breaking the law but AI companies are too important for the law to apply to them.

There's been instances where models have spat out comments in code that mention original authors, etc., effectively outing itself as a copyright thief.

There's nothing anyone can do about it, but the suspicion is that the big companies have taken everyone's code on GitHub, without consent, and trained on it.

And now are spitting out big chunks of copyrighted code and presented it as somehow transformed even though all they've actually done is change a few variable names.

It is copyright theft, but because programmers are little people, not Disney, we don't have any recourse.

reply

upvote

by CWuestefeld14 hours ago|

[-]

And now are spitting out big chunks of copyrighted code and presented it as somehow transformed even though all they've actually done is change a few variable names.

It's pretty likely that I've done the same thing. I mean, I've written enough CRUD functions in my life, for example, that in all likelihood I'm regurgitating stuff that's a copy, for all practical purposes, of stuff I've done before as work-for-hire for my employer. I'm not stealing intentionally or consciously, but it seems quite likely that it's happening. And that's probably true for many of you, at least that have been in the industry for a while.

reply

upvote

by winstonwinston17 hours ago|

[-]

> There's nothing anyone can do about it, but the suspicion is that the big companies have taken everyone's code on GitHub, without consent, and trained on it.

I asked agent X what is the source of training data it generated code from, it couldn’t say. Then I asked why the code implementation is exactly the same as the output of agent Y. It said they were trained on the same ‘high-quality library’, and still couldn’t say which one.

So I guess that’s fine because everyone is doing it.

reply

upvote

by atleastoptimal18 hours ago|

[-]

Anthropic was sued successfully for training on books, the law still applies to them

https://www.npr.org/2025/09/05/g-s1-87367/anthropic-authors-...

When I write fizzbuzz do I owe royalties to the inventor of fizzbuzz? Is my brain copyright thieving because I can write out the song lyrics from memory?

reply

upvote

by veber-alex8 hours ago|

[-]

They got sued for downloading pirated books and not for using them for training. Huge difference.

reply

upvote

by sobjornstad44 minutes ago|

[-]

Indeed, the court actually explicitly held that Anthropic had the right to train their AIs on books, so long as they paid for them.

reply

upvote

by blks17 hours ago|

[-]

I think if you write fizzbuzz and then sell it, without attribution, and it goes against the original fizzbuzz license, then you’re infringing.

reply

upvote

by rspeele18 hours ago|

[-]

For another human being to look at my open source code, learn from it, get inspired by it, appreciate what I did, and let it influence their own creativity would bring me joy. That's why I open sourced it in the first place.

Few people ever actually read open source code, but I'd like to think on the rare occasions they do, they share a connection with the author. I know when I read somebody else's code, for me to understand it I have to be thinking about the problem the same way they were when they wrote it. I feel empathy with them and can sometimes picture the struggle, backtracking, and eureka moments they went through to come up with their solution.

Somehow I don't get the same warm fuzzy feelings about a machine powered by investor money ingesting my work automatically, in milliseconds, and coldly compressing it down to a few nudges on a few weights out of trillions of parameters. All so the machine can produce outputs on-demand for lazy users who will never know of me or appreciate my little contribution, and ultimately for the financial benefit of some billionaires who see me as an obsolete waste of space.

I guess I'm just irrational that way.

reply

upvote

by all217 hours ago|

[-]

We're moving into the 'industrial age of software'. You exact issue, of bespoke, well thought out and well-crafted code is one that craftsmen felt at the beginning of the industrial age. Now, parts are designed and churned out by machines that no one sees or cares about (generally speaking). This is where we are going with software, and production at a truly industrial scale has its place.

And so does well-crafted bespoke software.

The engineers who built the foundation for the industrial expansion of our forefathers went through the same exact thing we're going through now. They look at what existed, and use it to inform their efforts. This is what LLMs do.

I'm not attempting to moralize here, just comment on the parallels. Do I agree that a craftman's work is consumed by the juggernauts and no second thought is given? No. I think its a shame. But I also think the output will never match the artisans that practice now. By the very nature of the machines we employ, we cannot match the skill or thought that goes into bespoke code.

reply

upvote

by rspeele9 hours ago|

[-]

It is not even about quality. In fact with an LLM following my orders I can create higher quality code than I ever did before. I always was operating within a budget whether it was defined by the # of hours my customers were willing to pay for, or the # of hours I was personally willing to invest in a side project. This budget manifested in the form of cut features, limited test coverage, limited documentation, and so on. So given the same budget or even a slightly reduced budget I can actually make higher quality software with slop superpowers.

If I spend 2 hours designing the domain model, 1 hour slopping out a rough implementation, and 5 hours polishing it with a combo of handwritten and vibed refactorings, I will get a better result than if I spent 8 hours writing everything by hand.

So my point is not that vibe software is lower quality, as my experience has shown the opposite. It is simply that the spirit of sharing my work was done with the idea that I was sharing it with others who toiled in the same craft, not sharing for consumption by machine. Not that I ever contributed anything very important to the open source world, that anybody depended on. Just personal projects I thought were neat or educational.

In hindsight I would probably still have open sourced what I did, because I think it's valuable to have on record that I competently programmed stuff before AI even existed, like pre-atomic steel. But I don't know if I will open source any personal code going forward.

====

To put it more succinctly: if somebody "ripped off" my open source code in 2018, I wasn't mad about that. Even if they didn't bother to attribute me, well, at least they saw my stuff, had a human brain cell light up appreciating it, and thought it was worth stealing. I'm flattered. But with LLMs my work can be reappropriated without a single human ever directly knowing or caring about it.

reply

upvote

by blks17 hours ago|

[-]

You’re confusing yourself with a commercial product. You’re not a product that was created by other human beings based on someone else’s IP.

reply

upvote

by CWuestefeld14 hours ago|

[-]

You’re not a product that was created by other human beings based on someone else’s IP.

It turns out that's false. We know that genes are patentable; remember back during the Human Genome Project, when there was such a rush to patent them? So genes are IP. (This seems bizarre to me, since they're patenting something that was found just sitting there, but this is what the system says right now.)

Well, two other humans (aka mom and dad) did create me, based on those patentable genes (and most likely including some genes that were, in fact, patented).

I'm not sure what to conclude from all of that, but I do think that it invalidates your argument.

reply

upvote

by gspr18 hours ago|

[-]

> When I write code, what I write and how I write it is informed by having read countless source code files over my education and my career. Just as I ingest all that experience to fine-tune how my later code is written, so does the LLM from the code it's seen.

You are presumably human. We have granted humans specific exemptions in copyright law. We have not granted that to LLMs. Why are we so eager to?

reply

upvote

by p_l5 hours ago|

[-]

We did not grant human exemptions in copyright law.

We gave certain temporary monopoly on certain uses to humans under rules little understood by laymen even if their livelihood depends on it.

reply

upvote

by gspr4 hours ago|

[-]

... and from that temporary monopoly humans have exemptions (critique, inspiration, etc.)

reply

upvote

by RealityVoid15 hours ago|

[-]

Ok, so I use the LLM. I use the tool. Can I now apply the exemption to me?

Are you telling me that I can use the thing, but I can't use it if I process it through an LLM? It get slippery, fast.

reply

upvote

by gspr4 hours ago|

[-]

What's special about LLMs in your argument? When I was an edgy teenager in the 90s, I'd argue that it's not piracy because the DivX representation of the movie isn't bit-for-bit identical to the Hollywood master or whatever. If your reasoning works for LLMs as the tools, surely it also works for video compression.

reply

upvote

by habinero11 hours ago|

[-]

No, that's how copyright normally works.

If I write a story, I can put it online. That doesn't mean it's ok to take that story and publish it in an anthology.

reply

upvote

by CWuestefeld14 hours ago|

[-]

I'm not sure where in our lawbooks there are laws that specifically target humans to the exclusion of human-operated tools.

There's also a TON of irony here. What an about face it is, for the community at large* to switch from "information wants to be free, we support copyleft and FOSS" to leaning so heavily on an incredibly conservative reading of IP law.

reply

upvote

by lelanthran7 hours ago|

[-]

> I'm not sure where in our lawbooks there are laws that specifically target humans to the exclusion of human-operated tools.

It doesn't need to. Laws are for humans.

Laws don't give rights to chainsaws. Or lawnmowers. Or kitchen knives, hammers, screwdrivers, and spades.

You can't use any of those to commit a crime and then claim that the law specifically did not exclude those tools.

Why are you seemingly in favour of carving out an exemption for LLMs?

Laws are for humans.

Arguing that the law did not specifically address "intentionally killing a person by tickling them till they died" means that you found a loophole which can be used to kill people is...

well, it's in the "not even wrong" category...

reply

upvote

by gspr3 hours ago|

[-]

> I'm not sure where in our lawbooks there are laws that specifically target humans to the exclusion of human-operated tools.

If we take the point of view that LLMs are tools (I agree), then people need to be absolutely certain that these tools don't contain (compressed) representations of copyrighted works.

People seem not to want to do that. And they argue that the LLMs have "learned" or "been inspired" by the copyrighted works, which is OK for humans.

This is the problem. People can't even agree on which of two mutually exclusive defenses to appeal to! Are LLMs tools which we have to ensure aren't used to reproduce copyrighted work without permission, or are they entities that can be granted exemptions like humans can? It can't be both!

> There's also a TON of irony here. What an about face it is, for the community at large* to switch from "information wants to be free, we support copyleft and FOSS" to leaning so heavily on an incredibly conservative reading of IP law.

True. While IP-owning companies like Microsoft now say "it's online, so we can use it".

It's bizarre.

I'll tell you what: I'll drop my conservative stance in defense if FOSS when Windows and the latest Hollywood movie are "fair use" for consumption by whatever LLM I cook up.

reply

upvote

by ako17 hours ago|

[-]

Because that allows us to create useful tools that we didn't have before. For me it feels like a carpenter going from a hand-saw to an electrical saw. Still requires the skills of a good carpenter, but faster and easier.

reply

upvote

by gspr17 hours ago|

[-]

… so a bunch of people just decided that rights we granted to humans also apply to their tools? Without any discussion? This isn't how anything is supposed to work when it comes to common rules!

reply

upvote

by RealityVoid15 hours ago|

[-]

The common rules are so because we agree on them. On principle, in this case, we do not agree what the rule should be here and it's in a way unprecedented. We'll soon converge to a societal agreement. I hope society abstaining itself from tools will not be the answer.

reply

upvote

by gspr4 hours ago|

[-]

And the process by which we agree is lawmaking.

reply

upvote

by 18 hours ago|

[-]

deleted

reply