upvote
If consent to use of your code in AI training can be revoked at any time, that makes training impossible, since if anyone ever withdraws consent, it's not like you can just take out their work from your finished model.
reply
Yup. Not my problem.

You could even say it strongly would very strongly incentivize the LLM companies to be on their best behavior, otherwise people would start revoking consent en-masse and they'd have to keep training new models all the time.

If you want something more realistic, there would probably be time limits how long they have to comply and how much they have to compensate the authors for the time it took them to comply.

There absolutely are ways to make it work in mutually beneficial ways, there's just no political will because of the current hype and because companies have learned they can get away with anything (including murder BTW).

reply
Almost all the productivity enhancement provided by an AI coding assistant is provided by circumventing the copyright laws, with the remaining enhancement being provided by the fact that it automates the search-copy-paste loop that you would do if you had direct access to the programs used during training.

(Much of the apparent gain of the automatic search-copy-paste is wasted by skipping the review phase that would have been done at that time when that were done manually, which must then be done in a slower manner when you must review the harder-to-understand entire program generated by the AI assistant.)

Despite the fact that AI coding assistants are copyright breaking tricks, the fact that this has become somehow allowed is an overall positive development.

The concept of copyright for programs has been completely flawed from its very beginning. The reason is that it is absolutely impossible to write any kind of program that is not a derivative of earlier programs.

Any program is made by combining various standard patterns and program structures. You can construct a derivation sequence between almost any 2 programs, where you decompose the first in some typical blocks, than compose the second program from such blocks, while renaming all identifiers.

It is quite subjective to decide when a derivation sequence becomes complex enough that the second program should not be considered as a derivative of the first from the point of view of copyright.

The only way to avoid the copyright restrictions is to exploit loopholes in the law, e.g. if translating an algorithm to a different programming language does not count as being derivative or when doing other superficial automatic transformations of a source program changes its appearance sufficiently that it is not recognized as derivative, even if it actually is. Or when combining a great number of fragments from different programs is again not recognized as derivative, though it still kind of is.

The only way how it became possible for software companies like Microsoft or Adobe to copyright their s*t is because the software industry based on copyrighted programs has been jumpstarted by a few decades of programming during which programs were not copyrighted, which could then be used as a base by the first copyrighted programs.

So AI coding agents allow you to create programs that you could not have written when respecting the copyright laws. They also may prevent you from proving that a program written by someone else infringes upon the copyright that you claim for a program written with assistance.

I believe that both these developments are likely to have more positive consequences than negative consequences. The methods used first in USA and then also in most other countries (due to blackmailing by USA) for abusing the copyright laws and the patent laws have been the most significant blockers of technical progress during the last few decades.

The most ridiculous claim about the copyright of programs is that it is somehow beneficial for "creators". Artistic copyrights sometimes are beneficial for creators, but copyrights on non-open-source programs are almost never owned by creators, but by their employers, and even those have only seldom any direct benefit from the copyright, but they use it with the hope that it might prevent competition.

reply
> The reason is that it is absolutely impossible to write any kind of program that is not a derivative of earlier programs.

And that's why copyright has exceptions for humans.

You're right copyright was the wrong tool for code but for the wrong reasons.

It shouldn't be binary. And the law should protect all work, not just creative. Either workers would come to a mutual agreement how much each contributed or the courts would decide based on estimates. Then there'd be rules about how much derivation is OK, how much requires progressively more compensation and how much the original author can plainly tell you what to do and not do with the derivative.

It's impossible to satisfy everyone but every person has a concept of fairness (it has been demonstrated even in toddlers). Many people probably even have an internally consistent theory of fairness. We should base laws on those.

> abusing the copyright laws and the patent laws have been the most significant blockers of technical progress during the last few decades

Can you give examples?

> copyrights on non-open-source programs are almost never owned by creators, but by their employers

Yes and that's another thing that's wrong with the system, employment is a form of abusive relationship because the parties are not equal. We should fix that instead of throwing out the whole system. Copyright which belongs to creators absolutely does give creators more leverage and negotiating power.

reply