[-]

It's not even clear you can license language model weights, though.

I'm not a lawyer, but the analysis I've read made a pretty strong argument that training involves no human creativity: it is an entirely automatic process, so its output cannot be copyrighted in any way (the same way you cannot put a license on a software artifact just because you compiled it yourself; you must own the copyright on the source code you're compiling).

[-]

IANAL either, but the answer likely depends on the jurisdiction.

US standards for copyrightability require human creativity and model weights likely don’t have the right kind of human creativity in them to be copyrightable in the US. No court to my knowledge has ruled on the question as yet, but that’s the US Copyright Office’s official stance.

By contrast, standards for copyrightability in the UK are a lot weaker than in the US - and so, although no court has ruled on the issue in the UK yet either, it seems likely a UK court would hold model weights to be copyrightable

So from Google/Meta/etc’s viewpoint, asserting copyright makes sense, since even if the assertion isn’t legally valid in the US, it likely is in the UK - and not just the UK, many other major economies too. Australia, Canada, Ireland, New Zealand tend to follow UK courts on copyright law not US courts. And many EU countries are closer to the UK than the US on this as well, not necessarily because they follow the UK, often because they’ve reached a similar position based on their own legal traditions

Finally: don’t be surprised if Congress steps in and tries to legislate model weights as copyrightable in the US too, or grants them some sui generis form of legal protection which is legally distinct from copyright but similar to it - I can already hear the lobbyist argument, “US AI industry risks falling behind Europe because copyrightability of AI models in the US is legally uncertain and that legal uncertainty is discouraging investment” - I’m sceptical that is actually true, but something doesn’t have to be true for lobbyists to convince Congress that it is

by lawlessone 226 days ago

[-]

> don’t be surprised if Congress steps in and tries to legislate model weights as copyrightable in the US too

"Your Honor, I didn't copy their weights, I used them to train my model's weights"

by simonw 226 days ago

[-]

> US standards for copyrightability require human creativity and model weights likely don’t have the right kind of human creativity in them to be copyrightable in the US. No court to my knowledge has ruled on the question as yet, but that’s the US Copyright Office’s official stance.

Has the US copyright office said that about model weights? I've only heard them saying that about images produced entirely from a prompt to a model.

[-]

I thought I read something by them explicitly addressing the question but I can’t find it now.

However, read page 22 of https://www.copyright.gov/comp3/chap300/ch300-copyrightable-... - it is their settled position that the output of a mechanical process cannot be copyrightable unless there was substantial human creative input into it - and it is pretty clear that AI training doesn’t involve human creative input in the relevant sense. Now, no doubt there is lots of human skill and art in picking the best hyperparameters, etc - but that’s not input of the right kind. An analogy - a photocopier does not create a new copyright in the copy, even though there is skill and art in picking the right settings on the machine to produce the most faithful copy. The human creativity in choosing hyperparameters isn’t relevant to copyrightability because it isn’t directly reflected in the creative elements of the model itself

A model with RLHF fine-tuning could be a different story - e.g. Anthropic went to a lot of effort to make Claude speak with a distinctive “voice”, and some of that involved carefully crafting data to use for fine-tuning, and the model may contain some of the copyright of that training data.

But even if that argument also applies to Gemma or Llama, someone who intentionally fine-tunes the model further in order to remove that distinctive “voice” has removed the copyrightable element from the model, and what is left isn’t copyrightable. The really expensive part of building a model is building the foundation model, and that’s the part least likely to be copyrightable; fine-tuning to speak with a distinctive voice is more likely to be copyrightable, but that’s the easy part, and easy to rip out (and people have motivation to do so, because a lot of people want a model which speaks with a different voice instead)

by tough 226 days ago

[-]

A very good lawyer could argue that creating the data sets for training, doing the evals, and RLHF constitute -human creativity- and not a mechanical endeavor.

But who knows, judges can be weird about tech.

[-]

Right, but it isn’t legally enough for there to be creativity in the supervision of the mechanical process - that creativity has to take the form of creative elements which survive in some identifiable form in the end product. The technical skill of managing a mechanical process can involve a great deal of creativity, but that doesn’t legally count as “creative” unless that is directly surfaced in the model output

I think the case is the strongest with RLHF - if your model speaks with a distinctive “voice”, and to make it do so you had to carefully craft training data to give it that voice, such that there are obvious similarities (shared turns of speech, etc) between your RLHF training input and the model outputs - that aspect of the model likely is copyrightable. But if you are trying to improve a model’s performance at mathematics problems, then no matter how much creativity you put into choosing training data, it is unlikely identifiable creative elements from the training data survive in the model output, which suggests that creativity didn’t actually make it into the model in the sense relevant to US copyright law

by 47282847 226 days ago

[-]

In that line of reasoning, does it really matter how “close” jurisdictions are to each other (also considering that what courts rule doesn’t matter as much in countries governed by civil law), or merely the enforcement of the Berne convention? As in, if something is considered to be under copyright in any one of its signatory countries, do the others have to respect that?

[-]

No, the Berne convention doesn’t work that way. It requires you to extend copyright protection to the works of the nationals of the other parties on the same terms as you offer it to the works of your own nationals; but if a certain category of works is excluded from copyright for your own nationals, it doesn’t require you to recognise copyright in those works when authored by foreign nationals, even if their own country’s laws do

Real example: UK law says telephone directories are eligible for copyright, US law says they aren’t. The US is not violating the Berne convention by refusing to recognise copyright in UK phone directories, because the US doesn’t recognise copyright in US phone directories either. A violation would be if the US refused to recognise copyright in UK phone directories but was willing to recognise it in US ones

by 47282847 225 days ago

[-]

Makes sense. Thanks!

by dragonwriter 226 days ago

[-]

> It's not even clear you can license language model weights, though.

It is clear you can license (give people permissions to) model weights; it is less clear that there is any law protecting them such that they need a license. But since there is always a risk of suit and subsequent loss in the absence of clarity, licenses are at least beneficial in reducing that risk.

by AlanYx 226 days ago

[-]

That's one of the reasons why they gate Gemini Nano with the "Gemini Nano Program Additional Terms of Service". Even if copyright doesn't subsist in the weights or if using them would be fair use, they still have recourse in breach of contract.

by derefr 226 days ago

[-]

I've wondered about this for a while now (e.g. some models on Hugging Face require clickwrap license agreements to download, which try to prohibit you from using the model in certain ways).

It seems to me that if some anonymous ne'er-do-well were to publicly re-host the model files for separate download; and you acquired the files from that person, rather than from Google; then you wouldn't be subject to their license, as you never so much as saw the clickwrap.

(And you wouldn't be committing IP theft by acquiring it from that person, either, because of the non-copyrightability.)

I feel that there must be something wrong with that logic, but I can't for the life of me think of what it is.

[-]

The problem is that contracts don’t bind subsequent recipients, copyright does

Google gives the model to X who gives it to Y who gives it to Z. X has a contract with Google, so Google can sue X for breach of contract if they violate its terms. But do Y and Z have such a contract? Probably not. Of course, Google can put language in their contract with X to try to make it bind Y and Z too, but is that language going to be legally effective? More often than not, no. The language may enable Google to successfully sue X over Y and Z’s behaviour, but not successfully sue Y and Z directly. Whereas, with copyright, Y and Z are directly liable for violations just as X is

by jinlisp 226 days ago

[-]

Thank you, this is a nice point to consider. I don't know whether using the weights could be considered equivalent to, or as implying, acceptance of the weight creators' terms of service.

[-]

Contracts require agreement (a “meeting of the minds”)… if X makes a contract with Google, that contract between Google and X can’t create a contract between Google and Y without Y’s agreement. Of course, Google’s lawyers will do all they possibly can to make the contract “transitive”, but the problem is contracts fundamentally don’t have the property of transitivity.

Now, if you are aware of a contract between two parties, and you actively and knowingly cooperate with one of them in violating it, you may have some legal liability for that contractual violation even though you weren’t formally party to the contract, but there are limits - if I know you have signed an NDA, and I personally encourage you to send me documents covered by the NDA in violation of it, I may indeed be exposed to legal liability for your NDA violation. But, if we are complete strangers, and you upload NDA-protected documents to a file sharing website, where I stumble upon them and download them - then the legal liability for the NDA violation is all on you, none on me. The owner of the information could still sue me for downloading it under copyright law, but they have no legal recourse against me under contract law (the NDA), because I never had anything to do with the contract, neither directly nor indirectly

If you download a model from the vendor’s website, they can argue you agreed to the contract as a condition of being allowed to make the download. But if you download it from elsewhere, what is the consideration (the thing they are giving you) necessary to make a binding contract? If the content of the download is copyrighted, they can argue the consideration is giving you permission to use their copyrighted work; but if it is an AI model and models are uncopyrightable, they have nothing to give when you download it from somewhere else and hence no basis to claim a contractual relationship

What they’ll sometimes do is put words in the contract saying that you have to impose the contract on anyone else you redistribute the covered work to. And if you redistribute it in full compliance with those terms, your recipients may find themselves bound by the contract just as you are. But if you fail to impose the contract when redistributing, the recipients escape being bound by it, and the legal liability for that failure is all yours, not theirs

by jinlisp 225 days ago

[-]

Thanks for such a clear and logical explanation; it is a pleasure to read explanations like this. That said, I am always skeptical about how law is applied. Sometimes the spirit of the law is bent by the weight of powerful organizations; perhaps there are some books which explain how the spirit of the law is not applied when powerful organizations are able to tame it.

by km3r 226 days ago

[-]

Why not? Training isn't just "data in/data out". The training process is continuously tweaked and adjusted, with many of those adjustments being specific to the type of model you are trying to output.

[-]

The US copyright office’s position is basically this - under US law, copyrightability requires direct human creativity, and an automated training process involves no direct human creativity so cannot produce copyright. Now, we all know there is a lot of creative human effort in selecting what data to use as input, tinkering with hyperparameters, etc - but the copyright office’s position is that doesn’t legally count - creative human effort in overseeing an automated process doesn’t change the fact that the automated process itself doesn’t directly involve any human creativity. So the human creativity in model training fails to make the model copyrightable because it is too indirect

By contrast, UK copyright law accepts the “mere sweat of the brow” doctrine: the mere fact you spent money on training is likely sufficient to make its output copyrightable. UK law doesn’t impose the same requirements for a direct human creative contribution

by IncreasePosts 226 days ago

[-]

Doesn't that imply just the training process isn't copyrightable? But weights aren't just training, they're also your source data. And if the training set shows originality in selection, coordination, or arrangement, isn't that copyrightable? So why wouldn't the weights also be copyrightable?

[-]

The problem is, can you demonstrate that originality of selection and arrangement actually survives in the trained model? It is legally doubtful.

Nobody knows for sure what the legal answer is, because the question hasn’t been considered by a court - but the consensus of expert legal opinion is copyrightability of models is doubtful under US law, and the kind of argument you make isn’t strong enough to change that. As I said, different case for UK law, nobody really needs your argument there because model weights likely are copyrightable in the UK already

by littlestymaar 225 days ago

[-]

> The problem is, can you demonstrate that originality of selection and arrangement actually survives in the trained model? It is legally doubtful.

It's particularly perilous since the AI trainers are at the same time in a position where they want to argue that the copyrighted works they included in the training data don't actually survive in the trained model.

by badsectoracula 226 days ago

[-]

For the same reason GenAI output isn't copyrightable regardless of how much time you spend tweaking your prompts.

Also, I'm pretty sure none of the AI companies would really want to touch the concept of having the copyright of source data affect the weights' own copyright, considering all of them pretty much hoover up the entire Internet without caring about those copyrights (and IMO claiming that they should be able to ignore the copyrights of training data, and that the GenAI output is not under copyright, while at the same time trying to claim copyright for the weights is dishonest, if not outright leechy).

by rvnx 226 days ago

[-]

The weights are mathematical facts. As raw numbers, they are not copyrightable.

by IncreasePosts 226 days ago

[-]

`en_windows_xp_professional_with_service_pack_3_x86_cd_vl_x14-73974.iso` is also just raw numbers, but I believe Windows XP was copyrightable

by vntok 226 days ago

[-]

Interesting.

From what I understand, copyright only applies to the original source code, GUI and bundled icon/sound/image files. Functionality etc. would fall under patent law. So the compiled code on your .ISO for example would not only be "just raw numbers" but uncopyrightable raw numbers.

by lolc 225 days ago

[-]

Of course copyright applies to binaries too. It's long been established that compiled code is a derived work of its source.

by victorbjorklund 225 days ago

[-]

A computer program is just 0s and 1s. Harry Potter books are just raw letters or raw numbers if an ebook.

(The combination is what makes it copyrightable).

by littlestymaar 225 days ago

[-]

In practice it's not the combination that is copyrighted (you cannot claim copyright over a binary just because you zipped it, or over a movie because you re-encoded it, for instance).

It's the “actual creativity” inside. And it is a fuzzy concept.

by floridianfisher 226 days ago

[-]

According to the Gemma 3n preview blog, Gemma 3n shares the same architecture as the upcoming version of Gemini Nano.

The ‘n’ presumably stands for Nano.

Nano is a proprietary model that ships with Android. Gemma is an open model that can be adapted and used anywhere.

Sources: https://developers.googleblog.com/en/introducing-gemma-3n/

Video in the blog linked in this post

by jabroni_salad 226 days ago

[-]

Gemma is open source and apache 2.0 licensed. If you want to include it with an app you have to package it yourself.

Gemini Nano is an Android API that you don't control at all.

by nicce 226 days ago

[-]

> Gemma is open source and apache 2.0 licensed

Closed source but open weight. Let’s not ruin the definition of the term to the advantage of big companies.

by zackangelo 226 days ago

[-]

Your reply adds more confusion, imo.

The inference code and model architecture ARE open source[0], and there are many other high-quality open source implementations of the model (in many cases contributed by Google engineers[1]). To your point: they do not publish the data used to train the model, so you can't re-create it from scratch.

[0] https://github.com/google-deepmind/gemma
[1] https://github.com/vllm-project/vllm/pull/2964

by candiddevmike 226 days ago

[-]

If for some reason you had the training data, is it even possible to create an exact (possibly same hash?) copy of the model? Seems like there are a lot of other pieces missing like the training harness, hardware it was trained on, etc?

by OneDeuxTriSeiGo 226 days ago

[-]

To be entirely fair, that's quite a high bar even for most "traditional" open source.

And even if you had the same data, there's no guarantee the random perturbations during training are driven by a PRNG and done in a way that is reproducible.

Reproducibility does not make something open source. Reproducibility doesn't even necessarily make something free software (under the GNU interpretation). I mean hell, most docker containers aren't even hash-reproducible.
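
To make the hash-reproducibility point concrete, here is a minimal sketch, assuming a toy "training" loop in which every random choice flows from one seeded PRNG. The names `train_toy_model` and `weights_digest` are hypothetical and purely for illustration; nothing here reflects how Gemma or any real model is trained. Real training runs typically add further sources of nondeterminism (parallel hardware, data ordering, library versions) that this sketch deliberately leaves out.

```python
import hashlib
import random
import struct


def train_toy_model(seed: int, steps: int = 1000, size: int = 8) -> list[float]:
    """Hypothetical stand-in for training: random init plus seeded noisy updates."""
    rng = random.Random(seed)  # the only source of randomness in this toy loop
    weights = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for _ in range(steps):
        i = rng.randrange(size)  # pick one weight to update
        # Pull the weight toward zero, with a little seeded noise added.
        weights[i] -= 0.01 * (weights[i] + rng.gauss(0.0, 0.1))
    return weights


def weights_digest(weights: list[float]) -> str:
    """SHA-256 over the exact little-endian float64 bytes of the weights."""
    raw = struct.pack(f"<{len(weights)}d", *weights)
    return hashlib.sha256(raw).hexdigest()


same_a = weights_digest(train_toy_model(seed=42))
same_b = weights_digest(train_toy_model(seed=42))
other = weights_digest(train_toy_model(seed=43))

print(same_a == same_b)  # compares two runs that used the same seed
print(same_a == other)   # compares runs that used different seeds
```

The design point is only that a bit-identical artifact requires every source of randomness and every numeric step to be pinned down; having the same data is not enough on its own.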

by zackangelo 226 days ago

[-]

Yes, this is true. A lot of times labs will hold back necessary infrastructure pieces that allow them to train huge models reliably and on a practical time scale. For example, many have custom alternatives to Nvidia’s NCCL library to do fast distributed matrix math.

Deepseek published a lot of their work in this area earlier this year and as a result the barrier isn’t as high as it used to be.

by nicce 226 days ago

[-]

I am not sure if this adds even more confusion. The linked library is about fine-tuning, which is a completely different process.

Their publications about producing Gemma are not precise enough for you to get the same results, even with the data.

by zackangelo 226 days ago

[-]

In the README of the linked library they have a code snippet showing how to have a conversation with the model.

Also, even if it were for fine tuning, that would require an implementation of the model’s forward pass (which is all that’s necessary to run it).

by nicce 226 days ago

[-]

That is a completely different discussion. By this logic, even Gemini 2.5 Pro would be open source, since the clients for interacting with the cloud APIs are open source.

by Imustaskforhelp 226 days ago

[-]

Yes!! But I doubt many models are truly, truly open source, since most people just confuse open source with open weights, and the definition has really been changed, smh.

by cesarb 226 days ago

[-]

> Gemma is open source and apache 2.0 licensed.

Are you sure? On a quick look, it appears to use its own bespoke license, not the Apache 2.0 license. And that license appears to have field of use restrictions, which means it would not be classified as an open source license according to the common definitions (OSI, DFSG, FSF).

by jabroni_salad 225 days ago

[-]

Perhaps we could rephrase my statement to "there are a bunch of green checkmarks on github that may or may not mean anything depending on who you ask."

by yencabulator 225 days ago

https://ai.google.dev/gemma/prohibited_use_policy

[-]

Wait, what files are you reading? https://github.com/google-deepmind/gemma/blob/main/LICENSE

(Even then, releasing some source code under Apache-2 does not make a model "open source".)

Ah I found https://ai.google.dev/gemma/terms

  > You must not use any of the Gemma Services:
  >
  > 1. for the restricted uses set forth in the Gemma Prohibited Use Policy at ai.google.dev/gemma/prohibited_use_policy ("Prohibited Use Policy"), which is hereby incorporated by reference into this Agreement; or
  > 2. in violation of applicable laws and regulations.

Yeah, definitely not open source, even if they had released all the training data.

by impure 226 days ago

[-]

I suspect the difference is in the training data. Gemini is much more locked down, and if it tries to repeat something from the training data verbatim you will get a 'recitation error'.

by readthenotes1 226 days ago