Yes, this is the reason I've completely stopped releasing any open-source projects. I'm discovering that newer models are somewhat capable of reverse-engineering even compiled WebAssembly and the like, so I can feel a sort of "dark forest theory" taking hold. Why publish anything - open or closed - to be ripped off at negligible marginal cost?

People just aren't realizing this yet because it's mostly hobby projects and companies doing it in private, but eventually everyone will realize that LLMs allow almost any software to be reverse-engineered cheaply.

See e.g. https://banteg.xyz/posts/crimsonland/ , a single human with the help of LLMs reverse engineered a non-trivial game and rewrote it in another language + graphics lib in 2 weeks.

It’s a real problem. I threw an LLM at an old MUD game just to see how hard it is [0], then used differential testing and LLMs to rewrite it [1]. It just seems to be a matter of time and money.

[0] https://reorchestrate.com/posts/your-binary-is-no-longer-saf...

[1] https://reorchestrate.com/posts/your-binary-is-no-longer-saf...
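
For anyone unfamiliar with differential testing: the idea is to feed the same inputs to the original and to the rewrite and flag any input where the outputs differ. Here is a minimal sketch, not the setup from the posts above. Both `reference_impl` and `rewritten_impl` are hypothetical stand-ins with a made-up formula; in a real port the reference side would be the original binary driven through some harness.

```python
import random

def reference_impl(level: int, roll: int) -> int:
    """Hypothetical stand-in for the original program's behaviour."""
    return (level * 3 + roll) // 2

def rewritten_impl(level: int, roll: int) -> int:
    """Hypothetical stand-in for the rewrite; should agree with the reference."""
    return (3 * level + roll) >> 1

def differential_test(ref, new, trials=1000, seed=0):
    """Run both implementations on the same random inputs and collect mismatches."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    mismatches = []
    for _ in range(trials):
        args = (rng.randint(0, 100), rng.randint(1, 20))
        expected, actual = ref(*args), new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches

mismatches = differential_test(reference_impl, rewritten_impl)
print(f"{len(mismatches)} mismatches")
```

Each recorded mismatch keeps the failing input next to both outputs, so it can be handed back to the LLM (or a human) as a concrete bug report.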

Wow, as a former MajorMUD addict (~30 years ago) that's extremely interesting to see. Especially since MajorMUD is rarely discussed on HN, even in MUD or BBS-related threads.

Did you find it worked reasonably well on any portion of the codebase you could throw at it? For example, if I recall correctly, all of MajorMUD's data file interactions used the embedded Btrieve library which was popular at the time. For that type of specialized low-level library, I'm curious how much effort it would take to get readable code.

I am getting closer and closer to a full verified rewrite in Rust. I have also moved to a much easier SQLite relational structure for the backend.

I actually sidestepped the annoying Btrieve problem by exporting the data using a Go binary [0] and writing it to a SQLite instance as raw byte arrays (blobs). Btrieve is weird because it has a DLL but also a service to interact with the files.

P.s. I have spent a lot of hours on this mainly to learn actual LLM capabilities that have improved a huge amount in the last year.

[0] https://github.com/barchart/go-btrieve
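
To illustrate the "raw byte arrays in SQLite" idea: the sketch below stores already-exported records as blobs, one row per record. It is written in Python for brevity rather than Go, and the table name `raw_records`, its columns, the file name `example.dat`, and the sample bytes are all assumptions made up for this example, not the actual schema or data.

```python
import sqlite3

def store_raw_records(conn, source_file, records):
    """Store each exported record as an opaque BLOB, keyed by file and position."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        " source_file TEXT NOT NULL,"
        " record_no INTEGER NOT NULL,"
        " data BLOB NOT NULL,"
        " PRIMARY KEY (source_file, record_no))"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO raw_records (source_file, record_no, data) VALUES (?, ?, ?)",
            [(source_file, i, bytes(rec)) for i, rec in enumerate(records)],
        )

# Hypothetical sample data standing in for records exported from a data file.
conn = sqlite3.connect(":memory:")
store_raw_records(conn, "example.dat", [b"\x01\x00Sword\x00", b"\x02\x00Shield"])
rows = conn.execute(
    "SELECT record_no, data FROM raw_records ORDER BY record_no"
).fetchall()
print(rows)
```

Keeping the bytes untouched means the record layout can be decoded later, field by field, without going back to the original files.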

Why does it matter if it is 'ripped off' if you released it as open source anyway? I get that you might want to impose a particular licence, but is that the only reason?

Even the most permissive open source licenses such as MIT require attribution. Releasing as open source would therefore benefit the author through publicity. Being able to say that you're the author of library X, used by megacorp Y with great success, is a good selling point in a job interview.

LLM ripping off open source code removes that.

Yes, that's a good point.

This is pretty much exactly why copyright laws came about in the first place. Why bother creating a book, painting, or other work of art if anyone can trivially copy it and sell it without handing you a dime?

I think refusing to publish open source code right now is the safe bet. I know I won't be publishing anything new until this gets definitively resolved, and will limit myself to contributing to a handful of existing open source projects.

It's not just open source, it is literally anything source-available, whether intentional or not.

I find the wording "protect from unwanted use" interesting.

It is my understanding that what a GPL license requires is releasing the source code of modifications.

So if we assume that a rewrite using AI retains the GPL license, it only means the rewrite needs to be open source under the GPL too.

It doesn't prevent any unwanted use, or at least that is my understanding. I guess unwanted use in this case could mean not releasing the modifications.

If the AI product is recognised as a "derivative work" of a GPL-licensed project, then it must itself be licensed under the GPL. Otherwise, it can be released under any other license (including closed-source/proprietary binary licenses). This last option is what threatens to kill open source: an author no longer has control over their project. This might be acceptable for permissive licenses, but for GPL/AGPL and similar licenses, preventing it is precisely the main reason they exist: to stop the code from being taken, modified, and treated as closed source (including possible use as part of commercial products or SaaS).

Yeah, the GPL is deficient in that way and doesn't handle other hostile uses.

If you'd be willing to close source your "libre" open source project because somebody might do something you don't like with it, you never wanted a "libre" project.

In this case someone is making a non-libre project with it.