What would be the incentive for someone to do this for real?

We all have access to SOTA LLMs. If I want a "clean room" implementation of some OSS library, and I can choose between paying a third party to run a script to have AI rebuild the whole library for me and just asking Claude to generate the bits of the library I need, why would I choose to pay?

I think this argument applies to most straightforward "AI generated product" business ideas. Any dev can access a SOTA coding model for $20 a month. The value-add isn't "we used AI to do the thing fast"; it's the wrapping around it.

Maybe in this case the "wrapping" is that some other company is taking on the legal risk?

reply
There are a lot of things you could do to be malicious towards other people with minimal effort, yet strangely few people do them. Virtually everyone has morals, and most people's are quite compatible with society (hence we have a society), even if small perturbations in foundational morals sometimes lead to seemingly large discrepancies in resultant actions.

You need the right kind of person, in the right life circumstances, to have this idea before it happens for real. With publicity, it becomes vastly more likely that the idea finds someone who meets those two criteria, much as with other crime (https://en.wikipedia.org/wiki/Copycat_crime). So thanks, Malus :P

reply
Also, there's a difference between "willing to do a bad thing for money" and "actively searching out a bad thing, then proactively building a whole company out of it in the hopes of making money."

It's the difference between a developer taking a job at Palantir out of college because nobody had a better offer, and a guy spending years in his basement designing "Immigrant Spotter+" in the hopes of selling it to the government. Sure, they're both evil, but lots of people pick the first thing, and hardly anybody does the second.

reply
What do you mean nobody has done it?

It's an inevitable outcome of automatic code generation that people will do this all the time without thinking about it.

Example: you want a feature in your project, and you know this github repo implements it, so you tell an AI agent to implement the feature and link to the github repo just for reference.

You didn't tell the agent to maliciously reimplement it, but the end result might be the same - you just did it earnestly.

reply
The bottleneck is trust and security. I'd rather defenestrate 3rd party libraries with a local instance of copilot than send all my secret sauce to some cloud/SaaS system.

Put differently, this system already exists and is in heavy use today.

reply
> why hasn't anyone done this for real?

Because LLMs can't program anything of non-trivial complexity, despite the persistent delusions of their advocates; it's the same reason the lovers of OSS haven't magically fixed every bug in open source software.

reply
> why hasn't anyone done this for real?

WDYM? LLMs are essentially this.

reply
Most LLMs are trained on the source code of many open-source projects. This 'project' has the whole song-and-dance about never seeing the source code and separating the system to skirt around legal trouble. Why hasn't anyone done that yet?
reply
Because that's impossible. Any "robot" that can generate code must be trained on massive amounts of code, most of which is open source.
reply
And how are you supposed to guarantee equivalent functionality by analyzing "README files, API docs, and type definitions"?
reply
It's described on the web page: it works by having two agents, one with access to the code and one without.
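A minimal sketch of that two-agent information barrier, with the agent roles and spec format invented purely for illustration (the actual project's pipeline isn't shown on this thread):

```python
# Hypothetical sketch of the two-agent "clean room" setup described above.
# In a real system both agents would be LLM calls; here they are stubbed
# out so the information barrier itself is visible.

def spec_agent(source_code: str) -> str:
    """Sees the original code; may emit ONLY a behavioral description."""
    # A real spec agent would summarize behavior without quoting the body.
    return "Function add(a, b): returns the sum of two numbers."

def impl_agent(spec: str) -> str:
    """Never sees the original code; reimplements from the spec alone."""
    # A real implementer would be prompted with only the spec text.
    return "def add(a, b):\n    return a + b\n"

original = "def add(x, y):\n    return x + y\n"
spec = spec_agent(original)
reimplementation = impl_agent(spec)

# The barrier: the implementer's only input contains no original code.
assert "x + y" not in spec

# The reimplementation is behaviorally equivalent.
namespace = {}
exec(reimplementation, namespace)
assert namespace["add"](2, 3) == 5
```

Whether the spec agent can be trusted never to leak expression-level detail into the spec is exactly the legal question being debated here.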
reply
Are they the same model?

Not that it matters, I just think the joke is more fun if they are different.

reply
The joke is that you don’t.
reply
Not a lot of code is public domain, and thus not a lot of training data is available.
reply
For each project you want to rip off, you'd have to first train an entirely new LLM on all sources except for the target project. Prohibitively expensive.
reply