Most LLMs are trained on the source code of many open-source projects. This 'project' makes a whole song and dance about never seeing the source code and separating the system to skirt legal trouble. Why hasn't anyone done that yet?

by imiric, 3 hours ago

Because that's impossible. Any "robot" that can generate code must be trained on massive amounts of code, most of which is open source.

by sdwr, 3 hours ago

And how are you supposed to guarantee equivalent functionality by analyzing "README files, API docs, and type definitions"?

by Nolski, 2 hours ago

It's described on the web page: it uses two agents. One has access to the code and one doesn't.

by fmbb, 2 hours ago

Are they the same model?

Not that it matters, I just think the joke is more fun if they are different.

by dymk, 2 hours ago

The joke is that you don’t.

by preisschild, 3 hours ago

Not a lot of code is public domain, so not a lot of training data is available.

by phyzome, 1 hour ago

For each project you want to rip off, you'd have to first train an entirely new LLM on all sources except for the target project. Prohibitively expensive.