by sheepscreek 5 hours ago

by SlinkyOnStairs 5 hours ago

Reputation isn't very relevant here. Yes, for established, well-known FOSS developers, their reputation will tank if they put out sloppy PRs, and people will just ignore them.

But the projects aren't drowning under PRs from reputable people. They're drowning in drive-by PRs from people with no reputation to speak of. Even if you outright ban their account, they'll just spin up a new one and try again.

Blocking AI submissions serves as a heuristic to reduce this flood of PRs, because the alternative is to ban submissions from people without reputation, and that'd be very harmful to open source.

And AI cannot be the solution here, because open source projects have no funds. Asking maintainers to fork over $200/month for "AI code reviews" just kills the project.

by bityard 5 hours ago

> because the alternative is to ban submissions from people without reputation, and that'd be very harmful to open source.

Hmmm, no? That's actually very common in open source. Maybe "banning" isn't the right word, but lots of projects don't accept random drive-by submissions and never have. Debian is a perfect example: you are very unlikely to get a nontrivial patch or package into Debian unless you have some kind of interaction or rapport with a package maintainer, or commit to the process of building trust to become a maintainer yourself.

I have seen high-profile GitHub projects that summarily close PRs if you didn't raise the bug/feature as an issue or join their Discord first.

by SlinkyOnStairs 4 hours ago

Setting aside "make an issue first", because issues too are flooded with LLM output.

> you are very unlikely to get a nontrivial patch or package into Debian unless you have some kind of interaction or rapport with a package maintainer

I did mean the "trivial" patches as well, as often it's a lot of these small little fixes to single issues that improve software quality overall.

But yes, it's true that it's not uncommon for projects to refuse outside PRs.

This already causes massive amounts of friction and contributes (heh) heavily to what makes Open Source such a pain in the ass to use.

Conversely, many popular "good" open source libraries rely extensively on this inflow of small contributions to become comprehensively good.

And so it's a tradeoff. Forcing all open source into refusing drive-by PRs will have costs. What makes sense for major security-sensitive projects with large resources doesn't make sense for others.

It's not that we won't have open source at all. It's that it'll just be worse and encourage further fragmentation. E.g. you don't build a good .ZIP library by carefully reading the specification; you get it by collecting a million little examples of weird zip files in the wild breaking your code.

by LtWorf 2 hours ago

You can literally just attach a patch to a bug report on Debian…

by hombre_fatal 5 hours ago

Well, the problem you just outlined is a reputation (+ UI) problem: why are contributions from unknown contributors shown at the same level as PRs from known quality contributors, for example?

We need to rethink some UX design and processes here, not pretend low quality people are going to follow your "no low quality pls i'm serious >:(" rules. Rather, design the processes against low quality.

Also, we're in a new world where code-change PRs are trivial, and the hard part isn't writing code anymore but generating the spec. Maybe we don't even allow PRs anymore except for trusted contributors, and everyone else can only create an issue and help refine a plan there, from which the code impl is derived?

You know, even before LLMs, it would have been pretty cool if we had a better process around deliberating and collaborating around a plan before the implementation step of any non-trivial code change. Changing code in a PR with no link to discussion around what the impl should actually look like always did feel like the cart before the horse.

by SlinkyOnStairs 5 hours ago

In the long distant past of 4-5 years ago, it simply wasn't a problem. Few projects were overwhelmed with PRs to begin with.

And for the major projects where there was a flood of PRs, it was fairly easy to identify if someone knew what they were talking about by looking at their language: correct use of jargon, especially domain-specific jargon.

The broader reason why "unknown contributor" PRs were held in high regard is that, outside of some specific incidents (thank you, DigitalOcean and your stupid T-shirts), the odds were pretty good of a drive-by PR coming from someone who identified a problem in your software by using it. Those are incredibly valuable PRs, especially as the work of diagnosing the problem generally also identifies the solution.

It's very hard to design a UX that impedes clueless fools spamming PRs but not the occasional random person who finds sincere issues and has the time to identify and fix them, but not to become a permanent project contributor.

> and the hard part isn't writing code anymore but generating the spec

My POV: This is a bunch of crap and always has been.

Any sufficiently detailed specification is code. And the cost of writing such a specification is the cost of writing code. Every time "low code" has been tried, it doesn't work for this very reason.

e.g. The work of a ticket "Create a product category for 'Lime'" consists not of adding a database entry and typing in the word 'Lime', it consists of the human work of calling your client and asking whether it should go under Fruit or Cement.

by bombcar 5 hours ago

Because until now, unknown contributors either submitted obvious junk which could be closed by even an unskilled moderator (I've done triage work for OS projects before) or they submitted something that was workable and a good start.

The latter is where you get all known contributors from! So if you close off unknown contributors the project will eventually stagnate and die.

by dudeinhawaii 4 hours ago

I don't see why we can't have AI powered reviews as a verification of truth and trust score modifier. Let me explain.

1. You lay out a policy stating that all code, especially AI code, has to be written to a high quality level and have been reviewed for issues prior to submission.

2. Given that even the fastest AI models do a great job of code reviews, you set up an agent using Codex-Spark or Sonnet, etc. to scan submissions for a few different dimensions (maintainability, security, etc).

3. If a submission comes through that fails review, that's a strong indication that the submitter hasn't put even the lowest effort into reviewing their own code. Especially since most AI models will flag similar issues. Knock their trust score down and supply feedback.

3a. If the submitter never acts on the feedback - close the submission and knock the trust score down even more.

3b. If the submitter acts on the feedback - boost trust score slightly. We now have a self-reinforcing loop that pushes thoughtful submitters to screen their own code. (Or AI models to iterate and improve their own code.)

4. Submission passes and trust score of submitter meets some minimal threshold. Queued for human review pending prioritization.

I haven't put much thought into this but it seems like you could design a system such that "clout chasing" or "bot submissions" would be forced to either deliver something useful or give up _and_ lose enough trust score that you can safely shadowban them.
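A minimal sketch of how steps 3, 3a, 3b and 4 could fit together, just to make the loop concrete. Everything named here is an assumption for illustration: the starting score, the penalty and bonus sizes, the threshold, the state names, and the `triage` function itself. The automated review from step 2 is not modelled; its pass/fail result is passed in by the caller. The list above also doesn't say what happens to a passing submission from a low-trust submitter, so the "held" state is a guess.

```python
from dataclasses import dataclass

# All numbers below are made-up illustration values, not a recommendation.
FAILED_REVIEW_PENALTY = 10
IGNORED_FEEDBACK_PENALTY = 20
ACTED_ON_FEEDBACK_BONUS = 5
MIN_TRUST_FOR_HUMAN_REVIEW = 40


@dataclass
class Submitter:
    name: str
    trust: int = 50  # hypothetical starting score


def triage(submitter, passed_review, acted_on_feedback=None):
    """Update the submitter's trust score and return the submission's next state.

    passed_review: outcome of the automated review (step 2), supplied by the caller.
    acted_on_feedback: None until known, then True (step 3b) or False (step 3a).
    """
    if not passed_review:
        if acted_on_feedback is None:
            # Step 3: failed review, knock trust down and send feedback.
            submitter.trust -= FAILED_REVIEW_PENALTY
            return "feedback_sent"
        if acted_on_feedback:
            # Step 3b: submitter responded, small boost.
            submitter.trust += ACTED_ON_FEEDBACK_BONUS
            return "awaiting_resubmission"
        # Step 3a: feedback ignored, close and penalize further.
        submitter.trust -= IGNORED_FEEDBACK_PENALTY
        return "closed"
    # Step 4: passed review, queue only if trust meets the threshold.
    if submitter.trust >= MIN_TRUST_FOR_HUMAN_REVIEW:
        return "queued_for_human_review"
    return "held_low_trust"  # assumption: not specified in the steps above
```

With these made-up numbers, a submitter who fails review once and then ignores the feedback drops from 50 to 20 and falls below the human-review threshold, which is roughly the "safely shadowban" outcome described above.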

by SlinkyOnStairs 3 hours ago

The immediate problem is just cost. Open Source has no money, so any fancy AI solution is off the table immediately.

In terms of your plan though, you're just building a generative adversarial network here. Automated review is relatively easy to "attack".

Yet human contributors don't put up with having to game an arbitrary score system. StackOverflow imploded in no small part because of it.

by lich_king 4 hours ago

> Precisely. “AI” contributions should be seen as an extension of the individual.

That's an OK view to hold, but I'll point out two things. First, it's not how the tech is usually wielded to interact with open-source software. Second, your worldview is at odds with the owners of this technology: the main reason why so much money is being poured into AI coding is that it's seen by investors as a replacement for the individual.

by aerodexis 5 hours ago

Interesting argument for AI ethics in general. It takes the form of "guns don't kill people - people kill people".

by jazzyjackson 5 hours ago

Unfortunately ChatGPT turned “text continuation” into “separate entity you can talk to”

by aerodexis 4 hours ago

The desire to anthropomorphize LLMs is super interesting. People naturally anthropomorphize technology (even printers: "why are you not working!?"). It's a natural and useful heuristic. However, I can easily see how ChatGPT would want to intensify this tendency in order to sell the technology's "agency" and the promise that it can solve all your problems. And since it's a heuristic, it papers over a lot of details that one would do well to understand.

(as an aside - this reminds me of the trend of Object Oriented Ontology that specifically /tried/ to imbue agency onto large-scale phenomena that were difficult to understand discretely. I remember "global warming" being one of those things - and I can see now how this philosophy would have done more to obscure the dominion of experts wrt that topic)

by dataflow 5 hours ago

I don't think any side on the issue of gun ownership has ever claimed that statement is false, so I'm not sure what your point is.

by johnnyanmac 3 hours ago

The point is that this is a common pro-gun argument to deflect from the fact that making guns harder to own does in fact reduce gun violence. Which is how much of the rest of the world works.

But post Sandy Hook, it's clear which side prevailed in this argument.

by glhaynes 5 hours ago

An argument that I have some sympathy for, while still being moderately+ in favor of gun control (here in the USA where I'm a citizen).

It seems that gun control—though imperfect—in regions that have implemented it has had a good bit of success and the legitimate/non-harmful capabilities lost seem worth it to me in trade for the gains. (Reasonable people can disagree here!)

Whereas it seems to me that if we accept the proposition that the vast majority of code in the future is going to be written by AI (and I do), these valuable projects that are taking hard-line stances against it are going to find themselves either having to retreat from that position or facing insurmountable difficulties in staying relevant while holding to their stance.

by estebank 5 hours ago

> these valuable projects that are taking hard-line stances against it are going to find themselves either having to retreat from that position or facing insurmountable difficulties in staying relevant while holding to their stance.

It is the conservative position: it will be easier to walk back the policy and start accepting AI produced code some time down the road when its benefits are clearer than it will be to excise AI produced code from years prior if there's a technical or social reason to do that.

Even if the promise of AI is fulfilled and projects that don't use it are comparatively smaller, that doesn't mean there's no value in that, in the same way that people still make furniture in wood with traditional methods today even if a company can make the same widget cheaper in an almost fully automated way.

by datsci_est_2015 5 hours ago

> It seems that gun control—though imperfect—in regions that have implemented it has had a good bit of success and the legitimate/non-harmful capabilities lost seem worth it to me in trade for the gains.

This is even true despite the fact that there are bad actors only a few minutes' drive away in many cases (Chicago->Indiana border, for example).
