I think there are a couple of levels here:

First, there's building a system that sufficiently constrains the AI's output, whether through typing, testing, external validation, or, in extremis, manual human review. That gets you the best result out of whatever harness or orchestration you're using.

Second, there's the level at which you're intervening, somewhere along the hierarchy from "validate only usage from the customer's perspective" to "review, edit, and validate every jot and tittle of the codebase and environment". For relatively low-importance things, reviewing at the feature level (all code, but not interim diffs) is fine, but if you're writing a network protocol implementation, you'd better at least validate everything carefully with fuzzing, property testing, or something like that.
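
To make the property-testing point concrete, here's a minimal hand-rolled sketch in Swift. The `encode`/`decode` pair is a hypothetical length-prefixed framing codec invented purely for illustration; the invariant checked is round-trip identity over randomized payloads. A real protocol would warrant a proper fuzzer or property-testing library and a much nastier input corpus.

```swift
import XCTest

// Hypothetical codec under test: a trivial length-prefixed framing,
// defined here only so the sketch compiles on its own.
func encode(_ payload: [UInt8]) -> [UInt8] {
    var out = withUnsafeBytes(of: UInt32(payload.count).bigEndian) { Array($0) }
    out.append(contentsOf: payload)
    return out
}

func decode(_ bytes: [UInt8]) -> [UInt8]? {
    guard bytes.count >= 4 else { return nil }
    let len = bytes.prefix(4).reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
    guard bytes.count == 4 + Int(len) else { return nil }
    return Array(bytes.dropFirst(4))
}

final class FrameCodecProperties: XCTestCase {
    // Property: decode(encode(x)) == x for arbitrary payloads.
    func testRoundTripIdentity() throws {
        for _ in 0..<1_000 {
            let payload = (0..<Int.random(in: 0...512))
                .map { _ in UInt8.random(in: .min ... .max) }
            let decoded = try XCTUnwrap(decode(encode(payload)))
            XCTAssertEqual(decoded, payload, "round trip must be lossless")
        }
    }
}
```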

And then you've got how you structure your feedback to the LLM itself - is it an in-the-loop chat process, an edit-and-retry spec loop, a go/no-go review on a feature branch, or what? How does the process improve itself, basically?

I agree with you entirely that the responsibility rests on the human, but there are a variety of ways to use these things that can raise or lower the ratio of code quality to time spent reviewing, and obviously different tasks warrant different levels of review scrutiny as well.

reply
On the other hand, I don't need to carefully review every line of code in my thumbnail generator and its associated UI.

My nonexistent backend isn’t going to be pwned if there is a bug in the thumbnail generation.

After QA testing on my device, a quick scroll-through of the code is enough.

Maybe prompt "are errors during thumbnail generation caught to prevent app crashes?" if we're feeling extra cautious today.
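
For what it's worth, the kind of guard that prompt is asking about is small. A minimal Swift sketch (function names are illustrative, not from the actual app): any failure during thumbnail extraction falls back to a placeholder instead of crashing.

```swift
import AVFoundation
import UIKit

// A failure anywhere in thumbnail extraction (corrupt file, unsupported
// codec, DRM) is caught and degraded to a placeholder instead of
// crashing the app.
func safeThumbnail(for url: URL) -> UIImage {
    let asset = AVURLAsset(url: url)
    let generator = AVAssetImageGenerator(asset: asset)
    generator.appliesPreferredTrackTransform = true
    do {
        let frame = try generator.copyCGImage(
            at: CMTime(seconds: 1, preferredTimescale: 600),
            actualTime: nil)
        return UIImage(cgImage: frame)
    } catch {
        // Log and fall back; the gallery shows a generic icon instead.
        print("Thumbnail generation failed: \(error)")
        return UIImage(systemName: "photo") ?? UIImage()
    }
}
```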

And just like that it saved a day of work.

reply
> My nonexistent backend isn’t going to be pwned if there is a bug in the thumbnail generation.

Hmm. Historically, image processing has been one of the easier-to-exploit security holes in many systems. How would you feel about unknown entities having shell access inside your datacenter or VPC?

reply
I feel pretty good about the odds of attackers exploiting security holes in image editing functions my app does not have, in order to enter my equally nonexistent datacenter or VPC.
reply
But a thumbnail generator is at most a one-hour task if you're on a solo greenfield project, and it'll still be a six-week project at an enterprise, even with AI.
reply
I would be impressed if you implemented it in an hour with the following features:

- webview fallback with canvas capture for codecs not supported in the default player

- detecting blank frames and diffing between thumbnails to maximize variety (see the sketch after this list)

- UI integration to visualize progress and pending thumbnails, batched updates to the gallery

- versioning scheme and backfill for missing/outdated thumbnail formats
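
For the blank-frame point specifically, here's a rough Swift sketch of what detection could look like, using `CIAreaAverage` to reject frames that are almost uniformly dark or bright. Names and thresholds are illustrative assumptions, not a claim about how the actual app works.

```swift
import AVFoundation
import CoreImage

// Heuristic: treat a frame as "blank" when its average luminance is
// nearly black or nearly white. Thresholds are arbitrary starting points.
func isLikelyBlank(_ image: CGImage,
                   darkThreshold: CGFloat = 0.05,
                   brightThreshold: CGFloat = 0.95) -> Bool {
    let input = CIImage(cgImage: image)
    // CIAreaAverage reduces the whole frame to a single averaged pixel.
    guard let filter = CIFilter(name: "CIAreaAverage", parameters: [
        kCIInputImageKey: input,
        kCIInputExtentKey: CIVector(cgRect: input.extent)
    ]), let averaged = filter.outputImage else { return false }

    var pixel = [UInt8](repeating: 0, count: 4)
    CIContext().render(averaged, toBitmap: &pixel, rowBytes: 4,
                       bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
                       format: .RGBA8,
                       colorSpace: CGColorSpaceCreateDeviceRGB())

    // Rough luma from the averaged RGB values.
    let luma = (0.299 * CGFloat(pixel[0]) + 0.587 * CGFloat(pixel[1])
                + 0.114 * CGFloat(pixel[2])) / 255.0
    return luma < darkThreshold || luma > brightThreshold
}

// Walk candidate timestamps and return the first non-blank frame.
func pickThumbnail(from asset: AVAsset,
                   candidates: [CMTime]) throws -> CGImage? {
    let generator = AVAssetImageGenerator(asset: asset)
    generator.appliesPreferredTrackTransform = true
    for time in candidates {
        let frame = try generator.copyCGImage(at: time, actualTime: nil)
        if !isLikelyBlank(frame) { return frame }
    }
    return nil
}
```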

Honestly, a day seems rather optimistic to me. Maybe if I were an expert on this platform and had implemented a similar feature before, I could hope to do it in a day.

If I had to write it by hand and estimate it for Scrum at work, I'd budget a week.

reply
Ok, fair. I incorrectly assumed you meant resizing static images to create a lower-resolution preview image.

Video thumbnails are a different beast altogether. And you might want to double-check your assumptions about security. If any of your ffmpeg, opencv, or pyscenedetect code is running on your server, it might well be exploitable.

reply
It’s in-app on iOS.

Ironically, another user in this comment section was already concerned about the security of my nonexistent backend.

But it's good to know; I was not previously aware that video processing on the backend is a common source of vulnerabilities.

reply