So, in human interaction: when the business logic goes wrong because it was described with a lack of specificity, who gets blamed for it?
reply
In my job the task of fully or appropriately specifying something is shared between PMs and the engineers. The engineers' job is to look carefully at what they received and highlight any areas that are ambiguous or under-specified.

LLMs AFAIK cannot do this for novel areas of interest. (I.e., if it's some domain where there's a ton of "10 things people usually miss about X" blog posts, they'll be able to regurgitate that info, but they're not likely to synthesize novel areas of ambiguity.)

reply
They can, though. They just aren't always very good at it.

As an experiment, recently I've been using Codex CLI to configure some consumer networking gear in unusual ways to solve my unusual set of problems. Stuff that pros don't bother with (they don't have the same problems I face), and that consumers tend to shy away from futzing with. The hardware includes a cheap managed switch, an OpenWRT router, and a Mikrotik access point. It's definitely a rather niche area of interest.

And by "using," I mean: In this experiment, the bot gets right in there, plugging away with SSH directly.

It was awful at this at first, mostly consisting of long-winded ways to yet again brick a device that lacks any OOB console port. It'd concoct these elaborate strings of shit and feed them in, and then I'd wander over and reset whatever box was borked again. Footgun city.

But after I tired of that, I had it define some rules for engaging with the hardware (validation, constraints, order of execution) and commit those rules to AGENTS.md. It got pretty decent at following high-level instructions to get things done in the manner I specified, and the footguns ceased.
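
For a sense of what I mean (the wording below is illustrative, not the actual contents of the file), the rules were along these lines:

    - Never touch the interface or VLAN you are currently connected through
      unless an automatic revert is already scheduled.
    - Back up a device's running config before changing it, and confirm the
      backup is readable before proceeding.
    - One device at a time, smallest change first; verify that all three
      boxes are still reachable before moving on.
    - Show me the exact commands and wait for approval before executing
      anything that alters state.
    - If a device stops responding, stop. Do not keep trying things.

Nothing exotic; just the smallest-blast-radius habits a careful human would follow, written down so the bot has no room to improvise.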

I didn't save any time by doing this. But I also didn't have to think about it much: I never got bogged down in the wildly differing CLI syntax of the weirdo switch, the router (whose documentation is locked behind a bot firewall), and the access point's bespoke userland. I didn't touch those bits myself at all.

My time was instead spent observing the fuckups and creating a rather generic framework that manages the bot, and just telling it what to do -- sometimes, with some questions. I did that using plain English.

Now that this is done, I get to re-use this framework for as many projects as I dare, revising it where that seems useful.

(That cheap switch, by the way? It's broken. It has bizarro-world hardware failure modes that are unrelated to software configuration or firmware rev. Today, a very different cheap switch showed up to replace it. When I get around to it, I'll have the bot sort that transition out. I expect that to involve a bit of Q&A, and I also expect it to go fine.)

reply
I wasn't specific, because I'd rather not piss off my employer. But anyone who works in a similar space will recognise the pattern.

It's not underspecified. More... overspecified, because it needs to be. But AI will assume that "impossible" things never happen, and choose a happy path that's guaranteed to result in failure.

You have to build for bad data. Comes with any business of age. Comes with international transactions. Comes with human mistakes that just build up over the decades.

The apparent current state of a thing is not representative of its history, or of what it may or may not contain. And so you have nonsensical rules aimed at catching the bad data, so you get a chance to transform it into good data when it gets used, without needing to mine the petabytes of historical data you have sitting around in advance.
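
A contrived sketch of the kind of rule I mean (Python, with invented field names and thresholds; nothing from any real codebase):

    from decimal import Decimal

    def normalize_settlement_amount(record: dict) -> Decimal:
        """Return the settlement amount in major units, repairing known bad data."""
        amount = Decimal(str(record["amount"]))

        # Old rows predate the currency column; assume the house currency.
        currency = record.get("currency") or "USD"

        # Happy-path code assumes amounts are always in major units. One legacy
        # importer wrote minor units (cents) for a few years, so an absurd-looking
        # threshold check is the only practical way to spot those rows.
        if currency == "USD" and record.get("source") == "legacy_batch" and amount > 1_000_000:
            amount /= 100

        # Negative amounts "can't happen" under the current schema, but refunds
        # were once recorded as negative sales rather than as separate rows.
        if amount < 0:
            record["kind"] = "refund"
            amount = -amount

        return amount

Each of those checks looks nonsensical if you only look at today's schema; they only make sense against the history.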

reply
Depends on what was missing.

If we used macOS throughout the org, and we asked a SW dev team to build inventory tracking software without specifying the OS, I'd squarely put the blame on the SW team for building it for Linux or Windows.

(Yes, it should be a blameless culture, but if an obvious assumption like this is broken, most likely someone is intentionally messing with you.)

There's an expected level of contextual knowledge that frequently goes unspecified.

reply