> But you can't talk to them about the flow of the code. You can't ask them for their thinking as to why certain things are.

You can absolutely do this. It's even right most of the time.

by chmod77522 hours ago|

Let's be real. Most of the time you ask an LLM "Why did you do it like this?", it responds with something along the lines of "Oops. My bad. You're right to point this out."

You even have a fair chance of getting a response like that when there isn't anything wrong and the question wasn't rhetorical - which perfectly illustrates the level of the genuine understanding LLMs operate at.

by seventhtiger21 hours ago|

When you criticize AI, always remember that the alternative is the average employee. Today's models are pretty good.

by devin21 hours ago|

A lot of people think they're above average. A lot of them are wrong.

A lot of average people are producing gigantic messes. At least previous to this they were gated by their mediocrity.

by Frieren12 hours ago|

> the alternative is the average employee. Today's models are pretty good.

I have never seen people anywhere in the world who hate the working class as much as people in the USA do.

In my country the average employee is competent, they do their work and create wealth for the nation.

Again, only in the USA do people think that billionaires are the ones creating value. Total nonsense indoctrination.

by seventhtiger6 hours ago|

I'm not American and have never worked in the USA. It's not a judgment of human value. It's a judgment of work output.

by gamerslexus9 hours ago|

To adequately validate work you must be at least at the same level as whoever produced it. So if you were right (which Dunning-Kruger suggests is unlikely), that would mean your "terrible" average employee is given a tool that will 10x their output, which they cannot even check for correctness. And correctness will be low if the average employee is as bad as you say, because they will give badly specified tasks, and even with the best of us it's garbage in, garbage out. I am sure there is no way this can backfire.

by seventhtiger6 hours ago|

All enablers also enable mediocrity. That's not new. At least when the non-mediocre engineer has to work with someone, they can have a tireless responsive partner.

I find this varies by individual, but having the AI take care of so much of the boilerplate and rote work of coding, and take on the roles of architect, test designer, and reviewer, is a lot more productive for me. Checking the code may take the same skill, but it's an order of magnitude less work.

by gamerslexus3 hours ago|

Perhaps if you need that much boilerplate it's not going to be a well-architected codebase in the first place. Abstract it out, make a lib out of it. Easier to review & test in separation. Loose coupling, high cohesion.
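
As a rough illustration of "abstract it out", here is a minimal sketch assuming the boilerplate in question is a hand-written retry loop copied around every flaky call. The `retry` decorator and `fetch_reading` function are hypothetical names made up for this example, not taken from any real codebase or library.

```python
from functools import wraps


def retry(times, exceptions=(Exception,)):
    """Hypothetical helper: call the wrapped function up to `times` times."""
    if times < 1:
        raise ValueError("times must be at least 1")

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(times):
                try:
                    return func(*args, **kwargs)
                except exceptions as err:
                    # Remember the failure and try again.
                    last_error = err
            # Every attempt failed: surface the last error to the caller.
            raise last_error

        return wrapper

    return decorator


# Hypothetical caller: the retry policy is one line instead of a copied loop.
attempts = {"count": 0}


@retry(times=3, exceptions=(ConnectionError,))
def fetch_reading():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return 42


result = fetch_reading()
```

The retry policy now lives in one small function that can be reviewed and unit-tested on its own, which is the "loose coupling, high cohesion" point.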

by cyh55513 hours ago|

And have they totally gotten rid of the average employees? Can they already blame the models for the production outages?

by djeastm21 hours ago|

I remember hearing (perhaps last year?) that the model companies have specifically tried to obfuscate the "thinking/reasoning" behind the decisions the models make so as to prevent cheaper models from training on the reasoning logs. So asking one "why did you do it like this" might not be fruitful.

Not sure if that's true or if it might be influencing what you're seeing, but it's a thought.

by NewsaHackO20 hours ago|

I think that has more to do with the "train of thought" that some models show as what the model is processing before making the response. There shouldn't be a distillation risk in actually asking the model to explain why it made a decision and getting the response.

by saulpw22 hours ago|

This has happened to me, so I put this in my global CLAUDE.md, and it seems to help (I don't remember getting the response you mentioned for a while now):

    **Lead with the answer when asked how/which/whether.** Name the command/mechanism first; a question seeking understanding isn't a go-ahead to execute. Answer, then offer to act.

by dmayle21 hours ago|

That's because of a fundamental misunderstanding of what an LLM is. The only correct answer to "Why did you do it like this?" is that the specific combination of input text and RNG state caused this particular output. There's no reasoning to be had.

* EDIT * What's with the downvoting? That's a correct description of what happened. You can't ask an LLM why it did something and expect a coherent response, because there's no thinking chain, and no stored thinking state... At best, you can get a reconstruction of how the context relates to the output (basically a summarization of the context).
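
To make the "input text plus RNG state" point concrete, here is a toy sketch of temperature sampling. It is not how any particular model or vendor implements decoding: `TOY_LOGITS`, `sample_token`, and `generate` are made up for illustration, and a real model would recompute its logits from the tokens sampled so far rather than read them from a fixed table.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Draw one token index from a softmax over temperature-scaled logits."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(logits_per_step, temperature, seed):
    """Sample one token per step, using an RNG seeded only by `seed`."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for logits in logits_per_step]


# Hypothetical stand-in for "what the model computed from the input text".
TOY_LOGITS = [[2.0, 1.0, 0.5], [0.1, 0.1, 3.0], [1.0, 1.0, 1.0]]

first = generate(TOY_LOGITS, temperature=0.8, seed=42)
second = generate(TOY_LOGITS, temperature=0.8, seed=42)

# Same input and same RNG state give the same output, every time.
assert first == second
```

Nothing in this loop stores a "reason" for the chosen token. Any later explanation has to be generated fresh from the context.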

by baggy_trough22 hours ago|

Can't remember the last time that happened.

by javier221 hours ago|

Happened to me at least three times in the past 14 days. I point out where it made a design decision that causes data loss. «Oops my mistake»

by theshackleford9 hours ago|

I encounter it constantly with the latest models. Claude is particularly prone to it.

> I shouldn’t have said that with confidence

> I got ahead of myself there

> I overstepped, allow me to correct that

It’s wild seeing how often it’s wrong, and I only know it’s wrong because I am an SME or am actually reading the sources. Most of my coworkers are not SMEs in what they are asking about and do not read the sources.

A huge part of my job now is fixing fuck ups and failures resulting from these slop jockeys who have already moved on to slop up the next task.

by therealdrag013 hours ago|

So what? That doesn’t negate the value they provide.

by datsci_est_201522 hours ago|

I believe the “them” the OP was talking about was referring to the people opening the PRs, not the LLMs.

by saulpw22 hours ago|

My mistake, that is definitely a different scene.

by ssss1121 hours ago|

And you can certainly tell it the flow you want (and any other constraints) in the prompt.

by scubbo19 hours ago|

> But you can't talk to them about the flow of the code. You can't ask them for their thinking as to why certain things are.

There are plenty of valid criticisms or warnings about over-reliance on AI coding, but this is not one of them. Today, I am using a semi-autonomous agentic coding system which has an `interview` functionality built in - when it spits out the PR from the input, if you have questions about the motivation or context for a particular choice, you can start up a clone of the original agent in a sandbox to question it.

Now, you might claim that those responses aren't always reliable, accurate, or consistent, and that claim has a little more weight (though, in my experience, decreasingly so) - but it is _certainly_ not the case that you cannot interview an agent about choices made. I'm literally doing it every day.

by OptionOfT17 hours ago|

Sorry, I meant interviewing the PR author for certain choices.

by com2kid15 hours ago|

> Because companies are betting that this spending will allow them to reduce cost by firing people.

I've never worked at a company that didn't have a technical backlog measured in years.

by LtWorf6 hours ago|

If they don't hire to get it done, it means they don't think it's really important to get it done.

by tuesdaynight4 hours ago|

That is an amazing point that invalidates the backlog in my mind. Stated vs revealed preferences in the end.

by scuff3d21 hours ago|

Literally in the middle of ripping apart a vibe coded mess at work to figure out what's even worth keeping. Not fun :(

by bvcp9 hours ago|

use ai to do that

by foolserrandboy16 hours ago|

What happens if you just keep vibe coding it? Does it whack-a-mole fix one area and break another?

by HNisCIS21 hours ago|

It's so fucking bad. I'm watching a team try to maintain a huge dashboard/control application that interfaces with a large amount of hardware using solely AI workflows.

Literally nothing works: all the timers/time counters are different across the pages, it constantly commands hardware to do stupid shit, and it breaks during critical moments/in front of clients.

Eventually mgmt had to institute change freezes for high profile events because the team was breaking too much shit all the time.

The average C-suite dipshit doesn't realize that the performance drops off a cliff once your project is more than some fraction of the context window. So they will make pretty dashboards all day long, but once you need to cover all the edge cases of a real system it all explodes.

AI isn't trained on the style of software we'll need to create systems using AI; it's trained on how we used to write software. It doesn't reuse code or elegantly structure anything, it just adds more code until the thing builds and passes some fake tests, even if half of it is functionally dead/unused.