> I find that for things I'm already capable at, LLMs are relatively inconsequential. But for things I'm no good at, it's a huge game changer.

What are the chances that this is the Gell-Mann amnesia effect? Sounds like the textbook definition of it.

Personally, I find the exact opposite to be true. LLMs only help me when I already know exactly what I'm doing.

by xeromal 13 hours ago

I can give an anecdote. I'm a backend engineer for a service that I would consider pretty high horsepower. We get about 30k sign ups and trillions of events a day. I haven't touched the front end with a 10 foot pole since college.

I got the opportunity to rewrite our aging login page just as a fun experiment. I sat down with one of our analysts and we just went to town on a Zoom call, trying out stuff with Claude until we made something pretty sweet. Ran it through all our systems for accessibility, performance, etc., and it came out clean. Made a PR and fired up a test that day in production. I haven't written a lick of our front end framework ever in my entire life, and in a day we were able to build something that has had a marked improvement in our user engagement.

by alt227 12 hours ago

> a marked improvement in our user engagement in a day.

Do you have any idea what caused this engagement improvement? And do you actually have any metrics, or is it hearsay?

It is much easier to knock something up in a day as you have done, but often the reason manual things take longer is that they are based on actual testing and research, which takes longer than a day however you do it. The manual way gives you much more data on the hows and whys, and will inform you much more in the future when you need to change again, instead of just 'AI did it last time, let's use it again!'

by xeromal 12 hours ago

No, we did an actual test using our existing testing framework. We have shitloads of metrics to know when a user gets stuck, when they give up, which login path they took, etc.

This wasn't a half-assed test but a legitimate effort to improve something that we never prioritized.

We had a legitimate 25% reduction in users giving up on logging in, in a system that has millions of users.

We ran a 50/50 A/B test for several weeks to confirm the data and then turned it on completely.

edit: If you haven't already read my post, I'd also like to say that the benefit AI gives us is that I worked on something I never get to work on, the analyst got to try a hunch he always had, and we got to see it go live in a day. If it didn't work out, we were out a day of work, which beats the few weeks of effort that, prior to AI, we would spend on something just to find out it didn't work.

by gwern 9 hours ago

This seems consistent with OP. You had a feature where most of his Gantt chart is, in effect, already done: you had a clear problem with a clear, well-thought-out design/solution (with associated documentation) in mind, you had a well-set-up analytics process for deployment and followup... you really had everything except that big fat chunk in the middle labeled 'coding'. So in your anecdote, an agentic coding LLM really could deliver a huge speedup by doing the remaining 10% or whatever of the work.

This is why LLMs are really great at 'knocking off the todo/wishlist' of things you always meant to do. The problem, as far as broader discussions of 'productivity multipliers' or 'total factor productivity' go, is that there's a certain perverse diminishing returns to such wishlist items (if each item was all that important, why didn't it get done before?), they generally only apply to a small part of a large complicated whole (what % of your ecosystem/business/community as a whole is the login page, as pleasing and profitable as that fix is relative to the investment? Probably not a big %), and they are also finite (what happens when you have worked through your backlog of low-hanging fruit?).

by Eiriksmal 8 hours ago

I ask myself these same questions every workday. Are you cooking any new articles on this topic, Gwern? Reading your (thoroughly researched) thoughts often helps me clarify my own.

by simondotau 14 hours ago

Not being good at a thing doesn’t preclude one from being a sufficiently passable judge of that thing.

To wit, the answer pre-AI was to hire an expert on that thing, and you would then critically assess their work product, despite being unable to build it yourself.

by argee 14 hours ago

True, but if you hire a generalist and they are consistently under-performing specifically in the subject matter where you are an expert, it may behoove you to take the rest of their work with a grain of salt as well.