I’ve had a couple of weeks of downtime at work, so I decided to incorporate agents into my work processes - things like note taking, task tracking, document management.

Your comment EXACTLY mirrors my experience. Week 1 was ever-expanding prompts and degrading performance. Week 2 has been all about actually defining the objects precisely (notes, tasks, projects, people, etc.) and defining methods for performing well-defined operations against these objects. The agent surface has, as you rightly point out, shrunk to a translation layer that converts natural language to commands and args that pass the input validator.
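To make the "translation layer" idea concrete, here is a minimal sketch of what such a setup might look like. Everything in it (the `Task` and `Store` classes, the `SCHEMA` table, the `validate` and `execute` helpers) is a hypothetical illustration, not the commenter's actual code. The assumption is that the agent only emits a command name plus an args dict, and nothing runs unless it passes the validator.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    done: bool = False


@dataclass
class Store:
    """Hypothetical in-memory store for one precisely defined object type."""
    tasks: dict = field(default_factory=dict)
    next_id: int = 1

    def add_task(self, title: str) -> int:
        task_id = self.next_id
        self.tasks[task_id] = Task(title)
        self.next_id += 1
        return task_id

    def complete_task(self, task_id: int) -> None:
        self.tasks[task_id].done = True


# Allowed commands and the argument types each one requires (assumed schema).
SCHEMA = {
    "add_task": {"title": str},
    "complete_task": {"task_id": int},
}


def validate(command: str, args: dict) -> None:
    """Reject anything that is not a known command with exactly the right args."""
    if command not in SCHEMA:
        raise ValueError(f"unknown command: {command}")
    expected = SCHEMA[command]
    if set(args) != set(expected):
        raise ValueError(f"{command} expects args {sorted(expected)}")
    for name, expected_type in expected.items():
        if not isinstance(args[name], expected_type):
            raise ValueError(f"{name} must be {expected_type.__name__}")


def execute(store: Store, command: str, args: dict):
    """Run a command only after it passes the validator."""
    validate(command, args)
    return getattr(store, command)(**args)
```

In this sketch the model's only job would be to turn "mark the report task as done" into something like `("complete_task", {"task_id": 1})`; the deterministic code does the rest.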

by sowbug 16 hours ago

A full-circle system prompt would be to "find every opportunity to put yourself out of your job by automating it away. When you are given a question that code can answer, answer the question by writing code and running it to obtain the result."

Such an LLM might have fared better with the strawberry test.
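For reference, the strawberry test (counting the letter "r" in "strawberry") is the kind of question a couple of lines of code answer exactly. A minimal sketch, with `count_letter` as a hypothetical helper name:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # prints 3
```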

by Imanari 20 minutes ago

That’s exactly the approach of smolagents. The only “tool” available is writing Python code.

by edgarvaldes 17 hours ago

Some have expressed the opinion in this forum that the future of software lies in programs that are created and adapted at runtime, using genAI. I don't know how far we are from that.

by aleksiy123 16 hours ago

It’s already here; the question is just to what extent.

Are Google search results modifying your software at runtime?

Take agent chat, for example: the output text is a UI, and agents can generate charts and even constrained UI elements.

Isn’t that created and adapted at run time?

If you mean agents live-modifying your code, I think that’s pretty much here as well. They can read the logs and send PRs.

The only question is how fast that loop will execute (going from days or hours to minutes or seconds) and what validation gates it needs to pass.

My git repo is pretty much self-modifying personal software at this point, which I interface with through the IDE chat window.

But I don’t think we will ever lose the intermediary deterministic language (code) between the LLM and the execution engine.

It would be prohibitively expensive to run everything through models all the time.

But I am starting to think we need a more precise language than English when talking with LLMs, one that can do both precision and ambiguity when you need either.

by jmaw 16 hours ago

Some kind of "code", you could say

by aleksiy123 14 hours ago

Yes, but more declarative than imperative.

I say what; the LLM says how.

by pishpash 13 hours ago

Not that long ago the workflow was to turn code comments into code. Maybe leave some comments as is now.

by pishpash 13 hours ago

Sounds like assemblers bemoaning loss of control to C. The solution was inline assembly...

by mjr00 17 hours ago

> Some have expressed the opinion in this forum that the future of software lies in programs that are created and adapted at runtime, using genAI.

Good luck with that. Users will flood you with complaints if a button moves 5px to the left after a design update. A program that is generated at runtime, with not just a variable UI but also UX and workflows, would get you death threats.

by hilariously 17 hours ago

I think many software-adjacent folks are super excited because they can now have the personalized toothbrush they keep asking people to make for them.

The problem is that outside of that, most people want boring and regular interfaces so they can get in, solve the problem, and get out. They don't want to "love" it or care if it's "sexy"; they want it to work and get out of the way.

LLMs transmogrifying your software at every request assumes people are software architects and creators who love the computer interface, and that just doesn't describe the bulk of the population.

Most people using computers use them to consume things or to access things, not for their own sake, and they certainly don't think "what if I just had code to do x..." unless x is making them a lot of money.

by munk-a 14 hours ago

A program that is generated at runtime is fine (we have interpreted languages and often compile on demand) - the issue is with the non-deterministic nature of the output.

I think the core issue is that non-deterministic output is great for a chatbot experience where you want unpredictable randomness so it feels less like talking to the mirror - but when it comes to coding I think we're pretty fundamentally misaligned in sticking to that non-deterministic approach so firmly.

by cassianoleal 15 hours ago

So we're back to vim over ssh in production, only without a human with _some semblance_ of judgement in the loop?

by QuercusMax 15 hours ago

I've seen cases where models will get stuck in a particular mode of problem solving and need a nudge to tell them to move to a new mode. For example, instead of trying to massage a bunch of system service configs to handle hot-plug/unplug of an audio stream, what I really needed was to just write a couple dozen lines of Python to handle stuff.

I just had Claude write itself a couple shell scripts to handle a bunch of common cases (like running tests) in my workflow where it just couldn't figure it out efficiently. Now it just runs those tools and sets things up instead of spinning in circles for half an hour.

Every time it tries to ask me if it can run some one-off crazy shell or Python one-liner to do something, I've started asking myself if I should have it write a tool I can auto-approve instead.
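As an illustration of what an auto-approvable tool could look like, here is a minimal sketch. It is written in Python rather than the shell scripts mentioned above, and the task names and commands in `TASKS` are assumptions made up for the example (the `test` entry assumes pytest is installed). The idea is that the agent may only pick a task by name, so approving the tool once approves a fixed, reviewable set of commands.

```python
import subprocess
import sys

# Hypothetical task table: each name maps to one fixed argv list.
TASKS = {
    "test": [sys.executable, "-m", "pytest", "-q"],  # assumes pytest is installed
    "version": [sys.executable, "--version"],
}


def run_task(name: str) -> int:
    """Run one pre-approved task by name and return its exit code."""
    if name not in TASKS:
        raise ValueError(f"unknown task: {name!r}; allowed: {sorted(TASKS)}")
    # No shell and no caller-supplied arguments, so the agent cannot
    # smuggle an arbitrary command through this tool.
    return subprocess.run(TASKS[name], check=False).returncode
```

Calling `run_task("version")` runs the interpreter's version check, while anything outside the table, such as `run_task("rm -rf /")`, is rejected before a process is started.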

by 3uba 16 hours ago

[dead]