At the end of the day, if I am spending $X on automation, I want to be able to sleep at night knowing my factory will not build a WMD or delete itself.
If it's simply a tool that is a multiplier for experts, then do I really need it? How much does it actually make my processes more efficient, faster, or more capable of earning revenue?
There is a LOT that is forgiven when tech is new - but at some point the shiny newness falls off and it is compared to alternatives.
Review and oversight do address reliability directly, which is why we use them to improve the reliability of mechanical processes as well, and why they are core elements of AI harnesses.
> If it's simply a tool that is a multiplier for experts, then do I really need it? How much does it actually make my processes more efficient, faster, or more capable of earning revenue?
You can ask the same thing about all the supporting staff around the experts in your team.
> There is a LOT that is forgiven when tech is new - but at some point the shiny newness falls off and it is compared to alternatives.
Only teams without mature processes are not doing that for AI today.
Most of the AI deployments I work on are the outcome of comparing AI to alternatives, and they are often part of initiatives to increase the reliability of human teams just as much as raw productivity, because the two are often one and the same.
So many applications of LLMs start from a deterministic mindset while using a non-deterministic LLM, and then their builders wonder why it’s not working.