
This is akin to “don’t make mistakes.”

“Verify all facts and compliance requirements” leaves enormous holes even if you assume the LLM has a concept of facts and requirements (it does not).

What facts? What requirements? For what industry? For what subset of that industry? For what country or countries that you will be doing business in? Are these current “facts” and “requirements” or is the LLM referencing a dusty article from 1992 for which the subject matter has been radically overhauled?

In my job I regularly see small but incredibly important mistakes like this lead to major issues. Some of those are human-driven, but increasingly the defense of the person responsible has turned into “Claude said it was fine though!”

by kolinko, 20 minutes ago

Well, you wouldn't just give a human the task "verify all facts and compliance requirements" and expect it to end well either, no?

by ilaksh, 1 hour ago

It can make mistakes and will sometimes, but what he specifically mentioned was a case where it did not pull up a reference that it needed. So using a web search tool effectively would make a big difference.

by ofjcihen, 1 hour ago

It still does not rise to the standard he requires, which your response indicated would be easy for the model to achieve with a simple prompt.

Additionally, using a specific tool does not suddenly give the model enough common sense to say “this piece of information doesn’t answer the question of whether this solution fits in this specific industry at this time in this place.”

by ilaksh, 1 hour ago

A web search tool to pull up the law that is relevant?

by vor_28 minutes ago

> 3 years max. Maybe 5 if you are lucky. The models will continue to improve. The exponential gains in compute efficiency that have been ongoing for 70+ years will continue and that will result in even smarter models. There are dramatic hardware changes in the pipeline.

I remember hearing that 10 years ago about self-driving.

by oblio, 1 minute ago

60 years ago about flying cars, 40 years ago about cold fusion, the list is long.

We need a lot more basic research into LLMs and also a lot cheaper hardware.

The current batch of LLMs will turn a lot of fields upside down, but not to the tune of $3tn or whatever crazy amounts are being invested right now.

by DaSHacka, 5 minutes ago

"Just 2 more weeks guys, and AI will be able to do everything!"

by jppope, 1 hour ago

Stuff like that is risk tolerance... it's not strictly codified and it's more akin to probability. Different companies at different stages, in different industries, will all interpret their risk differently... how will a smarter model improve that?

by suttontom, 2 hours ago

Ah yes, the magical equivalent of "you are a senior software engineer who writes bug-free code".

IME, people would benefit greatly from the process, albeit tedious and time-consuming, of testing out the same prompt sequence/session with the exact same model multiple times. It becomes clear extremely quickly how capable but unreliable and inconsistent a model can be, even when given the same context. If you have ever completed a long, complicated task with an agent, lost the session, and tried doing the same thing again from scratch, you may have seen the subtle changes that come up in the model's thinking, which lead it to accept or reject certain paths and ignore or incorporate prompt instructions like the one you've provided.
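A rough sketch of what that repeated-run check could look like in Python. Everything here is an assumption for illustration: `consistency_report` and `fake_model` are hypothetical names, `run_once` stands in for whatever function actually calls your model, and exact string matching is a deliberately crude way to compare outputs (real sessions would likely need a looser comparison).

```python
import random
from collections import Counter
from typing import Callable


def consistency_report(run_once: Callable[[str], str], prompt: str, trials: int = 10) -> dict:
    """Run the same prompt `trials` times and summarize how often the outputs agree.

    `run_once` is a hypothetical callable wrapping a model call; it is not
    part of any real library.
    """
    if trials < 1:
        raise ValueError("trials must be at least 1")
    outputs = [run_once(prompt) for _ in range(trials)]
    counts = Counter(outputs)  # exact-match grouping (an assumption, see lead-in)
    top_output, top_count = counts.most_common(1)[0]
    return {
        "trials": trials,
        "distinct_outputs": len(counts),
        "agreement_rate": top_count / trials,
        "most_common_output": top_output,
    }


def fake_model(prompt: str) -> str:
    """Stand-in for a real model: answers inconsistently on purpose."""
    return random.choice(["compliant", "compliant", "not compliant"])


if __name__ == "__main__":
    report = consistency_report(fake_model, "Is this design compliant?", trials=20)
    print(report)
```

Swapping `fake_model` for a real call and reading `distinct_outputs` and `agreement_rate` gives a quick, if blunt, picture of how much the answer moves between identical runs.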

by eikenberry, 1 hour ago

The classic 3-5 year window for a new technology that is uncertain and requires just a few more breakthroughs to get there...

by Upvoter3310 minutes ago

Written with confidence, too. I'm amazed at the levels of confidence people have in predicting the (unclear) future.

by weakfish, 1 hour ago

Like full self-driving!