undefined

upvote

points

by applfanboysbgon7 hours ago |

upvote

by dspillett4 hours ago|

[-]

> You cannot permit your employees to use LLMs in this manner and then tell them it's entirely their fault when it makes mistakes, because you gave them permission to use something that will make mistakes 100% without fail.

Yes you can. The same way Wikipedia (or, way back when, a paper encyclopedia) can be used for research but you have to verify everything with other sources because it is known there are errors and deficiencies in such sources. Or using outsourced dev resource (meat-based outsourced devs can be as faulty as an LLM, some would argue sometimes more so) without reviewing their code before implemeting it in production.

Should they also ban them from talking to people as sources of information, because people can be misinformed or actively lie, rather than instead insisting that information found from such sources be sense-checked before use in an article?

Personally I barely touch LLMs at all (at some point this is going to wind up DayJob where they think the tech will make me more efficient…) but if someone is properly using them as a different form of search engine, or to pick out related keywords/phrases that are associated with what they are looking for but they might not have thought of themselves, that would be valid IMO. Using them in these ways is very different from doing a direct copy+paste of the LLM output and calling it a day. There is a difference between using a tool to help with your task and using a tool to be lazy.

> it's company policy not to burn everything to the ground!

The flamethrower example is silly hyperbole IMO, and a bad example anyway because everywhere where potentially dangerous equipment is actually made available for someone's job you will find policies exactly like this. Military use: “we gave them flamethrowers for X and specifically trained them not to deploy them near civilians, the relevant people have been court-martialled and duly punished for the burnign down of that school”. Civilian use: “the use of flamethrowers to initiate controlled land-clearance burns must be properly signed-off before work commences, and the work should only be signed of to be performed by those who have been through the full operation and safety training programs or without an environmental risk assessment”.

reply

upvote

by brey7 hours ago|

[-]

The next sentence after your quoted section:

“Even then, AI output is never treated as an authoritative source. Everything must be verified.”

reply

upvote

by applfanboysbgon7 hours ago|

[-]

Any verification process thorough enough to catch all LLM fabrications would take more work than simply not using the LLM in the first place. If anything verifying what an LLM wrote is substantially more difficult than just reading the material it's "summarising", because you need to fully read and comprehend the material and then also keep in mind what the LLM generated to contrast and at that point what the fuck are you even doing?

I believe this policy can never result in a positive outcome. The policy implicitly suggests that verification means taking shortcuts and letting fabrications slip through in the name of "efficiency", with the follow-up sentence existing solely so that Ars won't take accountability for enabling such a policy but instead place the blame entirely on the reporters it told to take shortcuts.

reply

upvote

by klausa5 hours ago|

[-]

The LLM can find material that it would be hard or time-consuming for you to do.

You still need to verify it, but "find the right things to read in the first place" is often a time intensive process in itself.

(You might, at that point, argue that "what if LLM fails to find a key article/paper/whatever", which I think is both a reasonable worry, and an unreasonable standard to apply. "What if your google search doesn't return it" is an obvious counterpoint, and I don't think you can make a reasonable argument that you journalists should be forced to cross-compare SERPs from Google/Bing/DuckDuckGo/AltaVista or whatever.)

reply

upvote

by madamelic1 hours ago|

[-]

I believe what their point is is that if you give people a "extract-needle-from-haystack" machine and then tell them they have to manually find where in the haystack the needle was, it defeats the purpose of having the machine.

With that said, a good RAG solution would come with metadata to point to where it was sourced from.

reply

upvote

by palmotea38 minutes ago|

[-]

> I believe what their point is is that if you give people a "extract-needle-from-haystack" machine and then tell them they have to manually find where in the haystack the needle was, it defeats the purpose of having the machine.

We've got to be careful to not let the perfect be the enemy of the good.

I'm not an LLM enthusiast, but I think you have actually compare it against what the alternative would really be. If you give the journalist a haystack but insufficient time to manually search it properly, they're going to have to take some shortcut. And using an LLM and verifying everything is probably better than randomly sampling documents at random or searching for keywords.

reply

upvote

by klausa14 minutes ago|

[-]

I don't want to come off as an AI-maximalist or whatever, but, I mean, at some point, skill issue, right?

You can use Google to find you results reinforcing your belief that the earth is flat too; but we don't condemn Google as a helpful tool during research.

If you trust whatever the LLM spits out unconditionally, that's sorta on you. But they _can_ be helpful when treated as research assistants, not as oracles.

reply

upvote

by madeofpalk10 minutes ago|

[-]

when you use the extract-needle-from-haystack machine, verify that it actually extracted a needle.

that's much easier than manually extracting the needle yourself

reply

upvote

by JumpCrisscross6 hours ago|

[-]

> Any verification process thorough enough to catch all LLM fabrications would take more work than simply not using the LLM in the first place

Sometimes you have a weak hunch that may take hours to validate. Putting an LLM to doing the preliminary investigation on that can be fruitful. Particularly if, as if often the case, you don't have a weak hunch, but a small basket of them.

reply

upvote

by Mordisquitos5 hours ago|

[-]

You can prompt LLMs to scan thousands of documents to generate text validating your hunches. In some cases those validated hunches may even be correct.

reply

upvote

by Eisenstein1 hours ago|

[-]

It's easy to get an LLM to make any argument you like based on whatever data is available. Those arguments are going to be trivially bad if that data is bad.

reply

upvote

by Jtarii2 hours ago|

[-]

It's more using LLMs like a metal detector, rather than digging through the entire beach by yourself.

You still need to check the junk you dig up using the metal detector.

reply

upvote

by Angostura6 hours ago|

[-]

Disagree. If I’m I’m a reporter and I’m trawling though a mass data dump - say the Epstein files or Wilileaks or statistics on environmental spills or something, using AI to pull out potential patterns in the data, or find specific references can be useful. Obviously you go and then check the particular citations. This will still save a lot of time.

reply

upvote

by Paracompact6 hours ago|

[-]

> I believe this policy can never result in a positive outcome.

I get where you're coming from (I'm learning more and more over time that every sentence or line of code I "trust" an AI with, will eventually come back to bite me), but this is too absolutist. Really, no positive result, ever, in any context? We need more nuanced understanding of this technology than "always good" or "always bad."

reply

upvote

by applfanboysbgon6 hours ago|

[-]

I didn't say in any context. I'm specifically talking about this policy on journalistic research.

reply

upvote

by bandrami4 hours ago|

[-]

If you need accuracy, an LLM is not the tool for that use case. LLMs are for when you need plausibility. There are real use cases for that, but journalism is not one of them.

reply

upvote

by furyofantares4 hours ago|

[-]

I'm not a journalist and just for random things I'm interested in, I have no problem using an LLM to point me in a direction and then directly engage with the source rather than treat any of the LLM output as authoritative. It's easy to do. This is not a flamethrower.

reply

upvote

by empath7543 minutes ago|

[-]

They're also allowed to use wikipedia to use research. It has similar sorts of problems.

reply

upvote

by JumpCrisscross6 hours ago|

[-]

> the author they fired for the fabricated reporting

Didn't one of the magazine's editors share the byline?

reply

upvote

by bombcar2 hours ago|

[-]

Yes but editors don't take the fall, they take the credit.

Everything occurred exactly as predicted.

reply

upvote

by fooker6 hours ago|

[-]

> LLMs are terrible at accurately summarizing anything.

I think you are perhaps stuck in 2023?

reply

upvote

by Mordisquitos5 hours ago|

[-]

And yet we are discussing this in the context of a reporter having been fired from Ars Technica for publishing an article which included inaccurate LLM-generated summaries in 2026. How come?

https://news.ycombinator.com/item?id=47226608

reply

upvote

by fooker4 hours ago|

[-]

Maybe you should read the article? :)

What failed was extracting verbatim quotes, not summarizing.

If you want an LLM to do verbatim anything, it has to be a tool call. So I’m not surprised.

reply

upvote

by Mordisquitos4 hours ago|

[-]

[dead]

reply

upvote

by inquirerGeneral1 hours ago|

[-]

[dead]

reply

upvote

by Rekindle80906 hours ago|

[-]

[dead]

reply

upvote

by enraged_camel5 hours ago|

[-]

[dead]

reply

upvote

by knighthack6 hours ago|

[-]

> LLMs are terrible at accurately summarizing anything. They very randomly latch on to certain keywords and construct a narrative from them, with the result being something that is plausibly correct but in which the details are incorrect, usually subtly so, or important information is omitted because it wasn't part of the random selection of attention.

I don't know what you've been doing, but the summaries I get from my LLMs have been rather accurate.

And in any event, summaries are just that - summaries.

They don't need to be 100% accurate. Demanding that is unreasonable.

reply

upvote

by lamasery3 hours ago|

[-]

The LLM meeting-summary bot in Teams seems accurate… unless you were in the meeting, and also closely read the summary afterward. It misrepresents what people actually said all the time.

reply

upvote

by avereveard3 hours ago|

[-]

Depends on topic, often what they consider important isn't what is important and details that are essential get out of view. I'm having good success with youtube video, not as much with technical docs.

reply

upvote

by carefree-bob6 hours ago|

[-]

Yes, search and summarization is where LLMs shine. I use them all the time for that, and much less for code generation. I would say search > summarization > debugging > code gen/image gen

reply

upvote

by suddenlybananas6 hours ago|

[-]

>They don't need to be 100% accurate. Demanding that is unreasonable.

If an intern was routinely making up stuff in the summaries they provided to their bosses, they'd be let go.

reply