> But this is a logic fail is it not?

It is not. The LLM approach is not dependent on system configurations. You can expect that it probably works the same from any device or application, because it can surmise slang/jargon from training and context rather than needing to be fed every little individual case as a per-user configuration. There are advantages to making a program more sophisticated than a literal == check against a list of pre-programmed words.

And even if there were an easy and satisfying way to unify dictionaries cross-device, it still wouldn't be a pleasant experience. That first pass of adding every single jargon term you use is not enjoyable. If there were a solution that just... didn't require that, it would solve a problem current spellcheckers do not solve. And what do you know, it appears there is one!

> This is a second logic fail.

Saying things are logic fails doesn't make them logic fails, all the more so when the failure is your own reading comprehension. I explicitly noted that non-determinism doesn't need to be flawless, only better than the deterministic solution on average. If the non-deterministic error rate of LLMs is below 1%, that still puts it far, far, far ahead of the deterministic tool's error rate.

It may be possible to create a deterministic tool that is better on average, but I haven't seen one. The current tooling is so fucking horrendously bad that after decades it still cannot handle pluralising an uncommon word whose plural ends in "ies": for example, "squiggly" is recognised and "squigglies" is not. That is fucking shamefully bad technology.
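To make the complaint concrete, here is a minimal sketch of the difference between a literal list lookup and a lookup with a few plural rules bolted on. The word list and both function names are hypothetical, made up for illustration; this is not how any particular spellchecker is implemented.

```python
# Hypothetical three-word dictionary; real word lists are far larger.
KNOWN_WORDS = {"squiggly", "word", "box"}


def literal_check(word: str) -> bool:
    """Accept a word only if it appears verbatim in the list."""
    return word.lower() in KNOWN_WORDS


def check_with_plural_rules(word: str) -> bool:
    """Also accept simple plurals of known words (-ies, -es, -s)."""
    w = word.lower()
    if w in KNOWN_WORDS:
        return True
    # "squigglies" -> "squiggly"
    if w.endswith("ies") and w[:-3] + "y" in KNOWN_WORDS:
        return True
    # "boxes" -> "box"
    if w.endswith("es") and w[:-2] in KNOWN_WORDS:
        return True
    # "words" -> "word"
    return w.endswith("s") and w[:-1] in KNOWN_WORDS
```

With this list, `literal_check("squigglies")` rejects the word while `check_with_plural_rules("squigglies")` accepts it. Rules this naive also over-accept (they let "squigglys" through), so this only shows that the specific "ies" case is tractable, not that rule-based checking is easy.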

reply
> The LLM approach is not dependent on system configurations

How is it not dependent? Like, help me out here: I'm writing something up in vim on my FreeBSD system using the built-in dictionary capability, and maybe I've got grammar checking too, with LanguageTool via the ALE plugin. I've added various words to my good words list over time. I save the file to a network drive and want to keep working on it on an iPad during a flight, using a different tool to do some graphical formatting for a different audience. How does "the LLM approach" uniquely slot into vim and the iPad app? "Uniquely" as in a way that you couldn't slot in a shared sync'd dictionary file or whatever else. What if one of the developers doesn't want to integrate it, and I don't have time to or (if it's closed source) can't? How does it help all the other software I use that is still using its own thing?

If by "LLM approach" you specifically mean "I copy/paste into this whole other software, and that software is what I use from different platforms", well, that's nice, but it's not an "LLM approach". It's a "copy/paste into different software" approach, which again could be done with whatever.

> I explicitly noted that non-determinism doesn't need to be flawless, only better than the deterministic solution on average.

But how do you know what the "average" is? You can't get that from a single shot. And what's the upside vs downside of false positives or false negatives or meaning changes/hallucinations? That's also a point of contention, particularly when it comes to any problem space (coding of course, but also law, medicine etc) where precision in language is important even 1% of the time. And you clearly have an intense personal issue here around grammar/spelling that is not universally shared. Which is fine, but the tradeoffs you're willing to make are also going to be personal. It's also going to vary, just as with using LLMs for coding, based on the user. Some people are sufficiently capable with language to realistically be able to expect to double check an LLM and mostly do fine. It's a lot riskier though for someone with a weak grasp to depend on.

reply
You seem to be pointing out that spellcheckers are more mature and currently have more system integration, but it's not inevitable that this will always be the case. One could suppose that LLM spellchecking is eventually integrated at the OS level. It would be superior to existing OS-level spellcheckers because the program is sophisticated enough that you don't need to maintain a dictionary and sync it around: it simply spellchecks effectively without one (and there's no reason it couldn't also use a dictionary to augment its context, if you had some really bespoke conditions that warranted one).

> But how do you know what the "average" is? You can't get that from a single shot.

I don't know what the average is. I never made a claim that LLMs are categorically better than spellcheckers; I simply said it's hard to imagine they'd be worse, given how bad spellcheckers already are, and that I understand why people would be willing to give a non-deterministic tool a try. That was in response to claims that doing so was the dumbest thing imaginable and that spellchecking was a 'solved problem'.

You're correct that one shot is not a statistical analysis, but multiple people were throwing around assertive claims that LLMs rewrite entire sentences and change their meaning when prompted to spellcheck, or that LLMs are incapable of handling a joke in which intentional misspellings are integral to the joke. Both claims seemed incorrect on their face to me, so I gave it a try. LLMs are typically conditioned to a high degree of mode collapse, so I do expect that if I retried the same prompt and context on the same model 100 times, it would probably give approximately the same output at least 90/100 times, if not 99, but I'm not presenting a thesis here.
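If someone did want to go beyond a single shot, the repeat-it-100-times check could be sketched roughly like this. `agreement_rate` and the `check` callable are hypothetical names, and `check` stands in for whatever actually calls the model; no real LLM API is assumed here.

```python
from collections import Counter
from typing import Callable


def agreement_rate(check: Callable[[str], str], text: str, runs: int = 100) -> float:
    """Run the same check repeatedly and return the share of runs
    that produced the single most common output."""
    outputs = Counter(check(text) for _ in range(runs))
    _, top_count = outputs.most_common(1)[0]
    return top_count / runs
```

A result of 0.9 would mean 90 of 100 runs gave the identical output. Note that this only measures consistency, not correctness: a tool can be consistently wrong, so the outputs would still need to be compared against a known-good correction to get an error rate.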

> And what's the upside vs downside of false positives or false negatives or meaning changes/hallucinations?

Sure, these are valid considerations. I would not, under any circumstances, let an LLM touch my legal documents for any reason. However, the stakes for spellchecking an internet comment are non-existent, so one could easily imagine trading the downsides for the benefit of not being nagged by squigglies.

> And you clearly have an intense personal issue here around grammar/spelling

I really don't, actually. As I mentioned, I disable spellcheckers on sight, and I don't use LLMs for spellchecking myself. I rely on my own two eyes for spellchecking, and sometimes I miss things, which is an outcome I'm okay with. Spellcheckers, then, are not something I ever think about, beyond the time it takes to disable them after being nagged on a new device or application. I do take offense at calling such a laughably poor state of technology a "solved problem", though, and at the sneering at people attempting to find new solutions to it. There is absolutely nothing wrong with attempting to iterate on a bad status quo.

I would also note that I think the non-determinism could be mitigated to an appreciable degree by simply having the integrated LLM tool offer suggestions that require human approval before anything is corrected, much as current squigglies operate but perhaps with a lower failure rate on average. Or not! But it's an area I can see value in exploring, anyway.
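The suggest-then-approve idea could look something like the following minimal sketch. `apply_approved`, the `(wrong, right)` suggestion pairs, and the `approve` callback are all hypothetical; in a real tool the suggestions would come from the model and `approve` would be the user clicking accept or reject.

```python
from typing import Callable


def apply_approved(
    text: str,
    suggestions: list[tuple[str, str]],
    approve: Callable[[str, str], bool],
) -> str:
    """Apply only the (wrong, right) suggestions the user approves.
    Matching is on whole space-separated tokens, for simplicity."""
    tokens = text.split(" ")
    for wrong, right in suggestions:
        if approve(wrong, right):
            tokens = [right if tok == wrong else tok for tok in tokens]
    return " ".join(tokens)
```

Nothing changes unless `approve` returns `True` for that pair, so a bad suggestion costs the user a rejection rather than a silently altered sentence.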
