You're absolutely right. Right now, LLMs are too slow to be useful on handheld devices, but the future of LLMs is brighter than ever.
LLMs can be useful, but quite often the responses are about as painful as LinkedIn posts. Will they get better? Maybe. Will they get worse? Maybe.
I find it hard to understand your uncertainty; how could they not keep getting better when we've seen qualitative improvements every other week for months on end? These improvements are eminently public and span multiple relevant dimensions: raw inference speed (https://github.com/ggml-org/llama.cpp/releases), external-facing capabilities (https://github.com/open-webui/open-webui/releases), and performance against established benchmarks (https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks).
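And the inference-speed claim isn't something you have to take on faith; anyone can measure local throughput themselves. Here's a minimal sketch using the llama-cpp-python bindings (the model path and prompt are placeholders, not recommendations):

```python
# Rough local throughput check, assuming llama-cpp-python is installed
# (pip install llama-cpp-python) and a GGUF model file exists locally.
import time

from llama_cpp import Llama

# Placeholder path: swap in whatever GGUF model you actually have.
llm = Llama(model_path="./models/example.gguf", n_ctx=2048, verbose=False)

start = time.perf_counter()
out = llm("Explain what a GGUF file is in one sentence.", max_tokens=128)
elapsed = time.perf_counter() - start

# The response dict reports how many tokens were actually generated.
generated = out["usage"]["completion_tokens"]
print(f"{generated} tokens in {elapsed:.2f}s ({generated / elapsed:.1f} tok/s)")
```

Run that against the same model across a few llama.cpp releases and the speedups stop being an abstract claim.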