The verbosity is likely a result of the system prompt telling the LLM to be explanatory in its replies. If the system prompt were set to have the model output the shortest possible final answers, you would likely get the result you want. But then for other questions you would lose the benefit of a deeper explanation. It's a design tradeoff, I believe.
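
To illustrate what I mean, here's a minimal sketch that sends the same question under two different system prompts. It assumes an OpenAI-compatible local server (Ollama's default port here) and an illustrative model name; neither comes from this thread.

    # Same question under two system prompts; only the system prompt differs.
    # Assumes an OpenAI-compatible server on localhost:11434 (Ollama's default)
    # and an illustrative model name -- both are assumptions, not from the thread.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    question = "Why does the sky appear blue?"
    for system in ("You are a helpful assistant.",
                   "Reply with the shortest possible final answer only."):
        resp = client.chat.completions.create(
            model="llama3.1",
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": question}],
        )
        print(f"--- system: {system!r}")
        print(resp.choices[0].message.content)

In my experience the second prompt trades away the explanatory detail the first one gives you, which is the tradeoff I'm describing.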
reply
My system prompt is the default: "You are a helpful assistant." But that's beside the point. You don't want outputs that are too concise, as that would degrade the result, unless you are using a reasoning model.

I recommend rereading my top-level comment.

reply
Well, when I asked for a very long answer (prompt #2), the quality improved dramatically. So yes, a longer answer produces a better result, at least with the small LLMs I can run locally on my GPU.
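
Roughly how I ran the comparison, as a sketch: the prompts below are illustrative rather than my originals, and the endpoint and model name are the same assumptions as above.

    # Same question, with and without an explicit request for a long answer.
    # Prompts are illustrative stand-ins for my originals; assumes an
    # OpenAI-compatible local server and an illustrative model name.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    question = "How does garbage collection work in Python?"
    for label, prompt in [("#1 plain", question),
                          ("#2 long", question + " Give a very long, detailed answer.")]:
        resp = client.chat.completions.create(
            model="llama3.1",
            messages=[{"role": "user", "content": prompt}],
        )
        text = resp.choices[0].message.content
        print(f"[{label}] {len(text)} chars")
        print(text[:300])  # peek at the start of each answer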
reply