upvote
If you want the model to have function calls available, you need to run it in an agentic harness that can do the proper sandboxing etc. to keep things safe, and provide the tool spec and syntax in your system prompt. This is true of any model: AI inference on its own can only guess, it cannot do exact compute.
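As a rough illustration of what such a harness does (all names here are hypothetical, not any particular harness's API): it parses a tool call the model emits, executes it in a sandbox, and feeds the result back, so arithmetic comes from real execution rather than the model's guess:

```python
import json
import subprocess

# Hypothetical sketch of a harness's tool loop: the model emits a tool call
# as JSON, the harness executes it and returns the output to the model.
def run_tool(call: dict) -> str:
    if call["name"] == "python":
        # A real harness would run this inside a proper sandbox
        # (container, jail, seccomp, ...), not bare subprocess.
        proc = subprocess.run(
            ["python3", "-c", call["arguments"]["code"]],
            capture_output=True, text=True, timeout=10,
        )
        return proc.stdout.strip() or proc.stderr.strip()
    return f"unknown tool: {call['name']}"

# Simulated model output requesting exact arithmetic:
model_output = json.dumps({
    "name": "python",
    "arguments": {"code": "print(1775060800 % 3600)"},
})
print(run_tool(json.loads(model_output)))  # prints 1600: exact compute, not a guess
```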
reply
Thanks, I am very new to this and just run models in LM Studio. I think it would be very useful to have a system prompt telling the model to write Python scripts for the calculations LLMs are particularly bad at, and a harness to run those scripts. Can you recommend a harness that you like to use? I suppose the safety of these solutions is its own can of worms, but I am willing to try it.
reply
I use Claude Code. Codex and Opencode both work too. You could even do it with VS Code Copilot.
reply
These are typically coding oriented as opposed to general chat, so their system prompts may be needlessly heavy for that use case. I think the closest thing to a general solution is the emerging "claw" ecosystem, as silly as that sounds. Some of the newer "claws" do provide proper sandboxing.
reply
The date command is not wrong; it works with GNU date. If you are on macOS, try running gdate instead (if it is installed):

  gdate -u -d @1775060800
To install gdate and GNU coreutils:

  brew install coreutils
The date command still prints the incorrect value:

  Wed Apr 1 16:26:40 UTC 2026
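As a quick cross-check (a sketch, assuming Python 3 is available), converting that epoch value directly gives the same instant, which suggests date/gdate is faithfully rendering the timestamp and any 1600-second error is in the timestamp itself:

```python
from datetime import datetime, timezone

# Convert the epoch seconds from the thread into a UTC datetime.
dt = datetime.fromtimestamp(1775060800, tz=timezone.utc)
print(dt)  # 2026-04-01 16:26:40+00:00, matching what date/gdate print
```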
reply
Good catch; I just ran it verbatim in iTerm2 on macOS:

date -u -d @1775060800

date: illegal option -- d

Btw, how do you format commands in an HN comment correctly?

reply
Start the line indented with two or more spaces [1].

[1]: https://news.ycombinator.com/formatdoc

reply
The last paragraph made me chuckle.
reply
Given the working script, I don't follow how a broken verification step is supposed to lead to it being off by 1600 seconds.
reply
The model didn't run the script. As pointed out by @zozbot234 in another response, it would need to be run in an agentic harness. This prompt was executed in LM Studio, so it was just inference.
reply
I'm curious what the thinking trace looked like. Interesting that it can get that close to the answer yet still be off.
reply
[dead]
reply
deleted
reply