
In my case, what I like to do is extract the data into a machine-readable format; once the data is appropriately modeled, further steps can analyze it programmatically. As an example, I also used Claude Code on my taxes:

1. I keep all my accounts in accounting software (originally Wave, then beancount)

2. Because everything is programmatically queryable, the data itself never sits in token-space; only the schema and the logic do

I then use tax software to prepare my professional and personal returns. The LLM acts as a validator and checks that I've done my accounts right. I have `jmap` pull my mail via IMAP and my Mercury account via a read-only, transactions-only token, and then I let it compare those against my beancount records to make sure I've accounted for things correctly.
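To make the "compare against my records" step concrete, here is a minimal sketch of that kind of check. It is not my actual tooling: the row shape (`date`, `amount`) and the `reconcile` helper are assumptions for illustration, and a real setup would export rows from the bank and from beancount first.

```python
from collections import Counter
from decimal import Decimal


def reconcile(bank_rows, ledger_rows):
    """Return (missing_from_ledger, missing_from_bank) as sorted (date, amount) keys.

    Counter is used as a multiset so that two identical transactions on the
    same day are still matched one-to-one.
    """
    bank = Counter((r["date"], Decimal(r["amount"])) for r in bank_rows)
    ledger = Counter((r["date"], Decimal(r["amount"])) for r in ledger_rows)
    missing_from_ledger = sorted((bank - ledger).elements())
    missing_from_bank = sorted((ledger - bank).elements())
    return missing_from_ledger, missing_from_bank


# Hypothetical sample data; amounts are strings so Decimal parses them exactly.
bank_rows = [
    {"date": "2025-01-03", "amount": "-42.10"},
    {"date": "2025-01-05", "amount": "1500.00"},
    {"date": "2025-01-09", "amount": "-12.99"},
]
ledger_rows = [
    {"date": "2025-01-03", "amount": "-42.10"},
    {"date": "2025-01-05", "amount": "1500.00"},
]

missing_from_ledger, missing_from_bank = reconcile(bank_rows, ledger_rows)
print("Not yet in ledger:", missing_from_ledger)
print("Not in bank export:", missing_from_bank)
```

The model only has to read the short list of mismatches and suggest explanations; the matching itself never happens in token-space.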

For the most part, you want it handling very little arithmetic in token-space, though the SOTA models can do it pretty flawlessly. I did notice that they would occasionally make arithmetic errors in numerical comparisons, but when you use them as an assistant you're not relying on their output directly; you're using them as a hypothesis generator and a checker, and if you ask it to write out the reasoning it's pretty damned good.

For me, Opus 4.6 in Claude Code was remarkable for this use-case. These days, I just run `,cc accounts` and then look at the newly added accounts in fava and compare with Mercury. This is one of those tedious-to-enter, trivial-to-verify use-cases that they excel at.

To be honest, I was fine using Wave, but without machine-access it's software that's dead to me.

by shepherdjerred 3 hours ago

I've gotten better results by telling it "write a Python program to calculate X"

by brotchie 1 hour ago

For the tax thing, I had Claude write a CLI and a prompt for Gemini Flash 2.5 to do the structured extraction, i.e. .pdf -> JSON. The JSON schema was pretty flexible and open to interpretation by Gemini, so it didn't produce 100% consistent JSON structures.

To then "aggregate" all of the JSON outputs, I had Claude look at them and iterate on a Python tool to do it programmatically. I saw it go around a few times on this: write the most naive Python tool, run it, hit an exception, rinse and repeat, until it was able to parse all the JSON files sensibly.
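For a flavor of where that kind of tool tends to end up, here is a rough sketch, not the script Claude actually wrote. It assumes the inconsistency is mostly about nesting and key casing, and the key name `interest` and the sample documents are hypothetical.

```python
import json
from decimal import Decimal, InvalidOperation


def find_values(node, wanted_keys):
    """Recursively yield (key, value) for any wanted key, however deeply nested."""
    if isinstance(node, dict):
        for key, value in node.items():
            is_scalar = isinstance(value, (int, float, str)) and not isinstance(value, bool)
            if key.lower() in wanted_keys and is_scalar:
                yield key.lower(), value
            else:
                yield from find_values(value, wanted_keys)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, wanted_keys)


def aggregate(documents, wanted_keys):
    """Sum every wanted key across all JSON documents, skipping unparseable values."""
    totals = {}
    for text in documents:
        for key, value in find_values(json.loads(text), wanted_keys):
            try:
                amount = Decimal(str(value))
            except InvalidOperation:
                continue  # e.g. "N/A"; a real tool would log this for review
            totals[key] = totals.get(key, Decimal(0)) + amount
    return totals


# Two hypothetical extractions of the same kind of form, shaped differently.
documents = [
    '{"form": "1099-INT", "interest": 12.5}',
    '{"forms": [{"type": "1099-INT", "amounts": {"Interest": "7.50"}}]}',
]
totals = aggregate(documents, {"interest"})
print(totals)
```

Walking the tree instead of hard-coding paths is what lets one tool absorb the structural drift between files.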

by dmd 2 hours ago

Yeah, in my user prompt I have "Whenever you are asked to perform any operation which could be done deterministically by a program, you should write a program to do it that way and feed it the data, rather than thinking through the problem on your own." It's worked wonders.
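As an illustration of what that instruction tends to produce, here is a minimal sketch of a "feed it the data" program. The CSV columns (`category`, `amount`) and the function name are made up for the example; the point is only that the sums come from `Decimal` arithmetic rather than from the model.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def totals_by_category(csv_text):
    """Sum the amount column per category using exact decimal arithmetic."""
    totals = defaultdict(Decimal)  # Decimal() is Decimal("0")
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["category"]] += Decimal(row["amount"])
    return dict(totals)


# Hypothetical data; in practice this would be read from a file or stdin.
csv_text = """category,amount
software,19.99
software,5.01
travel,120.00
"""

result = totals_by_category(csv_text)
for category, total in sorted(result.items()):
    print(category, total)
```

You can read a script like this in a minute and rerun it whenever the data changes, which is most of the appeal.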

by cj 2 hours ago

Good call. I’ve also had better results pre-processing PDFs, extracting data into structured format, and then running prompts against that.

Which should pair well with the “write a script” tactic.

by tavavex 2 hours ago

Yeah, asking for a tool to do a thing is almost always better than asking for the thing directly, I find. LLMs are kind of not there in terms of always being correct with large batches of data. And when you ask for a script, you can actually verify what's going on in there, without taking leaps of faith.

by ElFitz 2 hours ago

I’d say for these use cases it’s better to make it build the tools that do the thing than to make it do the thing itself.

And it usually takes just as long.