Right, more simply put, it's great at being a copycat: exploring similar data points that match your token needs.

It is not great at decision-making or judgment calls that don't have a well-defined spec or plan in place yet: unofficial or unapproved tokens, if you will. A lot of this stuff has simply never had specs, because it's internal to how companies work and part of their secret sauce.

The closest thing we have is governance and compliance policies, which legal and business needs require, so they're far better documented than the operational side of how we work. I guess what I'm saying is that it's more about the how than the what here.

But yeah, this is why it does great when there are tests, design systems, evals, and other artifacts to mirror. It's far more reckless and unpredictable without them, but still great for exploration and finding the output you're after.

by withinboredom 31 minutes ago

Doesn't that make sense? It's text prediction. If you give it examples, it can predict. Synthesizing "put semi-colons on new lines" requires it to generate its own examples 'in its head' (so to speak) and remember them. It won't.

It's like when I see people feeding it a whole bunch of "best practices" and expecting it to follow them. It won't. But you could ask it questions about the best practices all day long.
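To illustrate the examples-versus-rules point, here is a minimal sketch of building a prompt out of before/after pairs instead of a stated rule. The helper name `build_prompt`, the `Input:`/`Output:` labels, and the reading of "put semi-colons on new lines" used in the sample pairs are all assumptions for illustration. Nothing here calls a real model API.

```python
def build_prompt(examples, new_input):
    """Build a few-shot prompt: show input/output pairs, then the new input.

    `examples` is a list of (before, after) string pairs. The trailing
    "Output:" is left open so a text predictor continues the pattern.
    """
    parts = []
    for before, after in examples:
        parts.append(f"Input:\n{before}\nOutput:\n{after}\n")
    parts.append(f"Input:\n{new_input}\nOutput:\n")
    return "\n".join(parts)


# Hypothetical pairs demonstrating one possible reading of the style rule
# (semicolon moved to the start of the next line). The rule itself is never
# stated in the prompt; it is only shown by example.
examples = [
    ("a();", "a()\n;"),
    ("b(); c();", "b()\n;c()\n;"),
]

prompt = build_prompt(examples, "z();")
print(prompt)
```

The prompt hands the model something to mirror rather than asking it to derive instances of the rule on its own.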

by mikeyouse 2 hours ago

I ran into similar issues as we started to roll out LLM-generated financials in our org. I’m so used to the old SQL workflow of “grab this data from this table, that data from that table, combine it into a final result that looks like xxxx”, where the tables were outputs from reports in our ERP, but I was getting terrible results.

Ended up pointing Claude at a few sample files from our existing reporting, gave it read-only OAuth access to the ERP, and said “build a new report showing the cash by project as calculated by xxxx - yyyy + zzzz in the style of the existing reports”, and it basically one-shotted it from there.

Kind of crazy. I built a bunch of redundant checksums because I honestly didn’t think it would be able to replace something like 6 workdays of effort for the 2 FTEs who generate that kind of thing manually every month, but so far so good.
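For anyone wondering what redundant checks like these might look like, here is a rough sketch, not the actual setup described above. It recomputes cash per project from source rows and compares the result against the generated report three ways: project coverage, per-project values, and grand total. The field names (`project`, `cash`, and the placeholder components `xxxx`, `yyyy`, `zzzz`) and the one-cent tolerance are assumptions. The real ERP fields are not given in the thread.

```python
from decimal import Decimal


def expected_cash(row):
    # Placeholder formula from the comment: xxxx - yyyy + zzzz.
    # The keys are hypothetical stand-ins for the real ERP fields.
    return row["xxxx"] - row["yyyy"] + row["zzzz"]


def reconcile(source_rows, report_rows, tolerance=Decimal("0.01")):
    """Return a list of discrepancies between source data and the report."""
    problems = []
    expected = {r["project"]: expected_cash(r) for r in source_rows}
    reported = {r["project"]: r["cash"] for r in report_rows}

    # Check 1: both sides cover the same set of projects.
    for project in sorted(expected.keys() ^ reported.keys()):
        problems.append(f"{project}: present in only one of source/report")

    # Check 2: per-project values agree within the tolerance.
    for project in sorted(expected.keys() & reported.keys()):
        if abs(expected[project] - reported[project]) > tolerance:
            problems.append(
                f"{project}: expected {expected[project]}, "
                f"reported {reported[project]}"
            )

    # Check 3: grand totals agree (redundant with check 2 on purpose).
    total_expected = sum(expected.values(), Decimal("0"))
    total_reported = sum(reported.values(), Decimal("0"))
    if abs(total_expected - total_reported) > tolerance:
        problems.append(
            f"total: expected {total_expected}, reported {total_reported}"
        )
    return problems
```

An empty list means every check passed. `Decimal` is used rather than floats so that currency amounts compare exactly.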
