by buryat 6 hours ago

by aluzzardi 6 hours ago

From our experience running this, we're seeing patterns like these:

- Opus agent wakes up when we detect an incident (e.g. CI broke on main)

- It looks at the big picture (e.g. which job broke) and makes a plan to investigate

- It dispatches narrowly focused tasks to Haiku sub agents (e.g. "extract the failing log patterns from commit XXX on job YYY ...")

- Sub agents use the equivalent of "tail", "grep", etc. (using SQL) on a very narrow subset of logs (as directed by Opus) and return only the relevant data (so they can flag an INFO log as actually being the problem)

- Parent Opus agent correlates between sub agents. Can decide to spawn more sub agents to continue the investigation

It's no different than what I would do as a human, really. If there are terabytes of logs, I'm not going to read all of them: I'll make a plan, open a bunch of tabs and surface interesting bits.
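To make the "tail and grep, but in SQL" step concrete, here is a minimal sketch of the kind of narrow query tool a sub agent might be handed. The table schema, the column names, the sample rows, and the `grep_logs` helper are all assumptions made up for illustration; the comment above does not say how the logs are actually stored or queried.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory log table. Schema and rows are invented for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logs (ts TEXT, job TEXT, commit_sha TEXT, level TEXT, message TEXT)"
    )
    rows = [
        ("2024-01-01T10:00:00", "build", "abc123", "INFO", "starting build"),
        ("2024-01-01T10:00:05", "build", "abc123", "INFO", "retrying download, attempt 3"),
        ("2024-01-01T10:00:09", "build", "abc123", "ERROR", "download failed: timeout"),
        ("2024-01-01T10:00:10", "test", "abc123", "INFO", "starting tests"),
        ("2024-01-01T10:01:00", "build", "def456", "INFO", "starting build"),
    ]
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def grep_logs(conn, job, commit_sha, pattern, limit=50):
    """Rough SQL stand-in for `grep pattern | tail -n limit`, scoped to one job and commit.

    There is deliberately no filter on log level, so INFO lines that
    match the pattern come back alongside ERROR lines.
    """
    cur = conn.execute(
        """
        SELECT ts, level, message FROM (
            SELECT ts, level, message FROM logs
            WHERE job = ? AND commit_sha = ? AND message LIKE ?
            ORDER BY ts DESC LIMIT ?
        ) ORDER BY ts ASC
        """,
        (job, commit_sha, f"%{pattern}%", limit),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    for row in grep_logs(conn, "build", "abc123", "download"):
        print(row)
```

The point of the narrow scope is that the sub agent only ever sees the handful of rows it asked for, and the parent only sees whatever the sub agent decides is worth reporting.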

by prescriptivist 6 hours ago

I have an agent system analyzing time series data periodically. What I've landed on is having the tools themselves pre-process the time series data to give it more semantic meaning: converting timestamps to human-readable dates, and adding statistical analysis such as the min/mean/max of the current window, and the same for a trailing window, surfaced alongside the data. I also add a volatility score, collapse runs of similar series that aren't particularly interesting from a volatility perspective, and try to highlight anomalous series in the window in various ways.
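A rough sketch of that kind of pre-processing, using only the standard library. Everything specific here is an assumption for illustration: the function names, the `(unix_ts, value)` input shape, the choice of coefficient of variation as the "volatility score", and the `quiet_below` threshold are not taken from the system described above.

```python
from datetime import datetime, timezone
from statistics import mean, pstdev


def window_stats(values):
    """min/mean/max for one window of values."""
    return {"min": min(values), "mean": round(mean(values), 2), "max": max(values)}


def volatility(values):
    """Coefficient of variation. This is an assumed definition of 'volatility score'."""
    m = mean(values)
    return round(pstdev(values) / abs(m), 3) if m else 0.0


def human(ts):
    """Convert a unix timestamp to a human-readable UTC date."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M UTC")


def summarize(name, points, window):
    """points: list of (unix_ts, value), oldest first. window: points per window."""
    current = points[-window:]
    trailing = points[-2 * window:-window]  # the window just before the current one
    cur_vals = [v for _, v in current]
    summary = {
        "series": name,
        "from": human(current[0][0]),
        "to": human(current[-1][0]),
        "current": window_stats(cur_vals),
        "volatility": volatility(cur_vals),
    }
    if trailing:
        summary["trailing"] = window_stats([v for _, v in trailing])
    return summary


def render(summaries, quiet_below=0.1):
    """Show detail for volatile series and collapse the quiet ones into one line."""
    quiet = [s["series"] for s in summaries if s["volatility"] < quiet_below]
    lines = [
        f'{s["series"]} [{s["from"]} .. {s["to"]}] current={s["current"]} '
        f'trailing={s.get("trailing")} volatility={s["volatility"]}'
        for s in summaries
        if s["volatility"] >= quiet_below
    ]
    if quiet:
        lines.append(f"{len(quiet)} quiet series collapsed: {', '.join(quiet)}")
    return "\n".join(lines)


if __name__ == "__main__":
    cpu = [(0, 10), (60, 10), (120, 10), (180, 10)]
    latency = [(0, 100), (60, 100), (120, 100), (180, 300)]
    print(render([summarize("cpu", cpu, 2), summarize("latency", latency, 2)]))
```

The text that reaches the model is the output of `render`, not the raw points, which is where the extra tokens on small windows come from.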

This isn't anything new. It's not particularly technical or novel in any way, but it seems to work pretty well for identifying anomalies and comparing series over time horizons. On small windows it's even less token-efficient than piping in a bunch of JSON, but it seems to be more effective from an analysis point of view.

The strange thing about it is that it involves fairly deterministic analysis before we even send the data to the LLM, so one might ask: what's the point if you're already doing analysis? The answer is that LLMs can actually find interesting patterns across a lot of well-presented data, and they pick up on patterns in a way that feels like they are cross-referencing many different time series and correlating signals in interesting ways. That's where the general-purpose LLMs are helpful in my experience.

Breaking out analysis into sub-agents is a logical next step; we just haven't gotten there yet.

And yeah, the goal is to approximate those of us engineers who are good at RCAs in the moment, who have instincts about the system and can juggle a bunch of tabs and cross-reference the signals in them.

by azinman2 5 hours ago

So how can this be a company when it’s just what Claude Code already does?

by almosthere 3 hours ago

You may also want to have your agents write small scripts that auto-flag future logs.

Have an array of scripts to run against each log (probably just Rust code, for speed) and have them flag for performance, errors, intrusions, etc.
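One way to picture the "array of scripts" idea is a list of small flagger functions that each look at a log line and either return a label or nothing. The comment suggests Rust for speed; Python is used here only to keep the sketch short. The patterns, the labels, and the `duration_ms=` field are hypothetical and would be whatever the agents decide to write for a given system.

```python
import re
from typing import Callable, List, Optional

# A flagger takes one log line and returns a label, or None if the line is fine.
Flagger = Callable[[str], Optional[str]]


def regex_flagger(label: str, pattern: str) -> Flagger:
    """Flag any line matching the pattern (case-insensitive)."""
    compiled = re.compile(pattern, re.IGNORECASE)

    def check(line: str) -> Optional[str]:
        return label if compiled.search(line) else None

    return check


def slow_request_flagger(threshold_ms: int) -> Flagger:
    """Flag lines whose (hypothetical) duration_ms field exceeds the threshold."""
    duration = re.compile(r"duration_ms=(\d+)")

    def check(line: str) -> Optional[str]:
        m = duration.search(line)
        if m and int(m.group(1)) > threshold_ms:
            return "performance"
        return None

    return check


# The "array of scripts": illustrative rules only.
FLAGGERS: List[Flagger] = [
    regex_flagger("error", r"\b(error|exception|panic)\b"),
    regex_flagger("intrusion", r"failed login|invalid token"),
    slow_request_flagger(threshold_ms=1000),
]


def flag_line(line: str, flaggers: List[Flagger] = FLAGGERS) -> List[str]:
    """Run every flagger against one line and collect the labels that fired."""
    return [label for f in flaggers if (label := f(line)) is not None]


if __name__ == "__main__":
    for line in [
        "GET /api duration_ms=2500 status=200",
        "ERROR: failed login for admin",
        "INFO ok",
    ]:
        print(flag_line(line), line)
```

Because each flagger is independent, an agent can append a new one after an investigation without touching the others.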