undefined

points

[-]

The fundamental problem wit RAG is that it extracts only surface level features, "31+24" won't embed close to "55", while "not happy" will be close to "happy". Another issue is that embedding similarity does not indicate logical dependency, you won't retrieve the callers of a function with RAG, you need a LLM or code for that. Third issue is chunking, to embed you need to chunk, but if you chunk you exclude information that might be essential.

The best way to search I think is a coding agent with grep and file system access, and that is because the agent can adapt and explore instead of one shotting it.

I am making my own search tool based on the principle of LoD (level of detail) - any large text input can be trimmed down to about 10KB size by doing clever trimming, for example you could trim the middle of a paragraph keeping the start and end, or you could trim the middle of a large file. Then an agent can zoom in and out of a large file. It skims structure first, then drills into the relevant sections. Using it for analyzing logs, repos, zip files, long PDFs, and coding agent sessions which can run into MB size. Depending on content type we can do different types of compression for code and tree structured data. There is also a "tall narrow cut" (like cut -c -50 on a file).

The promise is - any size input fit into 10KB "glances" and the model can find things more efficiently this way without loading the whole thing.

by andai1 hours ago|

parent|

[-]

>The best way to search I think is a coding agent with grep and file system access, and that is because the agent can adapt and explore instead of one shotting it.

I tried the knowledge base feature in Claude web recently, uploaded a long textbook.

The indexer crashed and the book never fully indexed, but Claude had access to some kind of VM and reverse engineered the (automatically converted) book's fileformat and used shell tools to search it for the answers to my questions.

(Again, this was the web version of Claude, not Claude Code on my computer!)

I thought that was really neat, a little silly, and a little scary.

by visarga7 hours ago|

parent|

prev|

[-]

Ok 2 hours later here is the release: https://github.com/horiacristescu/nub

by andai1 hours ago|

parent|

[-]

>Code retains its function signatures

Nice. In original GPT-4 days (April 2023), I made a simple coding agent that worked with GPT-4's 8K (!) context window. The original version used some kind of AST walker, but then I realized I can get basically identical result (for Python) with `grep def` and `grep class`...

Took a look at your repo though, I am impressed you put a lot of thought into this.

It's interesting that Anthropic doesn't seem to be incentivized to do anything like this. Their approach seems to be "spawn a bunch of Haikus to grep around whole codebase until one of them finds something". You'd think a few lines of code could give you an overview of the codebase before you go blindly poking around. But maybe they're optimized for massive codebases where even the skeletons end up eating huge amounts of context.

The subagents "solve" context pollution by dying. If they find something, they only tell you the parent agent where it is. If not, they tell nothing. I guess that works but it feels heavy-handed somehow.

In CC I added a startup hook that similar to yours, dumps the skeleton of current dir, files, function names etc. into context, and the "time spent poking around" drops to zero.

by gervwyk4 hours ago|

parent|

prev|

[-]

This is a very cool idea. I’ve been dragging CC around very large code bases with a lot of docs and stuff. it does great but can be a swing and a miss.. have been wondering if there is a more efficient / effective way. This got me thinking. Thanks for sharing!

by y1n09 hours ago|

prev|

[-]

Context rot is still a problem though, so maybe vector search will stick around in some form. Perhaps we will end up with a tool called `vector grep` or `vg` that handles the vectorized search independent of the agent.