Exactly this. qmd is the tool I use for the hybrid search portion. It also uses local LLMs to summarize your own markdown data. My agents use both, depending on the type of search they are doing, and both give good results.

https://github.com/tobi/qmd

reply
Both is usually the right answer, since you can use LLMs to do query expansion and effectively increase the recall of your retrieval algo.
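
A minimal sketch of the query-expansion idea, under loud assumptions: `fake_llm_expand` is a hypothetical stand-in for a real LLM call, the corpus is a toy dict, and the retriever is a plain exact-token match rather than anything qmd or a real search engine does. It only illustrates how taking the union of hits over rewritten queries can raise recall.

```python
from typing import Callable

# Toy corpus (hypothetical data, for illustration only).
DOCS = {
    "a": "auth error when token expires",
    "b": "login failure after password reset",
    "c": "authentication error in the oauth flow",
    "d": "disk full on build server",
}

def keyword_search(query: str, docs: dict[str, str]) -> set[str]:
    """Return ids of docs that contain every query term as an exact token."""
    terms = query.lower().split()
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.lower().split() for term in terms)
    }

def fake_llm_expand(query: str) -> list[str]:
    """Hypothetical stand-in for an LLM that rewrites a query."""
    rewrites = {"auth error": ["login failure", "authentication error"]}
    return rewrites.get(query, [])

def expanded_search(
    query: str,
    docs: dict[str, str],
    expand: Callable[[str], list[str]],
) -> set[str]:
    """Search with the original query plus its rewrites and union the hits."""
    hits: set[str] = set()
    for q in [query, *expand(query)]:
        hits |= keyword_search(q, docs)
    return hits

plain = keyword_search("auth error", DOCS)
expanded = expanded_search("auth error", DOCS, fake_llm_expand)
print(sorted(plain), sorted(expanded))
```

Here the plain query only matches the document that uses the exact words, while the expanded search also picks up the two documents that describe the same problem in different terms. Precision can drop with expansion, so a real setup would likely rerank the merged results.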
reply
That assumes that the agent knows which one is better. And to bake in which one is better via post-training would require a study like this to establish where each one works well.
reply
I’ve got a custom ultra-high-performance streaming semantic search that I exposed as a tool, and the RL bias in Claude is almost insurmountable without copious and consistent steering. Codex will follow instructions and use the tools I ask it to, but for god’s sake, between Claude asking to take a nap because it’s getting late in the session and it regressing to RL-biased tools like grep, it’s maddening. When I can get it to use my compositional tools, tool calls drop from around 20-50 to 3-4, but it’s almost impossible to steer.
reply
It will only use tools it was trained on? What's the benefit of giving it all the tools?
reply
I'm still disappointed that AI can't use ctags. It's used for finding strings and patterns, and it's right there.
reply
> I'm still disappointed that ai can't use ctags,

What do you mean by this? Do you mean it doesn't automatically build the index?

reply
It inspects a project, finds the ctags files, then goes on to use grep.
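
For context, a rough sketch of what using the tags file directly could look like instead of grepping the tree. The sample tags content and the `lookup_tag` helper are hypothetical, and this assumes the common tab-separated tags layout (name, file, ex command) with `!_TAG_` metadata lines; it ignores edge cases such as tabs inside the search pattern.

```python
# Hypothetical tags file content in the usual tab-separated layout.
SAMPLE_TAGS = (
    '!_TAG_FILE_FORMAT\t2\t/extended format/\n'
    '!_TAG_FILE_SORTED\t1\t/0=unsorted, 1=sorted/\n'
    'parse_config\tsrc/config.py\t/^def parse_config(path):$/;"\tf\n'
    'run_server\tsrc/server.py\t/^def run_server(port):$/;"\tf\n'
)

def lookup_tag(tags_text: str, symbol: str) -> list[tuple[str, str]]:
    """Return (file, ex_command) pairs for every tag entry named `symbol`."""
    matches = []
    for line in tags_text.splitlines():
        # Skip metadata lines and blanks.
        if line.startswith("!_TAG_") or not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # Not a well-formed tag entry.
        name, path, address = fields[0], fields[1], fields[2]
        if name == symbol:
            matches.append((path, address))
    return matches

print(lookup_tag(SAMPLE_TAGS, "run_server"))
```

One lookup like this gives the defining file and the line pattern for a symbol, which is the kind of answer that otherwise takes several grep calls to narrow down.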
reply