I don't see the problem if you let the LLM generate multiple search queries at once. Even simple vector search returns multiple results per query.
> "How many cars from 1980 to 1985 and 1990 to 1997 had between 100 and 180PS without Diesel in the color blue that were approved for USA and Germany from Mercedes but only the E unit?"
I'm a human and I have a hard time parsing that query. Are you asking only about the Mercedes E-Class? And "how many cars", as in how many were sold?
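That said, the usual way to handle a compound question like this is to have the LLM split it into several simpler sub-queries and union the results. A minimal sketch of that idea, with an entirely hypothetical schema (`SubQuery`, `count_cars`, the field names, and the hard-coded decomposition are all illustrative, standing in for a real LLM call and a real database):

```python
from dataclasses import dataclass

# Hypothetical structured sub-query an LLM could emit for the car question.
# Every field name here is made up for illustration, not a real schema.
@dataclass(frozen=True)
class SubQuery:
    year_from: int
    year_to: int
    ps_min: int = 100
    ps_max: int = 180
    fuel_not: str = "diesel"
    color: str = "blue"
    markets: tuple = ("USA", "Germany")
    make: str = "Mercedes"
    series: str = "E"

def decompose(question: str) -> list[SubQuery]:
    # Stand-in for an LLM call: the two disjoint year ranges become
    # two sub-queries whose result sets are unioned afterwards.
    return [SubQuery(1980, 1985), SubQuery(1990, 1997)]

def matches(q: SubQuery, car: dict) -> bool:
    return (q.year_from <= car["year"] <= q.year_to
            and q.ps_min <= car["ps"] <= q.ps_max
            and car["fuel"].lower() != q.fuel_not
            and car["color"] == q.color
            and set(q.markets) <= set(car["markets"])
            and car["make"] == q.make
            and car["series"] == q.series)

def count_cars(question: str, cars: list[dict]) -> int:
    # A car counts if it matches any sub-query (union across year ranges).
    subqueries = decompose(question)
    return sum(any(matches(q, c) for q in subqueries) for c in cars)
```

The point is only that "multiple queries at once" turns one unparseable question into a few trivially filterable ones; whether the answer should count units sold or distinct models is still ambiguous either way.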
- Chunk properly;
- Elide "obviously useless files" that give mixed signals;
- Re-rank, and re-chunk the full files behind top-scoring matches;
- Throw in a little BM25 but with better stemming;
- Carry around a list of preferred files (and ideally preferred terms) to help re-rank;
And so on. Works great when you're an academic benchmaxing your toy Master's project. Now try building scalable vector search that runs on any codebase, knowing nothing about it in advance, and still gets a decent signal out of it.
Ha.