upvote
For code specifically this is the hardest part — the "source data" (the codebase) changes constantly with every commit, but the AI config files that describe it don't update automatically.The approach that works best is AST-diffing rather than hash-based reindexing — you can detect semantic changes (function renamed, interface deleted) rather than just textual changes, which gives you much more precise invalidation signals.
reply
Depends on the use case, ie frequency and impact of changes.

Typically you would have a reindex process, and you keep track of hashes of chunks to check if you’ve already calculated this exact block before to avoid extra costs. And then run such a reindex process pretty frequently as it’s cheap / costs nothing when there are no changes.

reply
makes great sense, thanks!
reply