And I hear you on cross-network fragmentation — in a lot of real environments the hardest part isn’t search quality, it’s that data lives on different machines, different networks, and you only have partial visibility at any given time.
If you had to pick, would you rather have:
1. instant local indexing over whatever is reachable right now (even if incomplete), or
2. a lightweight distributed approach that indexes in place on each machine/network and only shares metadata/results across boundaries (rough sketch of what I mean below)?
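To make option 2 concrete, here's a rough sketch of "index in place, share only metadata" under my own assumptions; every name in it (Hit, LocalNode, federated_search) is hypothetical, not an existing tool:

    # Each node indexes what it can reach locally; only lightweight hit
    # metadata (host, path, score) ever crosses a machine/network boundary.
    # All names here are made up for illustration.

    from dataclasses import dataclass

    @dataclass
    class Hit:
        host: str     # which machine/network the match lives on
        path: str     # where it lives there; the content itself never moves
        score: float  # relevance score from that node's own local index

    class LocalNode:
        """Indexes whatever is reachable on this machine right now."""
        def __init__(self, host, docs):
            self.host = host
            self.docs = docs  # path -> text, indexed in place

        def search(self, query):
            q = query.lower()
            return [Hit(self.host, path, float(text.lower().count(q)))
                    for path, text in self.docs.items() if q in text.lower()]

    def federated_search(query, nodes):
        # Unreachable nodes are skipped, so the answer is explicitly
        # partial instead of the whole query failing.
        hits = []
        for node in nodes:
            try:
                hits.extend(node.search(query))
            except ConnectionError:
                continue
        return sorted(hits, key=lambda h: h.score, reverse=True)

    a = LocalNode("laptop", {"notes/todo.md": "grep the deploy logs again"})
    b = LocalNode("nas", {"docs/runbook.txt": "deploy checklist and rollback"})
    for hit in federated_search("deploy", [a, b]):
        print(f"{hit.host}:{hit.path} score={hit.score}")

The only point of the sketch is the boundary: a machine that drops off the network degrades the result set instead of blocking the query.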
I’m exploring this “latency + partial visibility” constraint as a first-class requirement (more context in my HN profile/bio if you want to compare notes).
The tension I’m trying to understand is that in a lot of real setups the “corpus” isn’t deliberately curated; it’s fragmented across machines, networks, and tools, and the opportunity cost of “move everything into one place” is exactly why people fall back to grep and ad-hoc search.
Do you think the right answer is always “accept the constraint and curate harder”, or is there a middle ground where you can keep sources where they are and still get reliable re-entry, even if what you get back is partial?
I’m collecting constraints like this as the core design input.