And I hear you on cross-network fragmentation — in a lot of real environments the hardest part isn’t search quality, it’s that data lives on different machines, different networks, and you only have partial visibility at any given time.
If you had to pick, would you rather have:
1. instant local indexing over whatever is reachable right now (even if incomplete), or
2. a lightweight distributed approach that indexes in place on each machine/network and only shares metadata/results across boundaries (rough sketch of what I mean below)?
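To make option 2 concrete, here's a rough sketch of "index in place, share only metadata" under my own assumptions; every name in it (Hit, LocalNode, federated_search) is hypothetical, not an existing tool:

    # Each node indexes what it can reach locally; only lightweight hit
    # metadata (host, path, score) ever crosses a machine/network boundary.
    # All names here are made up for illustration.

    from dataclasses import dataclass

    @dataclass
    class Hit:
        host: str     # which machine/network the match lives on
        path: str     # where it lives there; the content itself never moves
        score: float  # relevance score from that node's own local index

    class LocalNode:
        """Indexes whatever is reachable on this machine right now."""
        def __init__(self, host, docs):
            self.host = host
            self.docs = docs  # path -> text, indexed in place

        def search(self, query):
            q = query.lower()
            return [Hit(self.host, path, float(text.lower().count(q)))
                    for path, text in self.docs.items() if q in text.lower()]

    def federated_search(query, nodes):
        # Unreachable nodes are skipped, so the answer is explicitly
        # partial instead of the whole query failing.
        hits = []
        for node in nodes:
            try:
                hits.extend(node.search(query))
            except ConnectionError:
                continue
        return sorted(hits, key=lambda h: h.score, reverse=True)

    a = LocalNode("laptop", {"notes/todo.md": "grep the deploy logs again"})
    b = LocalNode("nas", {"docs/runbook.txt": "deploy checklist and rollback"})
    for hit in federated_search("deploy", [a, b]):
        print(f"{hit.host}:{hit.path} score={hit.score}")

The only point of the sketch is the boundary: a machine that drops off the network degrades the result set instead of blocking the query.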
I’m exploring this “latency + partial visibility” constraint as a first-class requirement (more context in my HN profile/bio if you want to compare notes).
The tension I’m trying to understand is that in a lot of real setups the “corpus” isn’t deliberately curated; it’s fragmented across machines, networks, and tools, and the opportunity cost of “move everything into one place” is exactly why people fall back to grep and ad-hoc search.
Do you think the right answer is always “accept the constraint and curate harder”, or is there a middle ground where you can keep sources where they are and still get reliable re-entry, even if what you get back is partial?
I’m collecting constraints like this as the core design input.