I think we're running really different monitoring setups. I'd never expect my alerting solution to still be able to alert me if it's down or degraded, nor would I expect my metrics-gathering software to alert me if it's down. That's why I have monitoring set up for those things in the first place.

But I'm sure your setup makes as much sense in your context as mine does in my context. As long as it works for you, we're all happy :)

"I have monitoring set up for those things" - but that doesn't solve the ambiguity. When Prometheus misses a scrape, nothing fires on its own. Silence looks identical whether your service is down, the network blipped, or Prometheus itself is struggling. A defensive monitoring system has to treat absence of data as a signal, not as the absence of a problem.
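
To make "absence of data is a signal" concrete, here's a rough sketch of a staleness check. This is not how Prometheus does it internally; `find_silent_targets`, `last_seen`, and `max_silence` are names I made up for illustration, and it assumes you record a timestamp per target somewhere outside the system being watched.

```python
from typing import Dict, List, Optional


def find_silent_targets(
    last_seen: Dict[str, Optional[float]],
    now: float,
    max_silence: float,
) -> List[str]:
    """Return targets that have gone quiet (hypothetical helper).

    last_seen maps a target name to the Unix timestamp of its most recent
    sample, or None if no sample has ever arrived. A target counts as
    silent when it has never reported or its last sample is older than
    max_silence seconds.
    """
    silent = []
    for target, timestamp in sorted(last_seen.items()):
        # Missing data is treated as a problem in its own right,
        # instead of being read as "nothing is wrong".
        if timestamp is None or now - timestamp > max_silence:
            silent.append(target)
    return silent


if __name__ == "__main__":
    samples = {"api": 100.0, "db": 40.0, "cache": None}
    print(find_silent_targets(samples, now=110.0, max_silence=30.0))
```

The point is that the check runs on a clock, not on incoming data, so it still fires when nothing arrives. It can't tell you why a target went quiet, only that it did.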