Improve the "host is down" alert by eliminating false positive alerts
Currently, whenever an index build is kicked off on a replica set, it tends to trigger a "host is down" alert. Although this is a benign false positive, the on-call DBA has to wake up in the middle of the night when the alert pages us, just to confirm that the host is NOT down. When the monitoring agent pings a node where an index build is running and fails to communicate with it, it assumes the host is down and triggers the alert, even though the host is up and running. When an index build is kicked off, it is captured in the mongodb.log. If the monitoring agent could check the mongodb.log for an index build, or somehow be notified that an index build is running on a node, and then NOT trigger a "host is down" alert, that would really help with all the false positive alerts being generated.
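To make the request concrete, here is a rough sketch of the kind of log check described above. It is only an illustration, not how the monitoring agent actually works. The function names and the marker patterns are assumptions: the exact wording of index build messages in mongodb.log varies by MongoDB version, so the patterns would need to be adjusted to match real log lines.

```python
import re

# Hypothetical marker patterns (assumptions): adjust to the wording your
# MongoDB version actually writes to mongodb.log.
BUILD_START = re.compile(r"index build: starting", re.IGNORECASE)
BUILD_END = re.compile(r"index build: (done|completed|failed)", re.IGNORECASE)


def index_build_in_progress(log_lines):
    """Return True if more index builds were started than finished."""
    active = 0
    for line in log_lines:
        if BUILD_START.search(line):
            active += 1
        elif BUILD_END.search(line) and active > 0:
            active -= 1
    return active > 0


def should_page(ping_ok, log_lines):
    """Page only if the ping failed and no index build appears to be running."""
    return (not ping_ok) and not index_build_in_progress(log_lines)
```

A check like this would have to be time-limited or downgraded to a low-priority notification rather than dropped outright, since a host can genuinely go down while an index build is running.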
-
Sri commented
I totally agree with you, Bhavini. There have been plenty of scenarios where we received false "host is down" alerts. I wish Mongo support would address these ASAP :)
-
Sateesh commented
For sure, this will help if these false alerts are suppressed for an index build.
-
Bhavini commented
I hope the support team will review and honor this request sooner rather than later, as it will help improve a DBA's on-call life. :)
-
Tim commented
This would be massively helpful and is a clear gap in the current monitoring capabilities.
-
David commented
This would be very helpful. There are too many situations where false "Host is down" alerts are received and this is the most common.