Have a configurable "active_for" alert parameter to only open alerts when conditions last a specified duration
Some conditions are only worth an alert if they persist for some time (usually minutes).
One example: some poorly performing background queries produce spikes in the "Scanned objects/returned objects" metric. We only want an alert if the ratio remains consistently high for x minutes/hours, instead of triggering the alert whenever a workload has a short spike that crosses the threshold.
My understanding of the current behavior is as follows: once the metric reaches the threshold (say, 1000), Atlas opens an alert right away (it shows up in the UI under open alerts) but holds back notifications until the notification delay has elapsed. If the metric recovers in time (which is usually the case), the alert is closed without any notification being sent.
It would be great to have, instead of or in addition to notification_delay, a setting such as active_for: the alert would only be opened once the metric has been above the threshold for the defined active_for period. The open/closed alerts tab in the Atlas UI should then only include alerts that were intended to be actionable and that someone actually gets notified about.
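
To make the requested semantics concrete, here is a minimal sketch of how an active_for rule could behave. It is not Atlas code or an Atlas API: the class ActiveForAlert, its fields, and the sample values are all hypothetical and only illustrate the idea that an alert opens after the metric has stayed above the threshold for the whole active_for period.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActiveForAlert:
    """Hypothetical alert rule: open only after a sustained breach."""

    threshold: float
    active_for_seconds: float
    breach_started_at: Optional[float] = None  # start of the current breach
    is_open: bool = False

    def observe(self, timestamp: float, value: float) -> bool:
        """Feed one metric sample; return True if the alert is open afterwards."""
        if value <= self.threshold:
            # Metric recovered: reset the timer and close any open alert.
            self.breach_started_at = None
            self.is_open = False
            return False
        if self.breach_started_at is None:
            # First sample above the threshold: start timing the breach.
            self.breach_started_at = timestamp
        if timestamp - self.breach_started_at >= self.active_for_seconds:
            # The breach has lasted long enough: open the alert.
            self.is_open = True
        return self.is_open


# Assumed example: threshold 1000, active_for of 5 minutes.
rule = ActiveForAlert(threshold=1000, active_for_seconds=300)
# (timestamp in seconds, metric value): a short spike, then a sustained breach.
samples = [(0, 1500), (60, 1800), (120, 200), (180, 1200), (240, 1300), (480, 1400)]
states = [rule.observe(t, v) for t, v in samples]
```

In this sketch the short spike at 0 to 60 seconds never opens an alert, so it would not appear in the open/closed alerts tab at all. Only the breach that starts at 180 seconds and is still present at 480 seconds opens one.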