Encryption at rest with our own keys - Manage alert and shutdown parameters
It would be useful to improve the encryption at rest feature so that we can define the alert parameters used when the key is unavailable, as well as the delay before a replica set is shut down.
For example, I want to be alerted if my key is unavailable x times in the last 10 minutes, and have the replica set stopped after 60 minutes if the situation is not resolved. This would let the support team take corrective action without causing downtime.
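To make the request concrete, here is a minimal sketch of the kind of policy described above. It is purely illustrative: the class, its parameter names, and the "ok" / "alert" / "shutdown" states are hypothetical and do not correspond to any existing Atlas setting or API. It assumes the caller performs periodic key checks and passes in each result with a timestamp in seconds.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KeyAvailabilityPolicy:
    """Hypothetical policy object; none of these names are Atlas settings."""

    alert_threshold: int = 3         # the "x" failed checks from the request
    alert_window_s: int = 10 * 60    # look-back window: 10 minutes
    shutdown_grace_s: int = 60 * 60  # delay before shutdown: 60 minutes
    _failures: deque = field(default_factory=deque)
    _outage_start: Optional[float] = None

    def record(self, now: float, key_available: bool) -> str:
        """Record one key check and return "ok", "alert" or "shutdown"."""
        if key_available:
            # The key is reachable again, so the shutdown countdown resets.
            self._outage_start = None
            return "ok"

        if self._outage_start is None:
            self._outage_start = now
        self._failures.append(now)

        # Keep only the failures that fall inside the look-back window.
        while self._failures and self._failures[0] <= now - self.alert_window_s:
            self._failures.popleft()

        if now - self._outage_start >= self.shutdown_grace_s:
            return "shutdown"
        if len(self._failures) >= self.alert_threshold:
            return "alert"
        return "ok"


if __name__ == "__main__":
    policy = KeyAvailabilityPolicy(alert_threshold=3)
    # Failed checks at 0 s, 60 s and 120 s, then one after a full hour.
    for t in (0, 60, 120, 3600):
        print(t, policy.record(t, key_available=False))
```

With these values, the third failed check inside the 10-minute window raises the alert, and shutdown is only reached once the key has been unavailable for the full 60 minutes, which is the window the support team would have to fix the problem.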
-
Geoffrey commented
Hello MongoDB,
We again had a problem with encryption at rest on one of our MongoDB Atlas clusters. After a rotation of the secret used to access the encryption keys, one of the nodes of a cluster did not come back up. No alert is available in MongoDB Atlas to monitor this. We don't even have an event that reports this problem to us.
We received an email (which is not a reliable monitoring method for us) from MongoDB informing us of the problem (ISSUE REFERENCE: PROACTIVE-57330).
We opened a support ticket after reading this message 24 days after receiving it. It must be possible to monitor this type of incident via alerting, in order to catch this type of problem as early as possible and avoid a production incident.
See our ticket for this occurrence of the problem.
-
Geoffrey commented
Hello MongoDB,
We had a problem last week with encryption at rest. MongoDB was not able to reach our key manager on Azure even though it was available. MongoDB decided to shut down our cluster for 3 hours. Do you plan to prioritize this feature request? Not having it could compromise the use of MongoDB in production.
See ticket: 00859378: Encryption at rest using Key stored in KeyVault failed
-
Guillaume commented
Currently, if the encryption key is not available, Atlas sends an alert by email to the project owner. This alert is a default one and is not accessible in the Atlas alert configuration. We would like to be able to configure this alert so we can send it to PagerDuty or any other alerting system. And, as the feature request says, it would be great if the cluster were not shut down right away, so that we have time to fix the issue.