Ops Manager and Backup infrastracture Disaster Recovery support with K8s Operator
We have carried out tests with MongoDB v1.5.5 K8s Operator and Ops Manager 4.2.18 with Backup infrastructure (S3 Snapshots) in an Openshift 3.11 environment (MongoDB Support case attached).
In this case, a "Disaster Recovery" simulation has been carried out. However, several components created by the Operator had to be restored to obtain a similar state to the one before the "disaster".
Furthermore, it is very likely that the S3 Snapshots will be lost if the process is not completed in a certain manner.
It would be great to have an official approach to deploy/restore an OM resource using MongoDB K8s Operator with backed up data (AppDB, Oplog and S3 Metada in our case).
There is no current supported mechanism for backing up Ops Manager in a way that guarantees the data. As Ops Manager is itself a backup tool, it's challenging to maintain the integrity of the data in DR scenarios.
For this reason we recommend multi-site high availability for OM and AppDB. This is already possible when running OM on hardware of in VMs, but not currently supported in Kubernetes (unless a Kubernetes cluster is spanning sites).
Later this year (2023) we hope to support OM deployments across multiple Kubernetes clusters - as we already support (in beta) for Replica Sets (full release in April 2023 with Sharded cluster support in May/June 2023). Doing so will reduce the criticality of a OM/AppDB backup solution within Kubernetes.