Improvement of Restore Process in Ops Manager
Currently, the Ops Manager restore process removes all data from the target MongoDB deployment, restores the user-selected snapshot, and then applies the point-in-time (PIT) restore.
The time taken by a restore operation increases with the size of the database. The restore also requires the applications to be shut down, so a longer restore means longer application downtime.
To minimize downtime, we need to reduce the restore time.
One suggestion is to automatically restore the latest available snapshot to a standby replica set (managed by Ops Manager) on a regular basis. When a restore is needed, only the point-in-time recovery would be performed: it would apply just the oplog changes made since the last restored snapshot to the standby replica set, without first deleting the data on that replica set. The applications could then be pointed to the standby replica set as soon as the PIT restore completes, which minimizes downtime.
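The decision behind this workflow can be sketched as follows. This is only an illustration of the idea, not Ops Manager code: the function name `plan_restore` and the step names are hypothetical. It also assumes that the oplog can only be replayed forward, so a target time earlier than the snapshot already on the standby would still need the existing full restore.

```python
from datetime import datetime
from typing import List


def plan_restore(restored_snapshot_time: datetime, target_time: datetime) -> List[str]:
    """Return the ordered (hypothetical) steps to bring the standby to target_time.

    restored_snapshot_time: time of the snapshot already restored on the standby.
    target_time: the point in time the user wants to recover to.
    """
    if target_time >= restored_snapshot_time:
        # The standby already holds a snapshot at or before the target,
        # so only the oplog entries after that snapshot need to be applied.
        return ["replay_oplog"]
    # The target is older than the staged snapshot; oplog replay cannot go
    # backwards, so fall back to the current wipe-and-restore behaviour.
    return ["wipe_data", "restore_snapshot", "replay_oplog"]


if __name__ == "__main__":
    staged = datetime(2024, 1, 1, 0, 0)
    print(plan_restore(staged, datetime(2024, 1, 1, 6, 0)))
    print(plan_restore(staged, datetime(2023, 12, 31, 18, 0)))
```

In the common case, where the requested point in time is later than the most recent scheduled restore, only the oplog replay step remains inside the downtime window.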
For this, the restore operation needs an option that lets the user choose whether the data in the target replica set is removed before the PIT restore is performed.
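One way such an option might look is sketched below. Every name here (`PitRestoreRequest`, `wipe_target_data`, `base_snapshot_id`) is hypothetical and does not correspond to any existing Ops Manager API. The sketch assumes the option defaults to the current behaviour, and that keeping the data is only valid when the request names the snapshot already restored on the target.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PitRestoreRequest:
    """Hypothetical shape of a PIT restore request with the proposed option."""
    target_replica_set: str
    point_in_time: str                      # ISO-8601 timestamp to recover to
    wipe_target_data: bool = True           # default keeps today's behaviour
    base_snapshot_id: Optional[str] = None  # snapshot already on the target

    def validate(self) -> None:
        # Keeping the existing data only makes sense if we know which
        # snapshot it came from, so the oplog replay has a starting point.
        if not self.wipe_target_data and self.base_snapshot_id is None:
            raise ValueError(
                "base_snapshot_id is required when wipe_target_data is False"
            )


if __name__ == "__main__":
    request = PitRestoreRequest(
        target_replica_set="standby-rs",
        point_in_time="2024-01-01T06:00:00Z",
        wipe_target_data=False,
        base_snapshot_id="snapshot-123",  # placeholder identifier
    )
    request.validate()
    print(request)
```

Defaulting the option to wiping the data means existing restore workflows would behave exactly as they do today unless the user opts in.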
Scheduling restores of the latest available snapshot to the standby replica set, and then performing only the PIT restore on top of the last restored snapshot whenever required, without deleting the data on the standby replica set, would allow the application data to be restored faster.
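The expected saving can be illustrated with a rough back-of-the-envelope model. This is a sketch under simple assumptions: restore time is taken to be data volume divided by a constant throughput, and the function name, parameters, and figures are illustrative placeholders, not measurements from any deployment.

```python
def estimate_downtime_hours(
    data_gb: float,
    snapshot_gb_per_hour: float,
    oplog_gb: float,
    oplog_gb_per_hour: float,
    snapshot_pre_staged: bool,
) -> float:
    """Estimate application downtime for a restore under a linear model.

    If the snapshot was already restored on the standby by a scheduled job,
    only the oplog replay falls inside the downtime window.
    """
    oplog_hours = oplog_gb / oplog_gb_per_hour
    if snapshot_pre_staged:
        return oplog_hours
    snapshot_hours = data_gb / snapshot_gb_per_hour
    return snapshot_hours + oplog_hours


if __name__ == "__main__":
    # Placeholder figures chosen only to show the shape of the comparison.
    current = estimate_downtime_hours(1000, 200, 10, 20, snapshot_pre_staged=False)
    proposed = estimate_downtime_hours(1000, 200, 10, 20, snapshot_pre_staged=True)
    print(f"current: {current} h, proposed: {proposed} h")
```

Under this model the snapshot restore term, which grows with database size, drops out of the downtime window entirely; what remains depends only on how much oplog has accumulated since the last scheduled restore.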