Have the Live Migration tool re-establish connectivity with the on-prem cluster after connection failures
While trying out the Live Migration tool to transfer data from our on-prem cluster to Atlas, we hit an issue where Atlas lost connectivity to the on-prem cluster. Because the tool couldn't re-establish the connection, we had to cancel the migration and start the data sync all over again. That is fine for a small cluster, but it is not feasible for a large PROD cluster with 20+ shards and a data volume on the order of 40-50 TB compressed.
So, even if there is a network disconnect or maintenance on the source cluster that involves a primary flip, we would like the Atlas target cluster to be able to continue the data sync process.
We had already opened inbound network access from Atlas to all data members in our on-prem cluster. It looked like the Atlas Live Migration tool somehow couldn't re-establish connectivity with the new primary member in each shard. Thanks.
AdminAndrew (Admin, MongoDB) commented
Is it possible that you did not open up the inbound network access from the MongoDB Atlas Live Migration service to the secondary node that your cluster failed over to?
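One way to rule this out is to confirm that every member of every shard, not only the current primaries, accepts TCP connections on its mongod port. The following is a minimal sketch of such a check in Python. The function name `unreachable_members` and the hostnames and ports in the usage comment are hypothetical, and the check only tells you something useful if it is run from a host that is subject to the same firewall rules as the Live Migration service. It tests TCP reachability only; it does not verify TLS or authentication.

```python
import socket


def unreachable_members(members, timeout=3.0):
    """Return the (host, port) pairs that could not be reached over TCP.

    `members` is an iterable of (host, port) tuples, one per replica set
    member, including secondaries that could become primary after a failover.
    """
    failed = []
    for host, port in members:
        try:
            # Open and immediately close a TCP connection to the member.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            failed.append((host, port))
    return failed


# Hypothetical usage (hostnames and ports are placeholders):
# members = [("shard0-a.example.internal", 27018),
#            ("shard0-b.example.internal", 27018)]
# print(unreachable_members(members))
```

An empty list means every listed member accepted a connection. Any entry that is returned is a member the migration could not reach if it were elected primary.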