Adding balancer activity from sharding statistics to the Atlas UI
The customer often experiences performance issues in different clusters that are related to chunk migrations. In each case, the customer struggles to determine whether chunk migrations were occurring and whether they started at the same time as the performance issue. The observability currently available in the Atlas UI is too limited to identify this problem.
Running db.serverStatus().shardingStatistics on each shard provides the information the customer needs.
In particular, the key metrics below give good insight into balancer activity:
For donor:
- db.serverStatus().shardingStatistics.countDonorMoveChunkStarted
: The total number of times that MongoDB starts the moveChunk command or moveRange command on the primary node of the shard as part of the range migration procedure. This counter increases regardless of whether the chunk migrations succeed.
- db.serverStatus().shardingStatistics.countDocsClonedOnDonor
: The cumulative, always-increasing count of documents that MongoDB clones on the primary node of the donor shard.
- db.serverStatus().shardingStatistics.totalDonorMoveChunkTimeMillis
: Cumulative time in milliseconds to move chunks from the current shard to another shard. For each chunk migration, the time starts when a moveRange or moveChunk command starts, and ends when the chunk is moved to another shard in a range migration procedure.
- db.serverStatus().shardingStatistics.countDocsDeletedByRangeDeleter
: The cumulative, always-increasing count of documents that MongoDB deletes on the primary node of the donor shard during chunk migration.
- db.serverStatus().shardingStatistics.rangeDeleterTasks
: The current total of the queued chunk range deletion tasks that are ready to run or are running as part of the range migration procedure.
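Because most of the donor metrics above are cumulative, a single reading says little on its own; the change between two readings is what shows activity. The following is a minimal sketch, assuming two snapshots of shardingStatistics have already been collected from the donor shard's primary (for example with db.serverStatus().shardingStatistics in mongosh). The helper name donorDelta and the sample values are hypothetical and only for illustration.

```javascript
// Cumulative donor-side counters, named as in the list above.
const DONOR_COUNTERS = [
  "countDonorMoveChunkStarted",
  "countDocsClonedOnDonor",
  "totalDonorMoveChunkTimeMillis",
  "countDocsDeletedByRangeDeleter",
];

// Hypothetical helper: returns how much each cumulative counter grew
// between two shardingStatistics snapshots (prev taken before curr).
function donorDelta(prev, curr) {
  const delta = {};
  for (const name of DONOR_COUNTERS) {
    // Number() is used on the assumption that the shell may return
    // 64-bit values as wrapper objects; missing fields count as 0.
    delta[name] = Number(curr[name] ?? 0) - Number(prev[name] ?? 0);
  }
  // rangeDeleterTasks is a point-in-time value, not a counter,
  // so the latest reading is reported unchanged.
  delta.rangeDeleterTasks = Number(curr.rangeDeleterTasks ?? 0);
  return delta;
}

// Made-up sample snapshots, not real cluster data.
const before = {
  countDonorMoveChunkStarted: 2,
  countDocsClonedOnDonor: 100,
  totalDonorMoveChunkTimeMillis: 500,
  countDocsDeletedByRangeDeleter: 100,
  rangeDeleterTasks: 1,
};
const after = {
  countDonorMoveChunkStarted: 5,
  countDocsClonedOnDonor: 400,
  totalDonorMoveChunkTimeMillis: 2000,
  countDocsDeletedByRangeDeleter: 250,
  rangeDeleterTasks: 3,
};

console.log(donorDelta(before, after));
```

Since serverStatus counters restart from zero when the mongod process restarts, a negative delta most likely indicates a restart between the two snapshots rather than real activity.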
For recipient:
- db.serverStatus().shardingStatistics.countRecipientMoveChunkStarted
: Cumulative, always-increasing count of chunks this member, acting as the primary of the recipient shard, has started to receive (whether the move has succeeded or not).
- db.serverStatus().shardingStatistics.countDocsClonedOnRecipient
: The cumulative, always-increasing count of documents that MongoDB clones on the primary node of the recipient shard.
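To illustrate how these metrics could answer the question of whether migrations were running at the time of a performance issue, the sketch below scans a series of timestamped shardingStatistics snapshots and reports the intervals in which any migration counter increased. It assumes the snapshots were gathered periodically by some external means; the function name intervalsWithMigration, the { time, stats } sample shape, and the sample values are hypothetical.

```javascript
// Counters that only grow while a shard is donating or receiving chunks.
const MIGRATION_COUNTERS = [
  "countDonorMoveChunkStarted",
  "countRecipientMoveChunkStarted",
  "countDocsClonedOnDonor",
  "countDocsClonedOnRecipient",
];

// Hypothetical helper: samples is an array of { time, stats } objects
// ordered by time, where stats is a shardingStatistics snapshot.
// Returns the sampling intervals in which any migration counter grew.
function intervalsWithMigration(samples) {
  const active = [];
  for (let i = 1; i < samples.length; i++) {
    const prev = samples[i - 1];
    const curr = samples[i];
    const moved = MIGRATION_COUNTERS.some(
      (name) => Number(curr.stats[name] ?? 0) > Number(prev.stats[name] ?? 0)
    );
    if (moved) {
      active.push({ from: prev.time, to: curr.time });
    }
  }
  return active;
}

// Made-up samples, not real cluster data.
const samples = [
  { time: "10:00", stats: { countRecipientMoveChunkStarted: 4, countDocsClonedOnRecipient: 900 } },
  { time: "10:05", stats: { countRecipientMoveChunkStarted: 4, countDocsClonedOnRecipient: 900 } },
  { time: "10:10", stats: { countRecipientMoveChunkStarted: 5, countDocsClonedOnRecipient: 1500 } },
];

console.log(intervalsWithMigration(samples));
```

Comparing the returned intervals with the time of the performance issue gives a rough indication of overlap; the precision is limited by how often the snapshots are taken.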