Atlas Administration API: Expose more information on cluster status
We are using the Atlas Administration API to automate scaling of a production database cluster in a more fine-grained way than the existing autoscaling offering allows. The cluster is deployed as a replica set, so there are a number of electable nodes (one primary plus 2, 4, or 6 secondaries) and any number of read-only (RO) nodes.
The goal is to tune the number of nodes in a specific region and to monitor the progress of the change as it rolls out. Currently, the API is too limited to do this effectively, but the web console clearly has access to the information we need.
The current implementation of the returnAllMongodbProcessesInOneProject operation in the Administration API lists all processes in the project, which then have to be filtered down to the processes that belong to the specific cluster. Those processes do share a unique replicaSetName string, but that name appears to be internal to Atlas and only shows up in the connectionString -> standard URI for the given cluster. This information doesn't distinguish electable nodes from read-only or analytics nodes, nor does it indicate whether a machine is in the process of being deployed or decommissioned.
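For illustration, the filtering workaround described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the cluster's standard connection string carries a replicaSet query parameter and that each process entry exposes a replicaSetName field. The helper names, hostnames, and sample data are hypothetical, and fetching the data from the API is left out.

```python
from typing import Optional
from urllib.parse import parse_qs


def replica_set_name(standard_uri: str) -> Optional[str]:
    """Return the replicaSet query parameter of a standard connection string, if present."""
    # Everything after the first "?" is the option string of the URI.
    _, _, query = standard_uri.partition("?")
    values = parse_qs(query).get("replicaSet")
    return values[0] if values else None


def processes_for_cluster(processes: list, standard_uri: str) -> list:
    """Keep only the project processes whose replicaSetName matches the cluster's URI."""
    name = replica_set_name(standard_uri)
    if name is None:
        return []
    return [p for p in processes if p.get("replicaSetName") == name]


# Hypothetical sample data standing in for the two API responses.
standard = (
    "mongodb://a-00.example.net:27017,a-01.example.net:27017/"
    "?ssl=true&authSource=admin&replicaSet=atlas-abc123-shard-0"
)
all_processes = [
    {"id": "a-00.example.net:27017", "replicaSetName": "atlas-abc123-shard-0"},
    {"id": "a-01.example.net:27017", "replicaSetName": "atlas-abc123-shard-0"},
    {"id": "b-00.example.net:27017", "replicaSetName": "atlas-zzz999-shard-0"},
]

cluster_processes = processes_for_cluster(all_processes, standard)
print([p["id"] for p in cluster_processes])
```

Even with this filter in place, the result says nothing about node roles or about machines that are mid-deployment or mid-decommission, which is the gap this request is about.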
Because of this, the public Administration API only gets you part of the way there. It won't let a process on our side recover from being restarted during a cluster adjustment operation: there is no way to detect a machine being decommissioned (which will delay deploying a new change), nor would a net-zero change in machine count (e.g. moving from RO nodes to electable nodes) be observable.
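To make the net-zero case concrete, here is a small sketch of what a monitoring process could compute from the public process list. It assumes, for illustration only, that each process entry reports just a typeName such as REPLICA_PRIMARY or REPLICA_SECONDARY, with no electable or read-only marker. The summarize helper and the sample snapshots are hypothetical.

```python
from collections import Counter


def summarize(processes: list) -> Counter:
    """Count processes per reported typeName, the only role-like field assumed here."""
    return Counter(p.get("typeName", "UNKNOWN") for p in processes)


# Hypothetical snapshot before the change: 3 electable nodes + 2 RO nodes.
# The RO nodes are assumed to be reported as plain secondaries.
before = [
    {"typeName": "REPLICA_PRIMARY"},
    {"typeName": "REPLICA_SECONDARY"},  # electable
    {"typeName": "REPLICA_SECONDARY"},  # electable
    {"typeName": "REPLICA_SECONDARY"},  # read-only
    {"typeName": "REPLICA_SECONDARY"},  # read-only
]

# Hypothetical snapshot after converting both RO nodes to electable nodes
# (5 electable + 0 RO): the machine count is unchanged.
after = [
    {"typeName": "REPLICA_PRIMARY"},
    {"typeName": "REPLICA_SECONDARY"},  # electable
    {"typeName": "REPLICA_SECONDARY"},  # electable
    {"typeName": "REPLICA_SECONDARY"},  # electable (was read-only)
    {"typeName": "REPLICA_SECONDARY"},  # electable (was read-only)
]

# Both summaries are identical, so a restarted monitor cannot tell
# whether the change has been rolled out or not.
print(summarize(before) == summarize(after))
```

Under these assumptions the two snapshots are indistinguishable, so the monitor has no signal to resume from after a restart.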
The web console, on the other hand, has access to much more information via the (private) https://cloud.mongodb.com/nds/clusters/{groupId}/{name} and https://cloud.mongodb.com/nds/clusters/{groupId}/{name}/instances endpoints. These are, respectively, a more detailed version of the public returnOneCluster operation and an array of process states specific to the cluster (so a filtered version of returnAllMongodbProcessesInOneProject). These APIs let us monitor cluster change operations much, much better, as they include more detailed node role information (electable vs. read-only vs. analytics).
To properly manage scaling via the public API, we'd really need to have that information available in the public API too.