Collection size metrics
Hi,
From time to time, Atlas auto-scales up our clusters' disks, and we then need to analyze why. In some cases it is organic growth of the data we store, but in other cases TTL indexes are missing or misconfigured and we accumulate data we do not need.
In both cases, figuring out what is driving the disk growth is a very tedious process, as some clusters have thousands of collections.
To overcome this, we started running a small utility that gathers stats across all our collections. It iterates over all the organizations, projects, clusters, databases, and collections to collect total size, total index size, document count, etc., and stores this information once a day in BigQuery. We then built a simple Looker dashboard on top of it, and now it is very easy to find the collections that are growing and decide whether that growth is expected.
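
To illustrate the per-cluster part, here is a rough sketch in Python with pymongo; the connection URI, field names, and the skipped-database list are placeholders, and the Atlas Admin API calls that enumerate organizations, projects, and clusters are left out:

    from datetime import datetime, timezone
    from pymongo import MongoClient

    def collect_collection_stats(cluster_uri):
        """Gather per-collection size metrics from one cluster."""
        client = MongoClient(cluster_uri)
        snapshot_time = datetime.now(timezone.utc)
        rows = []
        for db_name in client.list_database_names():
            if db_name in ("admin", "local", "config"):
                continue  # skip internal databases
            db = client[db_name]
            # filter out views, which do not support $collStats
            for coll_name in db.list_collection_names(filter={"type": "collection"}):
                stats = next(db[coll_name].aggregate(
                    [{"$collStats": {"storageStats": {}}}]))["storageStats"]
                rows.append({
                    "ts": snapshot_time,
                    "db": db_name,
                    "collection": coll_name,
                    "documents": stats.get("count"),
                    "storage_bytes": stats.get("storageSize"),
                    "index_bytes": stats.get("totalIndexSize"),
                })
        return rows  # these rows are what we load into BigQuery once a day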
We think this would be beneficial to more than just us. We believe Atlas should provide it as a built-in service plus a charts dashboard for each project (or, better, each organization).
We'd be happy to share more details with your product team if this seems like a good idea to you.
Thanks,
Oren

-
Tapani commented
While it's not automated, you can get a point-in-time snapshot with the $collStats aggregation stage: https://www.mongodb.com/docs/manual/reference/operator/aggregation/collStats/
You could solve this over time by writing a scheduled trigger that takes a snapshot at the desired interval and stores it in a collection. Then it's a matter of creating a Charts view over that collection.
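
A minimal sketch of that snapshot logic, assuming the connection string, the monitored database, and a "metrics.collection_stats" history collection as placeholder names (Atlas scheduled triggers themselves run JavaScript functions, so this Python version only illustrates the idea):

    from datetime import datetime, timezone
    from pymongo import MongoClient

    client = MongoClient("mongodb+srv://...")        # placeholder connection string
    db = client["app_db"]                            # database being monitored (placeholder)
    history = client["metrics"]["collection_stats"]  # where snapshots accumulate (placeholder)

    now = datetime.now(timezone.utc)
    snapshot = []
    for name in db.list_collection_names(filter={"type": "collection"}):
        storage = next(db[name].aggregate(
            [{"$collStats": {"storageStats": {}}}]))["storageStats"]
        snapshot.append({
            "ts": now,
            "collection": name,
            "documents": storage.get("count"),
            "storage_bytes": storage.get("storageSize"),
            "index_bytes": storage.get("totalIndexSize"),
        })
    if snapshot:
        history.insert_many(snapshot)  # Charts can then plot growth from this collection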
-
Chris commented
I would like to be able to track the size of our collections (e.g. number of documents, total disk size) over time. It would be ideal if there were an automated way to do this with Atlas, perhaps with Charts?