Alert for WiredTiger Cache
Can you please create an alert for WiredTiger metrics, such as used cache?
We had several cluster instances exceed 5% dirty data in the WiredTiger cache and would like to be notified when this happens.
This is needed in order to determine whether
Jumping back in after I thought that this link was no longer present.
To elaborate more about the use case:
Currently, Atlas does not show WiredTiger cache usage as a percentage, only the amount used in GB.
We have already had a few incidents of poor instance performance caused by critical (>90%) levels of cache utilization.
Displaying this type of information and metric will assist us in catching these issues before they snowball and affect our system. In addition, if the matter is related to natural growth and workload, we would know that it is time to upgrade the cluster to its next tier.
Used bytes exceeding 80% of the WT cache, or dirty bytes consistently exceeding 5%, are signs that the system is undersized. I have to set this explicitly for many of our customers, which means performing the calculation and setting the alert per cluster within each project; there is no way to do this as a global alert within Ops Manager. Moreover, in cases where the WT cache is not the standard (RAM - 1 GB) / 2, there is no easy way to do this calculation.
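For illustration, the per-cluster calculation described above could be sketched in Python roughly as follows. The three field names are the ones `serverStatus` reports under `wiredTiger.cache`; the function names and the sample numbers are hypothetical, and the 80% and 5% limits are simply the figures quoted in this thread.

```python
# Field names as reported by serverStatus under wiredTiger.cache.
MAX_BYTES = "maximum bytes configured"
USED_BYTES = "bytes currently in the cache"
DIRTY_BYTES = "tracked dirty bytes in the cache"


def cache_percentages(cache):
    """Return (used %, dirty %) relative to the configured cache size."""
    max_bytes = cache[MAX_BYTES]
    used_pct = 100.0 * cache[USED_BYTES] / max_bytes
    dirty_pct = 100.0 * cache[DIRTY_BYTES] / max_bytes
    return used_pct, dirty_pct


def check_cache(cache, used_limit=80.0, dirty_limit=5.0):
    """Return a list of messages for each limit that is exceeded.

    The default limits are the thresholds mentioned in this thread,
    not values taken from any product.
    """
    used_pct, dirty_pct = cache_percentages(cache)
    alerts = []
    if used_pct > used_limit:
        alerts.append(f"cache used {used_pct:.1f}% > {used_limit:.0f}%")
    if dirty_pct > dirty_limit:
        alerts.append(f"cache dirty {dirty_pct:.1f}% > {dirty_limit:.0f}%")
    return alerts


# Hypothetical sample: 1 GiB cache, 900 MiB used, 64 MiB dirty.
sample = {
    MAX_BYTES: 1024 * 1024 * 1024,
    USED_BYTES: 900 * 1024 * 1024,
    DIRTY_BYTES: 64 * 1024 * 1024,
}
for message in check_cache(sample):
    print(message)
```

With PyMongo, the `cache` dictionary could be taken from `client.admin.command("serverStatus")["wiredTiger"]["cache"]`. Because the percentages are computed against the configured maximum, the same check works whether or not the cache uses the default size.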
It looks like your last sentence was cut off: can you provide more detail? It's super helpful to understand how you would use this information and why it's important to you so we can think about the problem space holistically.
Note that current cache alerts are based on absolute quantity (bytes, GB, etc.) and not %, whereas performance deals in percentages (cache util 80%, dirty fill 5%). Beyond requiring a calculation, the current need to specify an absolute quantity means that the alert must be manually updated any time the instance size changes.
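To make the maintenance burden concrete, here is a minimal sketch of how the absolute threshold shifts with instance size. It assumes the default WiredTiger sizing (the larger of 50% of (RAM - 1 GB) and 256 MB); the function names and the RAM sizes in the loop are illustrative only.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_cache_bytes(ram_bytes):
    """Default WiredTiger cache: larger of 50% of (RAM - 1 GiB) and 256 MiB."""
    return max((ram_bytes - GIB) // 2, 256 * MIB)


def absolute_threshold(ram_bytes, percent):
    """Bytes that correspond to `percent` of the default cache size."""
    return default_cache_bytes(ram_bytes) * percent // 100


# The same 80% / 5% targets map to different byte values per instance size,
# so a byte-based alert has to be re-entered after every resize.
for ram_gib in (8, 16, 32):
    ram = ram_gib * GIB
    used = absolute_threshold(ram, 80) / GIB
    dirty = absolute_threshold(ram, 5) / GIB
    print(f"{ram_gib} GiB RAM: used > {used:.2f} GiB, dirty > {dirty:.3f} GiB")
```

A percentage-based alert would remove this step entirely, since the threshold would follow the cache size automatically.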