Change Streams Monitoring and Alerting

Change streams can cause performance issues if not used properly. In some cases, administrators of multi-tenant databases have no control (nor should they) over how individual clients create change streams.

I think it is important that we accommodate these use cases by providing useful metrics on the Ops Manager/Atlas metrics pages, along with alerts on those metrics. Some potential metrics:
1. Number of change streams open
2. Average change stream lifetime
3. Query targeting ratios for change streams
4. Average time between consecutive polls of the change stream (and other statistics)
   -- the thought here is that change streams that are polled infrequently will result in less performant reads against the oplog
5. Number of documents read from change streams
6. Difference between the timestamp of the most recently consumed change stream event and the end of the oplog
7. Difference between the timestamp of the most recently consumed change stream event and the beginning of the oplog
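To make metrics 4, 6 and 7 in the list above concrete, here is a minimal Python sketch of the arithmetic. It assumes the monitoring agent already has, per change stream, the poll times and the timestamp of the last consumed event, plus the oldest and newest oplog timestamps, all as seconds since the epoch. The names `StreamPosition`, `stream_position` and `mean_poll_interval` are made up for illustration and are not part of any MongoDB API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StreamPosition:
    lag_behind_newest_s: float   # metric 6: distance to the end of the oplog
    headroom_to_oldest_s: float  # metric 7: distance to the beginning of the oplog


def stream_position(last_consumed_s, oplog_oldest_s, oplog_newest_s):
    """Locate a change stream inside the oplog window (hypothetical helper)."""
    return StreamPosition(
        lag_behind_newest_s=oplog_newest_s - last_consumed_s,
        headroom_to_oldest_s=last_consumed_s - oplog_oldest_s,
    )


def mean_poll_interval(poll_times_s):
    """Metric 4: average gap between consecutive polls, or None if fewer than two."""
    if len(poll_times_s) < 2:
        return None
    ordered = sorted(poll_times_s)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps)


# Example with made-up numbers: the stream is 30 s behind the newest entry
# and 600 s ahead of the oldest entry still held in the oplog.
pos = stream_position(last_consumed_s=1000, oplog_oldest_s=400, oplog_newest_s=1030)
```

An alert on metric 7 seems the most valuable of the three: once the headroom reaches zero, the stream's position has rolled off the oplog and the stream can no longer be resumed from that point.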

I realize that some of these are probably unrealistic to implement once the details are considered, but I'm interested in any useful metrics we can add regarding change streams. Currently the only way to retrieve some of this information is from the logs or via db.currentOp().
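As a rough illustration of the currentOp route, metric 1 could be approximated today by counting the operations whose originating aggregation contains a `$changeStream` stage. This is only a sketch: it works on plain dictionaries shaped the way I understand the `$currentOp` output to be (`cursor.originatingCommand.pipeline` for cursors, `command.pipeline` for the initial aggregate), so the field names should be checked against your server version. The helper names are hypothetical.

```python
from collections import Counter


def is_change_stream(op):
    """True if a currentOp-style document looks like a change stream cursor.

    Assumption: the pipeline is found under cursor.originatingCommand
    (getMore operations and idle cursors) or under command (initial aggregate).
    """
    cursor = op.get("cursor") or {}
    command = cursor.get("originatingCommand") or op.get("command") or {}
    pipeline = command.get("pipeline") or []
    return any(
        isinstance(stage, dict) and "$changeStream" in stage for stage in pipeline
    )


def count_change_streams_by_namespace(ops):
    """Count open change streams per namespace (hypothetical helper)."""
    counts = Counter(
        op.get("ns", "<unknown>") for op in ops if is_change_stream(op)
    )
    return dict(counts)


# With PyMongo the input could come from something like the following
# (not executed here; requires a live deployment and suitable privileges):
#   ops = client.admin.aggregate(
#       [{"$currentOp": {"allUsers": True, "idleCursors": True}}]
#   )
sample_ops = [
    {"ns": "shop.orders",
     "cursor": {"originatingCommand": {"aggregate": "orders",
                                       "pipeline": [{"$changeStream": {}}]}}},
    {"ns": "shop.orders",
     "command": {"aggregate": "orders",
                 "pipeline": [{"$changeStream": {"fullDocument": "updateLookup"}}]}},
    {"ns": "shop.users", "command": {"find": "users", "filter": {}}},
]
counts = count_change_streams_by_namespace(sample_ops)
```

This only gives a point-in-time count, which is part of why built-in metrics and alerts would be preferable to polling currentOp.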

9 votes

Errol shared this idea · Jun 30, 2023

Dhananjay commented · July 3, 2023 10:31 AM

Such metrics will assist in managing and monitoring the change streams. An immediate use case is for Ops Manager admins to pre-empt any rogue change streams that may affect the MongoDB cluster operation.
