Atlas

Share your idea. To help us prioritize, please include the following information:

A brief description of what you are looking to do
How you think this will help
Why this matters to you

How can we improve the platform?

Integrate the Atlas Replication Lag alert into the Prometheus API

This is a request to include metrics in the Prometheus API so that a Replication Lag alarm can be implemented.
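
As a rough illustration of the value such a metric could carry, here is a minimal sketch that derives per-secondary lag from data shaped like the `members` array of `replSetGetStatus`. The function name and the sample data are hypothetical, and only the `name`, `stateStr`, and `optimeDate` fields are assumed to be present.

```python
from datetime import datetime, timezone


def replication_lag_seconds(members):
    """Return {member name: lag in seconds} for every secondary.

    `members` mimics the `members` array of replSetGetStatus output;
    only `name`, `stateStr`, and `optimeDate` are read.
    """
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample data, not real cluster output.
sample_members = [
    {"name": "host-a:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2024, 1, 8, 12, 0, 30, tzinfo=timezone.utc)},
    {"name": "host-b:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 1, 8, 12, 0, 25, tzinfo=timezone.utc)},
    {"name": "host-c:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 1, 8, 12, 0, 0, tzinfo=timezone.utc)},
]

lags = replication_lag_seconds(sample_members)
```

An alert rule could then compare each value against a threshold chosen by the operator.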

6 votes

1 comment · Monitoring and Metrics

Adding balancer activity from sharding Statistics into Atlas UI

Customers often experience performance issues in different clusters related to chunk migrations, and in each case they struggle to determine whether chunk migrations were occurring and had started at the same time as the performance issue. The observability available in the Atlas UI for identifying this problem is limited.

Using db.serverStatus().shardingStatistics on each shard can provide what the customers need.

https://www.mongodb.com/docs/manual/reference/command/serverStatus/#mongodb-serverstatus-serverstatus.shardingStatistics

In particular, the key metrics below provide good insight into balancer activity:

For donor:

db.serverStatus().shardingStatistics.countDonorMoveChunkStarted

: The total number of times that MongoDB starts the moveChunk command or moveRange command on the primary node of the shard as part of the range migration procedure. This increasing number does not consider whether the chunk migrations succeed or not.

db.serverStatus().shardingStatistics.countDocsClonedOnDonor

: The cumulative, always-increasing count of documents that MongoDB clones on the primary node of the donor shard.

db.serverStatus().shardingStatistics.totalDonorMoveChunkTimeMillis

: Cumulative time in milliseconds to move chunks from the current shard to another shard. For each chunk migration, the time starts when a moveRange or moveChunk command starts, and ends when the chunk is moved to another shard in a range migration procedure.

db.serverStatus().shardingStatistics.countDocsDeletedByRangeDeleter

: The cumulative, always-increasing count of documents that MongoDB deletes on the primary node of the donor shard during chunk migration.

db.serverStatus().shardingStatistics.rangeDeleterTasks

: The current total of the queued chunk range deletion tasks that are ready to run or are running as part of the range migration procedure.

For recipient:

db.serverStatus().shardingStatistics.countRecipientMoveChunkStarted

: Cumulative, always-increasing count of chunks this member, acting as the primary of the recipient shard, has started to receive (whether the move has succeeded or not).

db.serverStatus().shardingStatistics.countDocsClonedOnRecipient

: The cumulative, always-increasing count of documents that MongoDB clones on the primary node of the recipient shard.
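
Because most of the fields above are cumulative counters, migration activity shows up as a change between two readings rather than as an absolute value. The following is a minimal sketch of that idea, assuming two snapshots of `shardingStatistics` are available as plain dictionaries; the function name and the sample numbers are hypothetical. `rangeDeleterTasks` is left out because it is a current total rather than a counter.

```python
# Counter names taken from the list above.
MIGRATION_COUNTERS = (
    "countDonorMoveChunkStarted",
    "countDocsClonedOnDonor",
    "totalDonorMoveChunkTimeMillis",
    "countDocsDeletedByRangeDeleter",
    "countRecipientMoveChunkStarted",
    "countDocsClonedOnRecipient",
)


def migration_activity(previous, current):
    """Return the counters that increased between two snapshots, with deltas.

    A negative delta (for example after a process restart resets the
    counters) is ignored rather than reported as activity.
    """
    deltas = {}
    for name in MIGRATION_COUNTERS:
        delta = current.get(name, 0) - previous.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas


# Hypothetical snapshots taken one minute apart on a donor shard.
before = {
    "countDonorMoveChunkStarted": 10,
    "countDocsClonedOnDonor": 5000,
    "totalDonorMoveChunkTimeMillis": 90000,
    "countRecipientMoveChunkStarted": 4,
}
after = {
    "countDonorMoveChunkStarted": 12,
    "countDocsClonedOnDonor": 7500,
    "totalDonorMoveChunkTimeMillis": 130000,
    "countRecipientMoveChunkStarted": 4,
}

activity = migration_activity(before, after)
```

A non-empty result for an interval that overlaps a performance issue would suggest that chunk migrations were running at the time.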

2 votes

0 comments · Monitoring and Metrics

Query profiler drilldown by number of write conflicts

One sort of issue that can lead to elevated CPU utilization but is otherwise hard to identify is queries with high numbers of write conflicts. Having some way to drill down to queries that exhibit these symptoms would be quite convenient.
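
To illustrate the kind of drilldown meant here, the sketch below ranks operations by their write-conflict count. It assumes each profiler or slow-query entry is available as a dictionary with a `writeConflicts` field; the function name, the `ns` and `op` keys in the sample, and the sample values are all hypothetical.

```python
def top_write_conflicts(entries, limit=3):
    """Return up to `limit` entries with the most write conflicts.

    Entries without a `writeConflicts` field, or with a count of zero,
    are left out of the result.
    """
    conflicted = [e for e in entries if e.get("writeConflicts", 0) > 0]
    conflicted.sort(key=lambda e: e["writeConflicts"], reverse=True)
    return conflicted[:limit]


# Hypothetical entries, not real profiler output.
sample_entries = [
    {"ns": "shop.orders", "op": "update", "writeConflicts": 42},
    {"ns": "shop.carts", "op": "update", "writeConflicts": 0},
    {"ns": "shop.stock", "op": "update", "writeConflicts": 7},
    {"ns": "shop.users", "op": "query"},
]

worst = top_write_conflicts(sample_entries, limit=2)
```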

1 vote

0 comments · Monitoring and Metrics

Forward MongoDB Atlas logs to Securonix

This is a feature request to integrate the forwarding of MongoDB Atlas logs to Securonix.

1 vote

0 comments · Monitoring and Metrics

Ability to "Mass Kill" slow running queries

Currently, Atlas has a "Kill Op" option which is useful to kill single long-running queries.

When upgrading to MongoDB 7.0, we faced a situation where the Slot-Based Query Engine (SBE) was causing thousands of queries to execute slowly. We wanted to kill them all, but there were more than a human could handle by clicking "Kill Op" one by one. A "Mass Kill" feature that kills queries running longer than X seconds (with X configurable) would have helped us greatly in an outage scenario. We ultimately rebooted our cluster to kill the queries, then manually implemented a script that did this from the MongoDB console. We would need the ability to do this on both primary and secondary nodes.

Related requests:
- https://feedback.mongodb.com/forums/924145-atlas/suggestions/43772352-killallsessionsbypattern-and-kill-sessions
- https://feedback.mongodb.com/forums/924145-atlas/suggestions/42420421-allow-db-killop-and-manual-restarts-on-secondari
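
The selection step of such a "Mass Kill" could look roughly like the sketch below. It is not the script mentioned above; it assumes a list shaped like the `inprog` array returned by `currentOp`, and the function name, the default operation types, and the sample data are hypothetical. Each returned `opid` would then have to be passed to `killOp` separately.

```python
def select_ops_to_kill(inprog, max_secs, op_types=("query", "getmore")):
    """Return the opids of active operations running longer than max_secs.

    Only operation types listed in `op_types` are considered, so that
    writes and internal operations are not selected by accident.
    """
    return [
        op["opid"]
        for op in inprog
        if op.get("active")
        and op.get("op") in op_types
        and op.get("secs_running", 0) > max_secs
    ]


# Hypothetical currentOp-style entries.
sample_inprog = [
    {"opid": 101, "active": True, "op": "query", "secs_running": 120},
    {"opid": 102, "active": True, "op": "query", "secs_running": 3},
    {"opid": 103, "active": True, "op": "update", "secs_running": 300},
    {"opid": 104, "active": False, "op": "query", "secs_running": 0},
    {"opid": 105, "active": True, "op": "getmore", "secs_running": 45},
]

to_kill = select_ops_to_kill(sample_inprog, max_secs=30)
```

Keeping the threshold and the operation types as parameters mirrors the "X is configurable" part of the request.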

7 votes

0 comments · Monitoring and Metrics

Properly Formatted Prometheus Integration (mongodb_ metrics)

The (mongodb*) metrics collected by the integration lack a metric type, and their descriptions are extremely vague (mongodbcatalogStats_views has "catalogStats." as its description). It would be easier to set up dashboards and queries if the type (e.g., gauge, counter) were properly set and each metric had a meaningful description.
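
For reference, the Prometheus text exposition format carries this information in `# HELP` and `# TYPE` comment lines placed before the samples. The sketch below renders one metric that way; the function name, the metric name `example_catalog_views`, its description, and the label are hypothetical, and label values are not escaped to keep the example short.

```python
VALID_TYPES = {"counter", "gauge", "histogram", "summary", "untyped"}


def render_metric(name, help_text, metric_type, samples):
    """Render one metric in the Prometheus text exposition format.

    `samples` is a list of (labels dict, value) pairs.
    """
    if metric_type not in VALID_TYPES:
        raise ValueError(f"unknown metric type: {metric_type}")
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


rendered = render_metric(
    "example_catalog_views",
    "Number of views in the catalog.",
    "gauge",
    [({"cluster": "demo"}, 4)],
)
```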

1 vote

0 comments · Monitoring and Metrics

Prometheus database and collection metrics

We checked the Prometheus metrics provided by MongoDB Atlas and didn't find metrics for the following:
- Database size
- Collection storage size
- Records per collection
- Indexes per collection
- Index size

We would like to have these kinds of metrics to add to dashboards.

23 votes

3 comments · Monitoring and Metrics

Atlas metrics granularity after 48 hours

For metrics older than 48 hours, the data is presented in 1-hour intervals. This level of granularity is often too coarse for a thorough examination of past events and trends. Such a broad view can obscure smaller yet significant details critical for understanding and resolving performance issues that occurred in the past.

Suggested Improvement:

Provide a smaller granularity for historical metrics beyond the 48-hour timeframe. Providing data in smaller intervals would greatly enhance our ability to conduct in-depth analyses and diagnose past performance issues accurately. This would be particularly beneficial for detailed investigations of historical data and for identifying subtle performance trends.
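
A small worked example shows why 1-hour intervals can hide short events. It assumes, purely for illustration, that coarser intervals are produced by averaging finer samples, which is not stated above; the function name and the readings are hypothetical.

```python
def downsample_mean(samples, bucket_size):
    """Average consecutive groups of `bucket_size` samples."""
    buckets = [
        samples[i:i + bucket_size]
        for i in range(0, len(samples), bucket_size)
    ]
    return [sum(bucket) / len(bucket) for bucket in buckets]


# One hour of hypothetical per-minute CPU readings:
# a 20% baseline with a 5-minute spike to 95%.
per_minute = [20.0] * 30 + [95.0] * 5 + [20.0] * 25

hourly = downsample_mean(per_minute, 60)       # a single 1-hour point
five_minute = downsample_mean(per_minute, 5)   # twelve 5-minute points
```

Under that assumption the hourly point comes out at 26.25%, while the 5-minute series still contains a point at 95%, so the spike remains visible only at the finer granularity.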

4 votes

0 comments · Monitoring and Metrics

Add document count to Datadog metrics

We'd like to monitor the number of documents in a collection via Datadog.

For on-premises MongoDB, the stats are already reported via mongodb.collection.count, mongodb.collection.size and mongodb.collection.avgobjsize.

If the same metrics could be made available for Atlas (e.g. mongodb.atlas.stats.collection.count), that would really help in monitoring.

For example, spikes in different parts of the application could be tied to the number of documents at a glance. Without that metric, it is hard to pinpoint whether a recent change had a negative impact on performance.

If the metrics can't be made available at the collection level but only at the database level, that would already be helpful as well.

8 votes

0 comments · Monitoring and Metrics

Change Streams Monitoring and Alerting

Change streams can cause performance issues if not used properly. In some cases, administrators of multi-tenant databases have no control (and shouldn't have) over how various clients create change streams.

I think it is important that we accommodate these use-cases and provide useful metrics in the OM/Atlas metrics pages, and alerts on those metrics. Some potential metrics:
1. Number of change streams open
2. Average change stream lifetime
3. Query targeting ratios for change streams
4. Avg time between consecutive polls of the change stream (and other statistics)
--thought here is that change streams that are polled infrequently will result in less performant reads against the oplog
5. Num docs read from change streams
6. Difference between timestamp of most recently consumed change stream and end of the oplog
7. Difference between timestamp of most recently consumed change stream and beginning of oplog

I realize that some of these are probably unrealistic to implement once the details are considered, but I'm interested in any useful metrics we can add regarding change streams. Currently the only way to retrieve some of this info is from the logs or via db.currentOp.
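
Metrics 6 and 7 in the list above are simple differences once the three timestamps are known. The sketch below shows the arithmetic only; it assumes the timestamps are already available as seconds since the epoch, and the function name, key names, and sample values are hypothetical.

```python
def change_stream_oplog_position(last_consumed_ts, oplog_first_ts, oplog_last_ts):
    """Return how far a change stream is from both ends of the oplog.

    All arguments are timestamps in seconds since the epoch.
    """
    return {
        # Metric 6: how far the consumer trails the newest oplog entry.
        "lag_behind_oplog_end_secs": oplog_last_ts - last_consumed_ts,
        # Metric 7: how much oplog history remains before the consumer's
        # position would no longer be covered by the oplog.
        "headroom_before_oplog_start_secs": last_consumed_ts - oplog_first_ts,
    }


# Hypothetical timestamps.
position = change_stream_oplog_position(
    last_consumed_ts=1_700_000_600,
    oplog_first_ts=1_700_000_000,
    oplog_last_ts=1_700_000_900,
)
```

An alert on the second value approaching zero would warn that a slow consumer is about to lose its place in the oplog.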

6 votes

1 comment · Monitoring and Metrics

View cluster/storage scalable (yes/no) in "All cluster" dashboard

Right now, the "all cluster" view does not show whether clusters have auto-scaling enabled. My idea is to add this information, to help avoid last-minute performance alerts for clusters that do not have scaling enabled.

1 vote

0 comments · Monitoring and Metrics

Collection query audit logs

It would be nice to be able to see what queries were run on which collection over the last x days.

1 vote

0 comments · Monitoring and Metrics

MongoDB Atlas historical stats by collection

It would be helpful to have historical metrics for storage by collection: storage space used, index space used, and number of indexes.
Thanks

1 vote

0 comments · Monitoring and Metrics

Metric reporting private endpoint state

On the MongoDB Atlas platform, we are able to see the status of both the Atlas Private Endpoint and the Azure Private Endpoint. It would be helpful to have these statuses available as a metric in the Prometheus integration.

4 votes

0 comments · Monitoring and Metrics

Improve Metric Correlation

It would be nice to better correlate metrics, server events, and individual operations.

For example, it would be helpful to have an indication in the profiler of automatic cluster scale-up and scale-down events, making it easy to correlate them with the actions that triggered them in the last 24 hours.

1 vote

0 comments · Monitoring and Metrics

Metric Grouping

A huge improvement for metrics would be the ability to query by grouping (e.g. by database access user). That way, if a specific database user is used for each service connection, we could see how much load each service is putting on the database.
Any form of implementation would be helpful. One example would be adding labels to the Prometheus metrics per user, replica/shard, etc.
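
To show what per-user labels would enable, the sketch below sums labelled samples by one label. It assumes samples are available as (labels, value) pairs; the function name, the `user` label, and the sample values are hypothetical and do not reflect labels the integration currently exposes.

```python
from collections import defaultdict


def sum_by_label(samples, label):
    """Sum sample values grouped by the value of one label.

    Samples that lack the label are collected under "unknown".
    """
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "unknown")] += value
    return dict(totals)


# Hypothetical operation counts labelled by database user.
sample_ops = [
    ({"user": "svc-orders"}, 120.0),
    ({"user": "svc-search"}, 30.0),
    ({"user": "svc-orders"}, 80.0),
    ({}, 5.0),
]

load_per_user = sum_by_label(sample_ops, "user")
```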

1 vote

0 comments · Monitoring and Metrics

Throttle instances within a shared tier

Right now, if a cluster is in a shared tier, other "noisy neighbor" tenants can destroy performance by overusing resources, forcing restarts, causing unstable behavior, and so on.

Throttle noisy neighbors so that others don't suffer!

1 vote

0 comments · Monitoring and Metrics

Dump audit logs to log analytics workspace

Why can't we dump audit logs to a Log Analytics workspace when we are using Azure private endpoints to connect to Atlas?

1 vote

0 comments · Monitoring and Metrics

Investigation Buckets (dropdown menu) to enable predefined set of MongoDB & Hardware Metrics to investigate certain Data Layer SLAs

We have many metrics under "MongoDB Metrics" & "Hardware Metrics". The user needs to have a good idea of which metrics to enable while troubleshooting a particular issue.

For example, a user investigating current lag in the data layer has to manually select all the metrics that would give them the data they are looking for.

We can help the customer a bit more by providing buckets such as:
Options (select one)
1. Investigate Lags
2. Investigate IOPS
3. Investigate Replication
4. Investigate Search Index
... etc.

This would pre-select certain metrics from the list of MongoDB and hardware metrics, giving users all the information they need to investigate. It would act like a set of best practices for monitoring.

1 vote

0 comments · Monitoring and Metrics

Prometheus integration to use PrivateLink

It is possible to integrate Prometheus into an Atlas project.
However, enabling this integration requires adding Prometheus's IP address to the IP Access List.
This procedure has 2 flaws:
1. In some use cases Prometheus runs as pods, meaning that its IP is ephemeral.
2. Projects that work solely with PrivateLink enabled, with no open IP in the IP Access List, cannot use the Prometheus integration (we have already talked with support about that).

The improvement here is to make the Prometheus integration work in "PrivateLink-only" mode as well.

80 votes

12 comments · Monitoring and Metrics
