Atlas

How can we improve the platform?

Integrate the Atlas Replication lag alert into the Prometheus API

We request that metrics be added to the Prometheus API so that a Replication lag alarm can be implemented.

6 votes

0 comments · Monitoring and Metrics

Ability to "Mass Kill" slow running queries

Currently, Atlas has a "Kill Op" option which is useful to kill single long-running queries.

When upgrading to MongoDB 7.0, we faced a situation where the Slot-Based Query Engine (SBE) caused thousands of queries to execute slowly. We wanted to kill them all, but that was more than a human could do by clicking "Kill Op" one by one. A "Mass Kill" feature that kills queries running longer than X seconds (with X configurable) would have helped us greatly in an outage scenario. We ultimately rebooted our cluster to kill the queries, then manually implemented a script that did this from the MongoDB console. We would need the ability to do this on both primary and secondary nodes.

Related requests:
- https://feedback.mongodb.com/forums/924145-atlas/suggestions/43772352-killallsessionsbypattern-and-kill-sessions
- https://feedback.mongodb.com/forums/924145-atlas/suggestions/42420421-allow-db-killop-and-manual-restarts-on-secondari
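
The script mentioned above is not included in this post. As a rough illustration only, a minimal Python sketch of the same idea could look like the following. It assumes PyMongo, a direct connection to a single node, and a database user allowed to run the `currentOp` and `killOp` commands. The helper names, the 30-second default, and the choice of operation types are hypothetical; `active`, `op`, `secs_running`, and `opid` are fields of the `currentOp` output.

```python
from typing import Any, Dict, Iterable, List

# Assumption: only these currentOp "op" types count as user queries.
QUERY_OP_TYPES = ("query", "getmore")


def select_slow_ops(ops: Iterable[Dict[str, Any]], max_secs: int) -> List[Any]:
    """Return the opids of active query operations running longer than max_secs."""
    slow = []
    for op in ops:
        # Skip idle entries and operation types we do not want to touch.
        if not op.get("active") or op.get("op") not in QUERY_OP_TYPES:
            continue
        if op.get("secs_running", 0) > max_secs:
            slow.append(op["opid"])
    return slow


def mass_kill(client: Any, max_secs: int = 30) -> List[Any]:
    """Kill every slow query on the node that `client` is connected to.

    `client` is expected to behave like a pymongo.MongoClient. Run it once
    per node (primary and each secondary) to cover the whole cluster.
    """
    in_progress = client.admin.command("currentOp")["inprog"]
    opids = select_slow_ops(in_progress, max_secs)
    for opid in opids:
        client.admin.command("killOp", 1, op=opid)
    return opids
```

Keeping the selection in a separate function makes it possible to review the list of opids before anything is actually killed.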

7 votes

0 comments · Monitoring and Metrics

Atlas metrics granularity after 48 hours

For metrics older than 48 hours, the data is presented in 1-hour intervals. This level of granularity is often too coarse for a thorough examination of past events and trends. Such a broad view can obscure smaller yet significant details critical for understanding and resolving performance issues that occurred in the past.

Suggested Improvement:

Offer a finer granularity for historical metrics beyond the 48-hour window. Providing data in smaller intervals would greatly improve our ability to analyze and accurately diagnose past performance issues, and would be particularly useful for detailed investigations of historical data and for identifying subtle performance trends.
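
To illustrate why hourly points can hide an incident, here is a small Python sketch with made-up numbers (not Atlas data): a 5-minute CPU spike to 95% inside an hour that otherwise sits at 10%. It assumes the hourly point is a plain average of one-minute samples, which may not match how Atlas actually aggregates.

```python
from statistics import mean

# Hypothetical one-minute CPU samples for one hour: 55 quiet minutes
# at 10% followed by a 5-minute spike at 95%.
samples = [10.0] * 55 + [95.0] * 5


def downsample(values, bucket_size):
    """Average consecutive groups of bucket_size samples."""
    return [mean(values[i:i + bucket_size])
            for i in range(0, len(values), bucket_size)]


hourly = downsample(samples, 60)      # a single point for the whole hour
five_minute = downsample(samples, 5)  # twelve points for the same hour

print(round(hourly[0], 1), max(five_minute))  # prints: 17.1 95.0
```

With a 1-hour interval the spike shows up as roughly 17% CPU, while a 5-minute interval still shows the 95% peak.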

4 votes

0 comments · Monitoring and Metrics

Prometheus database and collection metrics

We checked the Prometheus metrics provided by MongoDB Atlas and didn't find metrics for the following:
- Database size
- Collection storage size
- Records per collection
- Indexes per collection
- Index size

We would like to have these metrics so that we can add them to dashboards.

20 votes

3 comments · Monitoring and Metrics

Add document count to Datadog metrics

We'd like to monitor the number of documents in a collection via Datadog.

For on-premises MongoDB, these stats are already reported via mongodb.collection.count, mongodb.collection.size and mongodb.collection.avgobjsize.

If the same metrics could be made available for Atlas (e.g. mongodb.atlas.stats.collection.count), that would really help with monitoring.

For example, spikes in different parts of the application could be tied to the number of documents at a glance. Without that metric, it is hard to pinpoint whether a recent change had a negative impact on performance.

If the metrics can't be made available at the collection level but only at the database level, that would already be helpful as well.

8 votes

0 comments · Monitoring and Metrics

View cluster/storage scalable (yes/no) in "All cluster" dashboard

Right now, the "All Clusters" view does not show whether clusters have auto-scaling enabled. My idea is to add this information, to avoid last-minute performance alerts for clusters that do not have auto-scaling enabled.

1 vote

0 comments · Monitoring and Metrics

Collection query audit logs

It would be nice to be able to see what queries were run on which collection over the last x days.

1 vote

0 comments · Monitoring and Metrics

MongoDB Atlas historical stats by collection

It would be helpful to have historical storage metrics by collection: storage space used, index space used, and number of indexes.
Thanks

1 vote

0 comments · Monitoring and Metrics

Metric reporting private endpoint state

On the MongoDB Atlas platform, we can see the status of both the Atlas Private Endpoint and the Azure Private Endpoint. It would be helpful to have these statuses available as a metric in the Prometheus integration.

4 votes

0 comments · Monitoring and Metrics

Change Streams Monitoring and Alerting

Change streams can cause performance issues if not used properly. In some cases, administrators of multi-tenant databases have no control (and shouldn't) over how various clients create change streams.

I think it is important that we accommodate these use-cases and provide useful metrics in the OM/Atlas metrics pages, and alerts on those metrics. Some potential metrics:
1. Number of change streams open
2. Average change stream lifetime
3. Query targeting ratios for change streams
4. Avg time between consecutive polls of the change stream (and other statistics)
   (The thought here is that change streams that are polled infrequently will result in less performant reads against the oplog.)
5. Num docs read from change streams
6. Difference between timestamp of most recently consumed change stream and end of the oplog
7. Difference between timestamp of most recently consumed change stream and beginning of oplog

I realize that some of these are probably unrealistic to implement once the details are considered, but I'm interested in any useful metrics we can add regarding change streams. Currently the only way to retrieve some of this information is from the logs or via db.currentOp.
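
As a sketch of what metrics 6 and 7 would measure, the Python snippet below computes both differences from three timestamps. All names are hypothetical, and the timestamps are assumed to be whole seconds since the epoch (the time part of an oplog timestamp); how a monitoring agent would obtain them is not covered here.

```python
from dataclasses import dataclass


@dataclass
class ChangeStreamPosition:
    seconds_behind_newest: int    # metric 6: distance to the end of the oplog
    seconds_ahead_of_oldest: int  # metric 7: distance to the beginning of the oplog


def change_stream_position(oplog_first_ts: int,
                           oplog_last_ts: int,
                           last_consumed_ts: int) -> ChangeStreamPosition:
    """Locate the most recently consumed event within the oplog window."""
    return ChangeStreamPosition(
        seconds_behind_newest=oplog_last_ts - last_consumed_ts,
        seconds_ahead_of_oldest=last_consumed_ts - oplog_first_ts,
    )


# Example with made-up timestamps.
pos = change_stream_position(oplog_first_ts=1000,
                             oplog_last_ts=5000,
                             last_consumed_ts=4400)
print(pos.seconds_behind_newest, pos.seconds_ahead_of_oldest)  # prints: 600 3400
```

A small `seconds_ahead_of_oldest` would be the more urgent alert, since a change stream whose resume point has rolled off the oplog can no longer be resumed.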

5 votes

1 comment · Monitoring and Metrics

Improve Metric Correlation

It would be nice to be able to better correlate metrics, server events, and individual operations.

For example, it would be helpful if the profiler indicated automatic scale-up/scale-down events of the cluster, so that the actions that triggered them in the last 24 hours can be correlated easily.

1 vote

0 comments · Monitoring and Metrics

Metric Grouping

A huge improvement for metrics would be the ability to query by grouping (e.g. by database access user). That way, if you use a dedicated database user for each service connection, you could see how much load each service puts on the database.
Any form of implementation would be helpful; one example could be adding labels to the Prometheus metrics per user, replica/shard, etc.
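
To make the label idea concrete, here is a small Python sketch that renders one sample in the Prometheus text exposition format with per-user and per-shard labels. The metric name and the label names (`db_user`, `shard`) are hypothetical and are not part of the current Atlas integration.

```python
def _escape(value) -> str:
    """Escape a label value for the Prometheus text exposition format."""
    return (str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n"))


def format_sample(name: str, labels: dict, value) -> str:
    """Render a single sample as: name{label="value",...} value"""
    rendered = ",".join(f'{key}="{_escape(val)}"' for key, val in labels.items())
    return f"{name}{{{rendered}}} {value}"


line = format_sample(
    "mongodb_opcounters_query",  # hypothetical metric name
    {"db_user": "billing-service", "shard": "shard-0"},
    42,
)
print(line)
# prints: mongodb_opcounters_query{db_user="billing-service",shard="shard-0"} 42
```

With labels like these, load per service could be obtained in Prometheus by aggregating over the `db_user` label.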

1 vote

0 comments · Monitoring and Metrics

Throttle instances within a shared tier

Right now, if a cluster is in a shared tier, other "noisy neighbor" tenants can destroy performance by overusing resources, forcing a restart, causing unstable behavior, etc.

Throttle noisy neighbors so that others don't suffer!
- Screenshot from 2024-01-08 18-55-33.png 40 KB
1 vote

0 comments · Monitoring and Metrics

Dump audit logs to log analytics workspace

Why can't we send audit logs to a Log Analytics workspace when we are using Azure private endpoints to connect to Atlas?

1 vote

0 comments · Monitoring and Metrics

Investigation Buckets (dropdown menu) to enable predefined set of MongoDB & Hardware Metrics to investigate certain Data Layer SLAs

We have many metrics under "MongoDB Metrics" and "Hardware Metrics". A user needs to know which metrics to enable when troubleshooting a particular issue.

For example, a user investigating current lag in the data layer has to select, one by one, all the metrics that would give them the data they are looking for.

We can help the customer a bit more by providing buckets such as:
Options (select one)
1. Investigate Lags
2. Investigate IOPS
3. Investigate Replication
4. Investigate Search Index
... etc.

Choosing a bucket would pre-select the relevant metrics from the list of MongoDB and Hardware metrics, giving users all the information they need to investigate. It would work like a set of monitoring best practices.

1 vote

0 comments · Monitoring and Metrics

Monitoring Metrics on dhandle

We'd like to monitor the WiredTiger dhandle count over time, directly from the Atlas monitoring view. This would let us see the impact of cluster setting changes directly.

We'd also like to be able to configure alert triggers on it. Our goal is to be alerted when an excessive number of files (collections and indexes) is loaded into MongoDB memory, so that we avoid hitting an out-of-memory error.
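
Until such a chart exists, the value can be polled manually. The Python sketch below reads the active data handle count from a `serverStatus` result and compares it with a threshold. It assumes the counter is exposed under `wiredTiger` → `data-handle` → `connection data handles currently active`, which should be verified against your server version; the function names and the threshold are hypothetical.

```python
from typing import Any, Dict

# Assumed location of the counter inside the serverStatus document.
DHANDLE_KEY = "connection data handles currently active"


def active_dhandles(server_status: Dict[str, Any]) -> int:
    """Return the number of active WiredTiger data handles, or 0 if absent."""
    section = server_status.get("wiredTiger", {}).get("data-handle", {})
    return int(section.get(DHANDLE_KEY, 0))


def dhandle_alert(server_status: Dict[str, Any], threshold: int) -> bool:
    """True when the active data handle count exceeds the threshold."""
    return active_dhandles(server_status) > threshold


# Made-up sample in the assumed shape; with PyMongo the real document
# would come from client.admin.command("serverStatus").
sample = {"wiredTiger": {"data-handle": {DHANDLE_KEY: 12500}}}
print(active_dhandles(sample), dhandle_alert(sample, 10000))  # prints: 12500 True
```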

2 votes

0 comments · Monitoring and Metrics

Prometheus integration to use PrivateLink

It is possible to integrate Prometheus into an Atlas project.
However, to enable this integration, one needs to add Prometheus's IP address to the IP Access List.
This procedure has two flaws:
1. In some use cases Prometheus runs as pods, meaning that its IP is ephemeral.
2. For projects that work solely with PrivateLink enabled and have no open IP in the IP Access List, the Prometheus integration cannot be used (we have already discussed this with support).

The improvement requested here is for the Prometheus integration to also work in "PrivateLink-only" mode.

74 votes

12 comments · Monitoring and Metrics

Export only selected Atlas metrics

This is a request from a customer, UPAX, partnered with Grupo Salinas.
In the cluster metrics view, where the Status is displayed, there is an Export option (PDF, PNG). The customer would like to export only specific metrics (only the toggled charts) rather than all of them, for reporting to other teams and management.
- America_Mexico_City (1).pdf 452 KB
1 vote

0 comments · Monitoring and Metrics

Duplication detection

One of our customers would like to see duplication detection, as their work is spread out over various projects.

Value: This is important because different regions sometimes create duplicates, and these could be avoided in advance.

1 vote

0 comments · Monitoring and Metrics

Metrics: Fix Y axis of "max CPU %" charts to 100%

In the various CPU % usage charts in the shard metrics, it looks like the upper limit of the Y axis isn't fixed, and it changes based on what the maximum value in the range of time being displayed is. For both normalized and regular CPU% values, there's a natural maximum to the value, I think: 100% for the normalized CPU%, and ncores*100% for the non-normalized CPU%.

With a changeable Y limit, it's hard to tell at a glance whether the value is actually really high or not. Oh, the blue line is hovering up near the top of the chart. Is that using a lot of CPU? Dunno; have to glance over at the Y axis labels and read the little scale numbers - oh, no, it isn't; this particular chart only goes up to 20% CPU on the Y axis at the moment. But if I come back later and refresh, maybe the scale will change to 100% and a blue line near the top of the chart does mean a lot of CPU.

Plus, if you view both the plain and max versions of the CPU metric on the same page (e.g. Normalized System CPU and Max Normalized CPU), those two charts may not end up with the same Y axis scale, so you can't easily compare them visually. For example, in the attached screenshot, Normalized CPU was scaled to about 25% on the Y axis and Max Normalized CPU to about 50%.

And if you view multiple screenshots of these charts taken at different times, they're hard to compare visually, because they may end up on different scales.

I think it would be nice if the Normalized CPU % charts were always displayed with a fixed maximum Y axis value of 100%, regardless of the values actually graphed there, and the non-normalized CPU % charts displayed with a fixed maximum Y axis value of ncores*100%. (Where ncores is probably the maximum number of cores that shard had at any time during the displayed time period, so you don't end up with values off the chart in the case where the shard has been downsized and previously had high CPU usage.)
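
The proposed rule is simple enough to state as code. This Python sketch is only an illustration of the suggestion above, not how the Atlas charts are implemented; the function name and parameters are hypothetical.

```python
from typing import Iterable


def y_axis_max(normalized: bool, core_counts: Iterable[int] = ()) -> int:
    """Fixed upper Y limit (in percent) for a CPU chart.

    normalized:  True for the Normalized CPU % charts.
    core_counts: core counts the shard had during the displayed period
                 (only needed for the non-normalized charts).
    """
    if normalized:
        return 100
    counts = list(core_counts)
    if not counts:
        raise ValueError("core_counts is required for non-normalized charts")
    # Use the largest core count seen so that old high values stay on the chart.
    return max(counts) * 100


print(y_axis_max(True))              # prints: 100
print(y_axis_max(False, [4, 8, 4]))  # prints: 800
```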

- Atlas metrics - CPU pct normalized not scaled to 100%.png 407 KB
- Atlas metrics - CPU pct not scaled to N00%.png 368 KB
1 vote

0 comments · Monitoring and Metrics
