Atlas

Share your idea. In order to help prioritize, please include the following information

A brief description of what you are looking to do
How you think this will help
Why this matters to you

How can we improve the platform?

Disk throughput in monitoring

Currently, monitoring shows disk IOPS (both read and write), but not disk throughput.
Disk bandwidth is one of the metrics that helps decide whether to use a provisioned disk, at least with AWS hosting.
For instance, with a large enough disk, such as 2000 GB, an unprovisioned (gp2) disk tops out at 250 MB/s (its maximum), whereas a provisioned (io1) disk of that size could reach 500 MB/s.
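
To illustrate why IOPS alone does not reveal bandwidth, here is a minimal sketch that estimates throughput from an IOPS figure and an average I/O size. The function name and the sample numbers are assumptions for illustration only; they are not Atlas metrics or AWS limits.

```python
def estimated_throughput_mib_per_s(iops, avg_io_size_kib):
    """Estimate throughput as IOPS multiplied by the average I/O size."""
    return iops * avg_io_size_kib / 1024


# Hypothetical workload: the same IOPS figure gives very different bandwidth
# depending on how large each I/O is.
small_io = estimated_throughput_mib_per_s(4000, 16)
large_io = estimated_throughput_mib_per_s(4000, 64)
print(small_io, large_io)
```

With the same IOPS, the larger I/O size moves four times as much data, which is why a separate throughput chart would be useful.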

3 votes

0 comments · Monitoring and Metrics

More detailed update status

It would be extremely helpful to see more detailed information on recovering nodes. Knowing that a node is, for example, "81% of the way through initial sync" is far more informative than seeing it in "Startup2" recovery, and it lets users know that the node isn't stuck.
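
As a rough sketch of how such a percentage could be derived, the snippet below works on a dictionary shaped like the initialSyncStatus section of replSetGetStatus. The field names approxTotalDataSize and approxTotalBytesCopied are assumed to be present (they depend on the server version), and the sample values are made up.

```python
def initial_sync_percent(initial_sync_status):
    """Return approximate initial sync progress in percent, or None if unknown."""
    # Assumption: both fields are reported in bytes by the server.
    total = initial_sync_status.get("approxTotalDataSize", 0)
    copied = initial_sync_status.get("approxTotalBytesCopied", 0)
    if total <= 0:
        return None
    return min(100.0, 100.0 * copied / total)


# Hypothetical snapshot of a node that is partway through initial sync.
sample = {"approxTotalDataSize": 1000, "approxTotalBytesCopied": 810}
print(initial_sync_percent(sample))
```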

3 votes

1 comment · Monitoring and Metrics

Publish statistics in Atlas to analyze what is filling oplog

It would be very useful to be able to see metrics/statistics about the contents of oplog. There are open-source tools like oplog analyzer (https://github.com/mhelmstetter/oplog-analyzer) that can be used, but it's a hassle to have to install it and run it in the same datacenter where the database is running (for performance).

The statistics I'm most interested in are which collections have the most oplog documents, what kinds of operations they were, and the total size each collection currently takes up in the oplog. This would help us improve our code to use less oplog.

We've seen cases where bad code always reads the whole document, adds something to, e.g., an array, and then saves the whole document again. This generates huge amounts of oplog and puts constant pressure on us to make the oplog bigger, which makes the whole DB size bigger.
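
The kind of summary described above can be sketched as follows. This is only an illustration over a hand-written sample shaped like oplog entries (fields ns, op and o); it does not connect to a database, the sample data is hypothetical, and JSON length is used as a stand-in for the real BSON size.

```python
import json
from collections import defaultdict

# Hypothetical sample: "ns" is the namespace, "op" the operation type
# ("i" insert, "u" update, "d" delete) and "o" the operation payload.
SAMPLE_OPLOG = [
    {"ns": "shop.orders", "op": "i", "o": {"_id": 1, "items": ["a"]}},
    {"ns": "shop.orders", "op": "u", "o": {"$set": {"items": ["a", "b"]}}},
    {"ns": "shop.users", "op": "i", "o": {"_id": 7, "name": "x"}},
    {"ns": "shop.orders", "op": "d", "o": {"_id": 1}},
]


def summarize_oplog(entries):
    """Group entries by (namespace, op) and total an approximate payload size."""
    summary = defaultdict(lambda: {"count": 0, "approx_bytes": 0})
    for entry in entries:
        key = (entry["ns"], entry["op"])
        summary[key]["count"] += 1
        # Assumption: JSON length approximates the BSON size of the payload.
        summary[key]["approx_bytes"] += len(json.dumps(entry["o"]))
    return dict(summary)


def top_namespaces(summary):
    """Return (namespace, approx_bytes) pairs, largest first."""
    totals = defaultdict(int)
    for (ns, _op), stats in summary.items():
        totals[ns] += stats["approx_bytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for ns, size in top_namespaces(summarize_oplog(SAMPLE_OPLOG)):
    print(ns, size)
```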

3 votes

2 comments · Monitoring and Metrics

The page fault metric is not available in the Datadog integration

3 votes

0 comments · Monitoring and Metrics

Adding balancer activity from sharding Statistics into Atlas UI
Customers often experience performance issues in different clusters related to chunk migrations, and in each case they struggle to determine whether chunk migrations were occurring and whether they started at the same time as the performance issue. The observability available in the Atlas UI to identify this problem is limited.

Using db.serverStatus().shardingStatistics on each shard can provide what the customers need.

https://www.mongodb.com/docs/manual/reference/command/serverStatus/#mongodb-serverstatus-serverstatus.shardingStatistics

In particular, the key metrics below provide good insight into balancer activity:

For donor:

db.serverStatus().shardingStatistics.countDonorMoveChunkStarted

: The total number of times that MongoDB starts the moveChunk command or moveRange command on the primary node of the shard as part of the range migration procedure. This increasing number does not consider whether the chunk migrations succeed or not.

db.serverStatus().shardingStatistics.countDocsClonedOnDonor

: The cumulative, always-increasing count of documents that MongoDB clones on the primary node of the donor shard.

db.serverStatus().shardingStatistics.totalDonorMoveChunkTimeMillis

: Cumulative time in milliseconds to move chunks from the current shard to another shard. For each chunk migration, the time starts when a moveRange or moveChunk command starts, and ends when the chunk is moved to another shard in a range migration procedure.

db.serverStatus().shardingStatistics.countDocsDeletedByRangeDeleter

: The cumulative, always-increasing count of documents that MongoDB deletes on the primary node of the donor shard during chunk migration.

db.serverStatus().shardingStatistics.rangeDeleterTasks

: The current total of the queued chunk range deletion tasks that are ready to run or are running as part of the range migration procedure.

For Recipient:

db.serverStatus().shardingStatistics.countRecipientMoveChunkStarted

: Cumulative, always-increasing count of chunks this member, acting as the primary of the recipient shard, has started to receive (whether the move has succeeded or not).

db.serverStatus().shardingStatistics.countDocsClonedOnRecipient

: The cumulative, always-increasing count of documents that MongoDB clones on the primary node of the recipient shard.
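
Because most of the counters above are cumulative, what a chart would need is the change between two samples. A minimal sketch, assuming two dictionaries shaped like db.serverStatus().shardingStatistics taken some interval apart (the sample values and function names are hypothetical):

```python
# Cumulative counters named in the list above.
MIGRATION_COUNTERS = [
    "countDonorMoveChunkStarted",
    "countDocsClonedOnDonor",
    "totalDonorMoveChunkTimeMillis",
    "countDocsDeletedByRangeDeleter",
    "countRecipientMoveChunkStarted",
    "countDocsClonedOnRecipient",
]


def migration_deltas(previous, current):
    """Return the per-counter increase between two snapshots."""
    return {
        name: current.get(name, 0) - previous.get(name, 0)
        for name in MIGRATION_COUNTERS
    }


def migrations_active(previous, current):
    """True if any counter grew or range deletion tasks are still queued."""
    deltas = migration_deltas(previous, current)
    # rangeDeleterTasks is a current total, not a cumulative counter.
    return any(d > 0 for d in deltas.values()) or current.get("rangeDeleterTasks", 0) > 0


# Hypothetical snapshots taken one minute apart on a donor shard.
before = {"countDonorMoveChunkStarted": 10, "countDocsClonedOnDonor": 500}
after = {"countDonorMoveChunkStarted": 12, "countDocsClonedOnDonor": 900, "rangeDeleterTasks": 0}
print(migration_deltas(before, after), migrations_active(before, after))
```

Note that the counters restart from zero when the process restarts, so a negative delta would need to be treated as a reset rather than as activity.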

2 votes

0 comments · Monitoring and Metrics

Monitoring Metrics on dhandle

We'd like to monitor the WiredTiger dhandle count over time, directly from the Atlas monitoring view. This would let us see the impact of cluster setting changes directly.

We'd also like to be able to configure alert triggers on it. Our goal is to be alerted when an excessive number of files (collections and indexes) is loaded into MongoDB memory, so that we avoid reaching an Out Of Memory error.
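
Until such a chart exists, a check along these lines could be run against serverStatus output. This is a sketch only: it reads the "connection data handles currently active" value from the wiredTiger data-handle section, and the threshold and sample document are hypothetical.

```python
def active_dhandles(server_status):
    """Read the number of currently active WiredTiger data handles."""
    return server_status["wiredTiger"]["data-handle"][
        "connection data handles currently active"
    ]


def dhandle_alert(server_status, threshold):
    """True when the active dhandle count exceeds the chosen threshold."""
    return active_dhandles(server_status) > threshold


# Hypothetical serverStatus excerpt and threshold.
sample_status = {
    "wiredTiger": {"data-handle": {"connection data handles currently active": 12500}}
}
print(dhandle_alert(sample_status, threshold=10000))
```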

2 votes

0 comments · Monitoring and Metrics

Use additional metadata to differentiate processes

Right now Ops Manager monitoring identifies MongoDB processes according to hostname:port. Unfortunately, if 2 processes have the same short hostname & port in the same Ops Manager project, they'll be treated the same even if they are actually different processes with different FQDN.

Please either allow the use of additional characteristics (FQDN, replica set name, config server name, etc) for differentiating MongoDB processes or provide some way to tag 2 or more processes so monitoring doesn't accidentally miscategorize them as the same process.
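
To make the collision concrete, here is a small sketch comparing a short hostname:port key with a richer key. The ProcessKey type, its fields and the host names are hypothetical and only illustrate the idea; they are not how Ops Manager stores processes.

```python
from typing import NamedTuple, Optional


class ProcessKey(NamedTuple):
    """Hypothetical identity that includes more than short hostname and port."""
    fqdn: str
    port: int
    replica_set: Optional[str] = None


def short_key(fqdn, port):
    """Identity based on short hostname and port only."""
    return f"{fqdn.split('.')[0]}:{port}"


a = ProcessKey("db1.east.example.com", 27017, "rsEast")
b = ProcessKey("db1.west.example.com", 27017, "rsWest")

# The short keys collide, while the richer keys stay distinct.
print(short_key(a.fqdn, a.port) == short_key(b.fqdn, b.port))
print(a == b)
```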

2 votes

0 comments · Monitoring and Metrics

Add support for HTTP based scraping without IP restriction

Prometheus supports a lot in the HTTP scraping space, including OAuth and Bearer token based scrape targets.

Currently we use Grafana Cloud, which has a list of IP addresses it can scrape from. This list "can" change, and if it does and we have hardcoded this access into Atlas based on IP, the scrape will break.

Could Mongo add support for token/OAuth based scraping, where we provide these tokens in an HTTPS call?

I think this would solve the issues with things like private link or peered network connections. It would also allow "any" Prometheus server that can make outbound HTTPS requests to scrape Atlas.
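
For illustration, a token-based scrape is just an HTTPS request carrying an Authorization header. The sketch below only builds such a request and does not send it; the URL and token are placeholders, and no such Atlas endpoint is implied to exist today.

```python
import urllib.request


def build_scrape_request(url, token):
    """Build a GET request that authenticates with a Bearer token."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Placeholder endpoint and token, for illustration only.
req = build_scrape_request("https://metrics.example.invalid/metrics", "my-token")
print(req.get_method(), req.get_header("Authorization"))
```

Because the credential travels with the request, the scraper's source IP would no longer need to be on an access list.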

2 votes

0 comments · Monitoring and Metrics

Allow switching between shards in the profiler and/or have a combined view

Since the profiler tabs are shard specific, it would radically improve the usability to:
a) Combine the profiler events to truly have a cluster level view so you don’t need to worry about shard specific views
b) Add a drop-down at the top of the page, near the title, that lists the various shards. Selecting a different shard brings you back to the same tab, but now with the changed view.
c) Consider tweaking the URL so that rather than using a SHA in the URL, you use the shard name, making it easy to manipulate the URL to change the shard view

2 votes

0 comments · Monitoring and Metrics

Zabbix

We should have Zabbix integration to monitor the performance of MongoDB Atlas.

Zabbix is a popular open-source monitoring tool, and many enterprise organizations use it.

Please add Zabbix to Integrate with Third-Party Monitoring Services.

2 votes

0 comments · Monitoring and Metrics

killAllSessionsByPattern and kill sessions

Please add killAllSessionsByPattern and kill sessions features to the Atlas UI.

2 votes

0 comments · Monitoring and Metrics

Metrics charts x-axis showing more than 24 hours of data should be labeled according to the scale

Currently, when the mouse is not hovering over a Metrics plot, the plot shows particular hours as labels on the x-axis. For example, if an 8-hour range is displayed, the time is labeled every two hours.

However, if more than 1 day is displayed, the x-axis labels are less useful. A particular time is displayed once per day, but the day is not included. For example, when I display a week of data, 07:00 is highlighted on each day, but it is not clear which day that is (unless I hover over the chart).

This is even worse if more than 1 week is selected -- in that case, two 07:00 times are displayed, with no indication of the date. It's impossible to gauge when a trend started or changed based on the labels alone.

It would be better if the date were included or shown instead as the label for plots that show a wider date range.
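
A rough sketch of the behavior being asked for is to pick the label format from the width of the displayed range. The cutoffs (24 hours and 7 days) and the formats below are assumptions chosen for illustration, not the chart's actual logic.

```python
from datetime import datetime, timedelta


def tick_label(tick, range_start, range_end):
    """Choose an x-axis label format based on how wide the displayed range is."""
    span = range_end - range_start
    if span <= timedelta(hours=24):
        return tick.strftime("%H:%M")        # time only
    if span <= timedelta(days=7):
        return tick.strftime("%m-%d %H:%M")  # date and time
    return tick.strftime("%m-%d")            # date only


start = datetime(2024, 3, 1)
tick = datetime(2024, 3, 5, 7, 0)
print(tick_label(tick, start, start + timedelta(hours=8)))
print(tick_label(tick, start, start + timedelta(days=7)))
print(tick_label(tick, start, start + timedelta(days=14)))
```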

2 votes

0 comments · Monitoring and Metrics

Show BI Connector resource use Metrics

It would be helpful to be able to view the resources used by the BI connector in Metrics, similar to how you can view resources used by Atlas Search (Search Disk Space Used, Search Normalized Process CPU, etc). This could help with identifying issues due to resource intensive queries submitted through the BI connector.

Currently it is only possible to guess that an issue could be related to the BI Connector by viewing system resource use.

2 votes

0 comments · Monitoring and Metrics

Bytes available for reuse metric

This metric would show the bytes available for reuse. If the value is large and its graph changes little over time, we can decide to compact in order to save cost (from disk usage).
Currently, we have to run the dbstat command to see the available bytes for reuse, and we cannot see its trend.
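
The decision described above could be sketched like this. The snippet assumes a dictionary shaped like collStats output, where WiredTiger reports "file bytes available for reuse" under block-manager; the 30% cutoff and the sample values are hypothetical.

```python
def reusable_fraction(coll_stats):
    """Fraction of the collection's storage that is available for reuse."""
    reusable = coll_stats["wiredTiger"]["block-manager"]["file bytes available for reuse"]
    storage = coll_stats["storageSize"]
    return reusable / storage if storage else 0.0


def should_consider_compact(coll_stats, min_fraction=0.3):
    """True when the reusable share reaches the chosen (hypothetical) cutoff."""
    return reusable_fraction(coll_stats) >= min_fraction


# Hypothetical collStats excerpt.
sample_stats = {
    "storageSize": 1000,
    "wiredTiger": {"block-manager": {"file bytes available for reuse": 400}},
}
print(reusable_fraction(sample_stats), should_consider_compact(sample_stats))
```

A single reading like this says nothing about the trend, which is the part a chart in Metrics would add.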

2 votes

0 comments · Monitoring and Metrics

Ability to block sending of Alert Close Notification

We have set up alerts and have noticed that when a condition returns to normal, another notification is generated to acknowledge that the alert has closed. Is there a way to manage this (for example, to disable or block it for some alerts)?

2 votes

0 comments · Monitoring and Metrics

High resolution monitoring and alerting for WT dirty cache ratio, eviction workloads and checkpoints.

Implement better monitoring and alerting for WT dirty cache ratio. This should include sub-minute resolution and support an understanding of eviction workload as well as performance impact of checkpoints under heavy write workloads.
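
For reference, the ratio in question can be computed from the wiredTiger.cache section of serverStatus. This is a minimal sketch over a sample dictionary; the sample values and the idea of polling it more often than once a minute are assumptions.

```python
def dirty_cache_ratio(server_status):
    """Dirty bytes in the WiredTiger cache as a fraction of the configured maximum."""
    cache = server_status["wiredTiger"]["cache"]
    return cache["tracked dirty bytes in the cache"] / cache["maximum bytes configured"]


# Hypothetical serverStatus excerpt.
sample_status = {
    "wiredTiger": {
        "cache": {
            "tracked dirty bytes in the cache": 50,
            "maximum bytes configured": 1000,
        }
    }
}
print(dirty_cache_ratio(sample_status))
```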

2 votes

0 comments · Monitoring and Metrics

Support CPU/Disk metrics for free monitoring on Windows

2 votes

0 comments · Monitoring and Metrics

Monitoring Integration with Azure Event Grid

2 votes

0 comments · Monitoring and Metrics

Metrics

We would like to be able to freeze the S S P row (the top row) while scrolling down through the different charts.

2 votes

0 comments · Monitoring and Metrics

Add support for replication lag and replication headroom metrics in Datadog

The metrics replset.replicationheadroom and replset.replicationlag would be useful to have exposed, to help identify network limitations and/or oplogs that are too small.
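
As a stopgap, replication lag can be approximated from the members array of replSetGetStatus by comparing each secondary's optimeDate with the primary's. The sketch below runs on a hand-written sample; the member names and times are hypothetical, and it does not compute replication headroom.

```python
from datetime import datetime


def replication_lag_seconds(members):
    """Return {secondary name: seconds behind the primary's last applied op}."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical excerpt of replSetGetStatus().members.
sample_members = [
    {"name": "a:27017", "stateStr": "PRIMARY", "optimeDate": datetime(2024, 1, 1, 12, 0, 10)},
    {"name": "b:27017", "stateStr": "SECONDARY", "optimeDate": datetime(2024, 1, 1, 12, 0, 4)},
]
print(replication_lag_seconds(sample_members))
```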

2 votes

0 comments · Monitoring and Metrics
