Atlas
-
Manually replace / restart virtual machine
Allow users to replace or restart the underlying virtual machine of a node.
Sometimes this is all that is needed to get a cluster out of an unhealthy state. Currently, only Support seems to be able to do this.
3 votes -
Disk throughput in monitoring
Currently we have disk IOPS in monitoring (both read and write).
One of the metrics that plays a role in deciding whether to use a provisioned disk, at least with AWS hosting, is disk bandwidth.
For instance, with a large enough disk, such as 2000 GB, I get at most 250 MB/s of bandwidth with an unprovisioned (gp2) disk (the maximum), but could reach 500 MB/s with a provisioned (io1) disk of that size.
3 votes -
More detailed update status
It would be extremely helpful to see more detailed information on recovering nodes. Knowing that a node is, for example, "81% of the way through initial sync" is much more informative (and lets users know that it isn't stuck) than seeing the node in "Startup2" recovery.
3 votes -
Publish statistics in Atlas to analyze what is filling oplog
It would be very useful to be able to see metrics/statistics about the contents of oplog. There are open-source tools like oplog analyzer (https://github.com/mhelmstetter/oplog-analyzer) that can be used, but it's a hassle to have to install it and run it in the same datacenter where the database is running (for performance).
The statistics I'm most interested in are which collections have the most oplog documents, what kinds of operations they were, and the total size each collection currently uses in the oplog. This would help improve code to use less oplog.
We've seen cases where bad…
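As a rough illustration of the statistics requested above, the sketch below groups oplog-style entries by namespace and operation type. It assumes each entry is a plain dictionary carrying the usual `ns` and `op` fields. The function name `summarize_oplog` and the JSON-based size estimate are hypothetical choices made for this example; they are not an Atlas feature and not how oplog-analyzer works.

```python
import json
from collections import defaultdict


def summarize_oplog(entries):
    """Group oplog-style entries by (namespace, operation type).

    Each entry is assumed to be a dict with the oplog fields `ns` and `op`.
    Size is approximated by the length of the JSON encoding, which only
    roughly tracks the real BSON size.
    """
    stats = defaultdict(lambda: {"count": 0, "approx_bytes": 0})
    for entry in entries:
        key = (entry.get("ns", ""), entry.get("op", "?"))
        stats[key]["count"] += 1
        stats[key]["approx_bytes"] += len(json.dumps(entry, default=str))
    # Largest consumers first.
    return sorted(stats.items(), key=lambda kv: kv[1]["approx_bytes"], reverse=True)


# Made-up sample data, for illustration only.
sample = [
    {"op": "i", "ns": "shop.orders", "o": {"_id": 1, "total": 10}},
    {"op": "u", "ns": "shop.orders", "o": {"$set": {"total": 12}}},
    {"op": "i", "ns": "shop.audit", "o": {"_id": 1, "payload": "x" * 500}},
]
for (ns, op), s in summarize_oplog(sample):
    print(ns, op, s["count"], s["approx_bytes"])
```

Sorting by approximate size puts the namespaces that consume the most oplog at the top, which is the view the request asks for.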
3 votes -
The page fault metric is not available in the Datadog integration
3 votes -
Adding balancer activity from sharding Statistics into Atlas UI
Customers often experience performance issues in different clusters related to chunk migrations, and in each case they struggle to determine that chunk migrations were occurring and that they started at the same time as the performance issue. The observability available in the Atlas UI for identifying this problem is limited.
Running db.serverStatus().shardingStatistics on each shard provides what customers need.
In particular, the key metrics below give good insight into balancer activity:
For donor:
- db.serverStatus().shardingStatistics.countDonorMoveChunkStarted: The total number of times that MongoDB starts the moveChunk command or moveRange command…
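Until such a chart exists, one way to approximate it is to sample the counter periodically and look at the difference between samples. The sketch below assumes two serverStatus documents from the same shard, represented as plain dictionaries (with PyMongo these could come from `client.admin.command("serverStatus")`). The helper name `donor_move_chunk_delta` is hypothetical, and the restart handling assumes the counter resets when the process restarts.

```python
def donor_move_chunk_delta(previous, current):
    """Return how many moveChunk/moveRange commands started between two samples.

    `previous` and `current` are serverStatus documents (as dicts) taken
    from the same shard at two points in time. Returns None when the
    shardingStatistics section or the counter is missing.
    """
    field = "countDonorMoveChunkStarted"
    before = previous.get("shardingStatistics", {}).get(field)
    after = current.get("shardingStatistics", {}).get(field)
    if before is None or after is None:
        return None
    # Assumption: the counter is cumulative since process start,
    # so a lower value in the later sample is treated as a restart.
    return after - before if after >= before else after


# Made-up sample values, for illustration only.
earlier = {"shardingStatistics": {"countDonorMoveChunkStarted": 40}}
later = {"shardingStatistics": {"countDonorMoveChunkStarted": 43}}
print(donor_move_chunk_delta(earlier, later))
```

A non-zero delta in the same window as a latency spike would suggest that chunk migrations were active on that shard at the time.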
2 votes -
Metric Grouping
A huge improvement for metrics would be the ability to query by grouping (e.g. by database access user). That way, if you use a dedicated database user for each service connection, you could see how much database load that specific service is causing.
Any form of implementation would be helpful. One example could be adding labels to the Prometheus metrics per user, replica/shard, etc.
2 votes -
Monitoring Metrics on dhandle
We'd like to monitor WiredTiger dhandles over time, directly from the Cloud Atlas Monitoring view. This would let us see the impact of cluster setting changes directly.
We'd also like to be able to configure alert triggers on this metric. Our goal is to be alerted when an excessive number of files (collections and indexes) is loaded into MongoDB memory, so that we avoid an Out Of Memory error.
2 votes -
Use additional metadata to differentiate processes
Right now Ops Manager monitoring identifies MongoDB processes according to hostname:port. Unfortunately, if 2 processes have the same short hostname & port in the same Ops Manager project, they'll be treated the same even if they are actually different processes with different FQDN.
Please either allow the use of additional characteristics (FQDN, replica set name, config server name, etc) for differentiating MongoDB processes or provide some way to tag 2 or more processes so monitoring doesn't accidentally miscategorize them as the same process.
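The collision described above can be shown with a small sketch. Everything here is hypothetical (the field names, the hostnames, and the `find_collisions` helper); it only illustrates why a short hostname plus port is an ambiguous key while a key that includes the FQDN is not, and it does not reflect how Ops Manager is implemented.

```python
from collections import defaultdict


def find_collisions(processes, key_fields):
    """Return identity keys that map to more than one distinct process."""
    groups = defaultdict(list)
    for proc in processes:
        key = tuple(proc[field] for field in key_fields)
        groups[key].append(proc["fqdn"])
    return {key: fqdns for key, fqdns in groups.items() if len(fqdns) > 1}


# Two different processes that share a short hostname and port (made-up data).
processes = [
    {"short_host": "db01", "fqdn": "db01.east.example.com", "port": 27017, "replica_set": "rs-east"},
    {"short_host": "db01", "fqdn": "db01.west.example.com", "port": 27017, "replica_set": "rs-west"},
]

print(find_collisions(processes, ["short_host", "port"]))  # both processes share one key
print(find_collisions(processes, ["fqdn", "port"]))        # no shared keys
```

Adding the replica set name to the key would separate these two processes as well, which matches the additional characteristics suggested in the request.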
2 votes -
Allow switching between shards in the profiler and/or have a combined view
Since the profiler tabs are shard specific, it would radically improve the usability to:
a) Combine the profiler events to truly have a cluster level view so you don’t need to worry about shard specific views
b) Add a drop-down to the top of the page near the title that has the various shards listed in it, selecting a different shard brings you back to the same tab, but now with the changed view.
c) Consider tweaking the URL so that rather than using a SHA in the URL, you use the shard name, making it easy to manipulate the…
2 votes -
Zabbix
We should have Zabbix integration to monitor the performance of MongoDB Atlas.
Zabbix is a popular open-source monitoring tool, and many enterprise organizations use it.
Please add Zabbix to the "Integrate with Third-Party Monitoring Services" options.
2 votes -
killAllSessionsByPattern and kill sessions
Please add killAllSessionsByPattern and kill-sessions features to the Atlas UI.
2 votes -
Metrics charts x-axis showing more than 24 hours of data should be labeled according to the scale
Currently, when a mouse is not hovering over a Metrics plot, the plot will show particular hours as labels for the x-axis. For example, if an 8 hour range of time is displayed, the time every two hours will be labeled.
However, if more than 1 day is displayed, the x-axis labels are less useful. The particular time once/day is displayed, but the day is not included. For example, when I display a week of data, 07:00 is highlighted on each particular day,…
2 votes -
Show BI Connector resource use Metrics
It would be helpful to be able to view the resources used by the BI connector in Metrics, similar to how you can view resources used by Atlas Search (Search Disk Space Used, Search Normalized Process CPU, etc). This could help with identifying issues due to resource intensive queries submitted through the BI connector.
Currently it is only possible to guess that an issue could be related to the BI Connector by viewing system resource use.
2 votes -
Bytes available for reuse metric
This metric would show the bytes available for reuse. If the value is large and its graph changes very little over time, we can decide to run compact to save cost (from disk usage).
Currently, we have to run the dbstat command to see the bytes available for reuse, and we can't see the trend.
2 votes -
Ability to block sending of Alert Close Notification
We have set up alerts and have noticed that when a condition returns to normal, another notification is generated to confirm that the alert is closed. Is there a way to manage this (for example, to disable or block it for some alerts)?
2 votes -
High resolution monitoring and alerting for WT dirty cache ratio, eviction workloads and checkpoints.
Implement better monitoring and alerting for WT dirty cache ratio. This should include sub-minute resolution and support an understanding of eviction workload as well as performance impact of checkpoints under heavy write workloads.
2 votes -
Monitoring Integration with Azure Event Grid
Monitoring Integration with Azure Event Grid.
2 votes -
Metrics
We would like to be able to freeze the S S P row (the top row) while scrolling down through the different charts.
2 votes