Atlas
-
Add metrics to monitor CPU credits for burstable performance Atlas clusters
Add metrics to Atlas for tracking burstable CPU credit spend for M10 and M20 cluster tier instances. Additionally, add support for creating alerts based on these metrics.
11 votes -
Disk queue length metric
Atlas exposes a few hardware metrics, including Util% (presumably obtained from iostat or similar). However, this metric is not very meaningful in the age of SSDs - a volume could be 100% utilized and still have spare capacity. A more useful storage-level metric is average queue length - this is easier to interpret (high queue length = storage contention).
Would it be possible to add this metric to Atlas monitoring?
11 votes -
Graph connections per user (or per database)
Show a graph of connections per user.
It would be very useful to see how many connections each user has (or also, each db) over time.
It would allow us to see more clearly and faster which service uses how many connections.
11 votes -
Profiler should only watch queries from certain users
It would be useful to provide the profiler an "allowlist" and "denylist" of users to watch queries from. Essentially, it is only useful to receive alerts and profiler reports for queries made by actual applications. It is not useful to have alerts show up for one-off queries made by a DB admin using a DB explorer.
11 votes -
Rename Hardware Metric "Util %"
Under hardware metrics for a given replica set, there is a metric for "Util %". It is unclear what this represents. After careful digging through the documentation, it appears to be a metric for Disk Bandwidth Utilization. I believe the metric name should be updated to at least "Disk Util %", if not something more specific.
11 votes -
Show a graph of BANKED iops (AWS)
Atlas monitoring is great... but it would be super helpful to see a graph of banked iops. (or an approximation of this)
Suppose my iops limit is 100... and during the night my bank gets fully charged to 5.4MM.
During the heavy loads on my server, let's say my iops jump to a steady 500. This means that I am drawing 400 units per second from my bank. I can do this for 3.75 hours until my bank is exhausted. It would be so helpful to see an iops bank balance drawing down during peak, recharging during off-peak... and thus getting insight…
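The arithmetic in this example can be checked with a short Python sketch. It assumes the simple linear model the post describes (one banked credit per I/O above the baseline, no refill while under load); the function name and parameters are made up for illustration and are not an Atlas or AWS API.

```python
def hours_until_bank_exhausted(bank_credits: float,
                               baseline_iops: float,
                               actual_iops: float) -> float:
    """Hours until the I/O credit bank is empty.

    Assumes a linear model: every I/O above the baseline costs one credit.
    Returns infinity when the load is at or below the baseline.
    """
    draw_per_second = actual_iops - baseline_iops
    if draw_per_second <= 0:
        return float("inf")
    return bank_credits / draw_per_second / 3600


# Numbers from the post: 5.4MM banked, limit of 100 iops, steady load of 500 iops.
print(hours_until_bank_exhausted(5_400_000, 100, 500))  # 3.75
```

A graph of this balance over time would just be the same calculation applied to each sampling interval.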
11 votes -
Allow threshold on "System Memory: Available" alert condition to be a percentage
In Atlas, it would be ideal if you could specify a percentage of total memory as the threshold for the "System Memory: Available Is..." rather than a literal number value. If it was possible to set this as a percentage, then the alert could be applied to all hosts in a project, rather than having to set hostname conditions and create a separate alert for each cluster.
10 votes -
Data Transfer Limit
Since, as documented, there are limits on network traffic for M0/M2/M5 instances, it is essential to have metrics and alerts (including for free clusters) to monitor this value over a 7-day sliding window.
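As a rough illustration of the 7-day sliding window being asked for, here is a minimal Python sketch. The class and method names are hypothetical, the samples are assumed to arrive in chronological order, and the transfer limit is a parameter you would take from the Atlas documentation rather than a value known here.

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingWindowTransfer:
    """Running total of transferred bytes over a trailing time window."""

    def __init__(self, window: timedelta = timedelta(days=7)):
        self.window = window
        self._samples = deque()  # (timestamp, bytes), oldest first
        self._total = 0

    def add(self, timestamp: datetime, transferred_bytes: int) -> None:
        """Record a sample and drop samples that have left the window."""
        self._samples.append((timestamp, transferred_bytes))
        self._total += transferred_bytes
        cutoff = timestamp - self.window
        while self._samples and self._samples[0][0] <= cutoff:
            _, old_bytes = self._samples.popleft()
            self._total -= old_bytes

    def total(self) -> int:
        """Bytes transferred within the current window."""
        return self._total

    def exceeds(self, limit_bytes: int) -> bool:
        """True when the windowed total is above the given limit."""
        return self._total > limit_bytes
```

An alert would then fire whenever `exceeds(limit_bytes)` becomes true after a new sample is added.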
10 votes -
Allow log level to be configured per cluster/node
Atlas clusters don't support the setParameter command and, as a result, users aren't able to configure log levels. I understand the reasoning behind not exposing permissions to run setParameter to DB users so, in lieu of that, it would be really helpful if Atlas users were able to configure log levels through the Atlas UI, preferably at the Node or Cluster level.
Thanks!
10 votes -
Change Streams Monitoring and Alerting
Change streams can cause performance issues if not used properly. In some cases, administrators of multi-tenant dbs have no control (and shouldn't) over how various clients create change streams.
I think it is important that we accommodate these use-cases and provide useful metrics in the OM/Atlas metrics pages, and alerts on those metrics. Some potential metrics:
1. Number of change streams open
2. Average change stream lifetime
3. Query targeting ratios for change streams
4. Avg time between consecutive polls of the change stream (and other statistics)
--thought here is that change streams that are polled infrequently will result in…
9 votes -
Webhook
Hi MongoDB Atlas Team,
Some enterprise customers are left with inadequate monitoring via the Webhook to ServiceNow (an ITSM tool). Could you please improve it so that the right set of fields can be included, such as "Priority", "Service", "Assignment Group", and other details that can be selected from a dropdown or entered manually, so that alerts can generate incidents in ServiceNow?
Regards,
Varun
Toyota Europe Database Team
9 votes -
Custom replica set tags
Currently, Atlas comes with pre-defined replica set tags such as Provider, Node Types, and Region, but there is no option for user-defined tags.
Please provide options for custom/user-defined replica set tags.
9 votes -
Show in the UI when an index build is ongoing, and when it completes
When indexes are built from an application or a mongorestore there's no way to see if it's ongoing in the UI.
There should be an indication in the "Real-time" tab saying that an index on collection X is in progress. This would explain why performance is currently impacted.
It would be good to see index start and end marked on the "Metrics" visualizations so we can see the impact of index builds on IOPS, CPU, memory etc. This could help understand if we need to upscale the cluster.
9 votes -
Stackdriver Integration
The Atlas Monitoring UI is great, but to ease centralization of alerts and dashboards, it would be nice to have all Atlas metrics in Cloud Monitoring too.
9 votes -
Profiler window should auto zoom to sampling period and show sampling period range
Atlas documentation states that the Query Profiler shows up to 10,000 queries within the past 24 hours: https://docs.atlas.mongodb.com/tutorial/profile-database/index.html#data-display-limitations
However, it is confusing to see that the Profiler cannot show more than a couple hours of data, likely because it is hitting the 10,000 entry limit.
The plot still spans 24 hours, but only the past couple of hours have data plotted, misleadingly indicating that there were no slow queries before a couple of hours ago. Here's an example: https://p-37FYgJ.b1.n0.cdn.getcloudapp.com/items/JruWZDYK/Image%202020-03-25%20at%2010.49.32%20AM.png?v=7f79362e62c8d15a9f91f8ba4d5aecaf
Atlas should make the sampling time window clear in the Query Profiler graph so that we…
9 votes -
metrics
There might already be a way to do this, but I cannot find it. Please provide a way to combine "all primary" metrics into a single chart.
I love your metrics, but I hate that when primary moves from one server to another I get "data gaps" in my graphs. So then it becomes exceedingly difficult to look at temporal variations... requiring splicing together multiple segments from 2 or 3 different graphs.
I have attached a picture of what I am talking about. You can see that primary moved over for a few days so I get a graph with…
9 votes -
Add document count to Datadog metrics
We'd like to monitor the number of documents in a collection via Datadog.
For on-premise MongoDB, the stats are already reported via mongodb.collection.count, mongodb.collection.size and mongodb.collection.avgobjsize.
If the same metrics could be made available for Atlas (e.g. mongodb.atlas.stats.collection.count), that would really help in monitoring.
For example, spikes in different parts of the application could be tied to the number of documents at a glance. Without that metric available, it is hard to pinpoint whether a recent change had a negative impact on performance.
If the metrics can't be made available on the collection but only on the database level, this…
8 votes -
Keep context of Metrics dashboard
On the MongoDB Atlas metrics dashboard, I can arrange the metrics in the order I want, for example displaying the "insert" metric first, then the "cpu" metric, and so on.
If I change the order, go to another screen (for example "network access"), and then come back to the metrics dashboard, the metrics I moved to the top (insert, cpu, ...) are no longer at the top of the metrics list but at the end. It would be great to keep the metrics context (displayed metrics and ordering) on the Metrics dashboard.
8 votes -
Provide offending query shape in Query Targeting alert notifications
It would be ideal if the alert notifications for Query Targeting ratio alerts included a reference to the query shape that caused the alert to fire. This would assist customers in locating the exact query/queries with poor targeting ratios so that they can be optimized in a more expeditious manner.
8 votes -
Ability to "Mass Kill" slow running queries
Currently, Atlas has a "Kill Op" option which is useful to kill single long-running queries.
When upgrading to MongoDB 7.0, we were faced with a situation where the Slot-Based Query Engine (SBE) was causing 1000s of queries to execute slowly. We wanted to kill them all, but there were more than a human could kill by clicking "Kill Op" one by one. Hence a "Mass Kill" feature which kills queries running longer than X seconds (X is configurable) would have helped us greatly in an outage scenario. We ultimately rebooted our cluster to kill queries, then manually implemented a script which did this…
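For readers in a similar situation, here is a minimal sketch of what such a script might look like. It is not the poster's script. It assumes PyMongo, a user with privileges to run `$currentOp` and `killOp`, and a deliberately conservative filter (client-originated operations outside the `local`, `admin` and `config` databases); the function names and the filter itself are illustrative choices, and killing operations should always be done with care.

```python
from typing import Iterable, List

# Operation types treated as "queries" here; an assumption for this sketch.
KILLABLE_OP_TYPES = {"query", "getmore", "command"}
# Skip namespaces used by internal and administrative operations.
PROTECTED_PREFIXES = ("local.", "admin.", "config.")


def select_ops_to_kill(ops: Iterable[dict], max_seconds: int) -> List:
    """Return the opids of active, client-originated operations that have
    been running longer than max_seconds."""
    selected = []
    for op in ops:
        if not op.get("active") or "client" not in op:
            continue  # idle or internal operation
        if op.get("op") not in KILLABLE_OP_TYPES:
            continue
        if op.get("ns", "").startswith(PROTECTED_PREFIXES):
            continue
        if op.get("secs_running", 0) > max_seconds:
            selected.append(op["opid"])
    return selected


def mass_kill(uri: str, max_seconds: int, dry_run: bool = True) -> List:
    """List (and, unless dry_run, kill) long-running operations."""
    from pymongo import MongoClient  # imported lazily; requires PyMongo

    admin = MongoClient(uri).admin
    ops = admin.aggregate([{"$currentOp": {"allUsers": True}}])
    opids = select_ops_to_kill(ops, max_seconds)
    if not dry_run:
        for opid in opids:
            admin.command("killOp", op=opid)
    return opids
```

Running `mass_kill(uri, 30)` with the default `dry_run=True` only reports which operations would be killed, which is a sensible first step during an outage.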
7 votes