Atlas
171 results found
-
Disk queue length metric
Atlas exposes a few hardware metrics including Util% (presumably obtained from iostat or similar). However, this metric is not very significant in the age of SSDs: a volume could be 100% utilized and still have spare capacity. A more useful storage-level metric is average queue length, which is easier to interpret (high queue length = storage contention).
Would it be possible to add this metric to Atlas monitoring?
11 votes -
Graph connections per user (or per database)
Show a graph of connections per user.
It would be very useful to see how many connections each user has (or also, each db) over time.
It would allow us to see more clearly and faster which service uses how many connections.
11 votes -
Alerts based on Activity Feed - Rollback
Our main concern is that 'Host experienced a rollback' is not an alert option!
Ideally, anything that shows up in the activity feed should be available as an alert.
11 votes -
Profiler should only watch queries from certain users
It would be useful to provide the profiler an "allowlist" and "denylist" of users to watch queries from. Essentially, it is only useful to receive alerts and profiler reports for queries made by actual applications. It is not useful to have alerts show up for one-off queries made by a DB admin using a DB explorer.
11 votes -
Rename Hardware Metric "Util %"
Under hardware metrics for a given replica set, there is a metric for "Util %". It is unclear what this represents. After careful digging through the documentation, it appears to be a metric for Disk Bandwidth Utilization. I believe the metric name should be updated to at least "Disk Util %", if not something more specific.
11 votes -
Show a graph of BANKED iops (AWS)
Atlas monitoring is great... but it would be super helpful to see a graph of banked iops. (or an approximation of this)
Suppose my iops limit is 100... and during the night my bank gets fully charged to 5.4MM.
During heavy loads on my server, let's say my iops jump to a steady 500. This means that I am drawing 400 units per second from my bank. I can do this for 3.75 hours until my bank is exhausted. It would be so helpful to see an iops bank balance drawing down during peak, recharging during off-peak... and thus getting insight…
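The arithmetic in the example above can be checked with a short sketch. It assumes the bank drains at a constant rate equal to the difference between actual and baseline iops; the function and parameter names are hypothetical and are not part of Atlas or AWS.

```python
def hours_until_bank_empty(bank_credits, baseline_iops, actual_iops):
    """Hours a credit bank lasts while load exceeds the baseline.

    Returns None when the load does not exceed the baseline,
    because the bank is not being drawn down in that case.
    """
    draw_per_second = actual_iops - baseline_iops
    if draw_per_second <= 0:
        return None
    seconds = bank_credits / draw_per_second
    return seconds / 3600


# Figures from the post: 5.4MM banked, limit of 100 iops, steady load of 500 iops.
print(hours_until_bank_empty(5_400_000, 100, 500))  # 3.75
```

A graph of the bank balance would essentially plot the remaining credits over time under the same assumption.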
11 votes -
Allow threshold on "System Memory: Available" alert condition to be a percentage
In Atlas, it would be ideal if you could specify a percentage of total memory as the threshold for the "System Memory: Available Is..." rather than a literal number value. If it was possible to set this as a percentage, then the alert could be applied to all hosts in a project, rather than having to set hostname conditions and create a separate alert for each cluster.
10 votes -
Data Transfer Limit
Since, as documented, there are limits on network traffic for M0/M2/M5 instances, it is essential to have metrics/alerts (for the free tier too) that monitor this value over a 7-day sliding window.
10 votes -
Allow log level to be configured per cluster/node
Atlas clusters don't support the setParameter command and, as a result, users aren't able to configure log levels. I understand the reasoning behind not exposing permissions to run setParameter to DB users so, in lieu of that, it would be really helpful if Atlas users were able to configure log levels through the Atlas UI, preferably at the Node or Cluster level.
Thanks!
10 votes -
Webhook
Hi MongoDB Atlas Team,
Some enterprise customers are left with inadequate monitoring via the Webhook integration to ServiceNow (an ITSM tool). Can you please improve it so that the right set of fields can be included, such as "Priority", "Service", "Assignment Group" and other details, which could be selected from a dropdown or entered manually, so that alerts can generate incidents in ServiceNow?
Regards,
Varun
Toyota Europe Database Team
9 votes -
Show in the UI when an index build is ongoing, and when it completes
When indexes are built from an application or a mongorestore there's no way to see if it's ongoing in the UI.
There should be an indication in the "Real-time" tab saying that an index on collection X is in progress. This would explain why performance is currently impacted.
It would be good to see index start and end marked on the "Metrics" visualizations so we can see the impact of index builds on IOPS, CPU, memory etc. This could help understand if we need to upscale the cluster.
9 votes -
Stackdriver Integration
The Atlas Monitoring UI is great, but to ease centralization of alerts and dashboards, it would be nice to have all Atlas metrics in Cloud Monitoring too.
9 votes -
Profiler window should auto zoom to sampling period and show sampling period range
Atlas documentation states that the Query Profiler shows up to 10,000 queries within the past 24 hours: https://docs.atlas.mongodb.com/tutorial/profile-database/index.html#data-display-limitations
However, it is confusing to see that the Profiler cannot show more than a couple hours of data, likely because it is hitting the 10,000 entry limit.
The plot still shows a view showing 24 hours of time, but only the past couple hours have data plotted, misleadingly indicating that there are no slow queries before a couple hours ago – here's an example: https://p-37FYgJ.b1.n0.cdn.getcloudapp.com/items/JruWZDYK/Image%202020-03-25%20at%2010.49.32%20AM.png?v=7f79362e62c8d15a9f91f8ba4d5aecaf
Atlas should make the sampling time window clear in the Query Profiler graph so that we…
9 votes -
metrics
There might already be a way to do this, but I cannot find it. Please provide a way to combine "all primary" metrics into a single chart.
I love your metrics, but I hate that when primary moves from one server to another I get "data gaps" in my graphs. So then it becomes exceedingly difficult to look at temporal variations... requiring splicing together multiple segments from 2 or 3 different graphs.
I have attached a picture of what I am talking about. You can see that primary moved over for a few days so I get a graph with…
9 votes -
Add document count to Datadog metrics
We'd like to monitor the number of documents in a collection via DataDog.
For On-Premise MongoDB the stats are already reported via mongodb.collection.count, mongodb.collection.size and mongodb.collection.avgobjsize.
If the same metrics could be made available for Atlas (E.g. mongodb.atlas.stats.collection.count) that would really help in monitoring.
E.g. spikes in different parts of the application could be tied to the number of documents on a glance. Without having that metric available, it is hard to pinpoint if a recent change had a negative impact on performance.
If the metrics can't be made available on the collection but only on the database level, this…
8 votes -
Keep context of Metrics dashboard
On the MongoDB Atlas metrics dashboard, I can arrange the metrics in the order I want, for example displaying the "insert" metric first, then the "cpu" metric, etc.
If I change the order, go to another screen (for example "network access"), and then come back to the metrics dashboard, the metrics I moved to the top (insert, cpu, ...) are no longer in first position but at the end of the metrics list. It would be great to keep the metrics context (displayed metrics and ordering) on the Metrics dashboard.
8 votes -
Provide offending query shape in Query Targeting alert notifications
It would be ideal if the alert notifications for Query Targeting ratio alerts included a reference to the query shape that caused the alert to fire. This would assist customers in locating the exact query/queries with poor targeting ratios so that they can be optimized in a more expeditious manner.
8 votes -
Ability to "Mass Kill" slow running queries
Currently, Atlas has a "Kill Op" option which is useful to kill single long-running queries.
When upgrading to MongoDB 7.0, we faced a situation where the Slot-Based Query Engine (SBE) was causing thousands of queries to execute slowly. We wanted to kill them all, but that was more than a human could do by clicking "Kill Op" one by one. A "Mass Kill" feature that kills queries running longer than X seconds (with X configurable) would have helped us greatly in an outage scenario. We ultimately rebooted our cluster to kill queries, then manually implemented a script which did this…
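As a rough illustration of the selection step behind such a "Mass Kill" (not the poster's actual script, which is not shown), the sketch below filters a list of operations by running time. It assumes input shaped like the `inprog` array returned by `currentOp`, using its `opid`, `active` and `secs_running` fields; the function name and sample data are hypothetical.

```python
def select_long_running_ops(inprog, max_secs):
    """Return the opids of active operations running longer than max_secs."""
    return [
        op["opid"]
        for op in inprog
        if op.get("active") and op.get("secs_running", 0) > max_secs
    ]


# Hypothetical sample shaped like the "inprog" array of currentOp output.
sample = [
    {"opid": 101, "active": True, "secs_running": 45, "op": "query"},
    {"opid": 102, "active": True, "secs_running": 3, "op": "query"},
    {"opid": 103, "active": False, "secs_running": 0, "op": "none"},
]

print(select_long_running_ops(sample, 30))  # [101]

# Each selected opid would then be passed to the killOp admin command,
# for example with PyMongo (assumed here, requires suitable privileges):
#   client.admin.command("killOp", op=opid)
```

Keeping the filter separate from the kill call makes it possible to review the list of opids before anything is terminated.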
7 votes -
Alert for WiredTiger Cache
Hi,
Can you please create an alert for WiredTiger metrics, such as used cache?
We had several cluster instances going over 5% of used cache (dirty data) and would like to be notified when it happens.
Regards,
Sergei
This is needed in order to determine whether
7 votes -
Integrate the Atlas Replication lag alert into the Prometheus API
We request that metrics be included in the Prometheus API so that a Replication lag alarm can be implemented.
6 votes