Atlas
Disk throughput in monitoring
Currently we have disk IOPS in monitoring (both read and write).
One of the metrics that plays a role in deciding whether to use a provisioned disk, at least with AWS hosting, is disk bandwidth.
For instance, with a large enough disk, such as 2000 GB, I get at most 250 MB/s of bandwidth with an unprovisioned (gp2) disk (the gp2 maximum), but could reach 500 MB/s with a provisioned (io1) disk of that size.
1 vote -
Alert setup for failed logins
There is no good monitoring or alert functionality for dealing with potential invalid login attempts.
I suggest an alert setup to handle events when known users fail to log in more than 3 times (either in Atlas or via a driver login to the database).
We have had an external audit done, and this is something the auditors identified as a shortcoming in the MongoDB Atlas environment.
Log files are useful, but not for monitoring potential security issues.
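As a rough illustration of the rule requested above, the sketch below flags users with more than 3 failed logins. The event format (a time-ordered list of `(timestamp_seconds, username)` failure records), the function name, and the 5-minute window are all assumptions made for the example; they are not an Atlas API or log format.

```python
from collections import defaultdict, deque


def users_over_threshold(failures, max_failures=3, window_seconds=300):
    """Return the set of users with more than `max_failures` failed logins
    inside any sliding window of `window_seconds`.

    `failures` is a hypothetical, time-ordered iterable of
    (timestamp_seconds, username) tuples, one per failed login.
    """
    recent = defaultdict(deque)  # username -> timestamps of recent failures
    flagged = set()
    for ts, user in failures:
        window = recent[user]
        window.append(ts)
        # Drop failures that fell out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_failures:
            flagged.add(user)
    return flagged


if __name__ == "__main__":
    sample = [(0, "alice"), (5, "bob"), (10, "alice"), (20, "alice"), (30, "alice")]
    print(users_over_threshold(sample))
```

Whether the count should reset on a successful login, and how long the window should be, are policy choices the request leaves open.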
4 votes -
Credits by period
Provide the ability to display and export credits by period (from date - to date) instead of selecting each month individually, so that users can easily display credit trends, etc. without downloading many CSVs and merging them together.
1 vote -
Allow log level to be configured per cluster/node
Atlas clusters don't support the setParameter command and, as a result, users aren't able to configure log levels. I understand the reasoning behind not exposing permissions to run setParameter to DB users, so, in lieu of that, it would be really helpful if Atlas users were able to configure log levels through the Atlas UI, preferably at the Node or Cluster level.
Thanks!
8 votes -
More detailed update status
It would be amazingly helpful to see more detailed information on recovering nodes. Just knowing that the node is, for example, "81% of the way on initial sync" is much more informative (and lets users know that it isn't stuck) than seeing the node in "Startup2" recovery.
2 votes -
Graph connections per user (or per database)
Show a graph of connections per user.
It would be very useful to see how many connections each user has (or also, each db) over time.
It would allow us to see more clearly and faster which service uses how many connections.
2 votes -
Add a Transaction Commits / Sec Metric Graph to Atlas
Add a graph in Atlas to display "transactions.totalCommitted" on a per second basis to the Atlas metrics UI. Customers that are using transactions are often more interested in the # of transaction commits per second than opcounters.
I have had to use mongostat on a number of customer evaluations because this metric is not available.
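The per-second figure requested above can be derived from two samples of the cumulative counter. The sketch below assumes the samples are readings of "transactions.totalCommitted" taken a known number of seconds apart; the function name and the handling of a counter reset are illustrative choices, not how Atlas computes its graphs.

```python
def commits_per_second(prev_total, curr_total, elapsed_seconds):
    """Convert two readings of a cumulative commit counter into a rate.

    prev_total / curr_total: earlier and later counter readings.
    elapsed_seconds: time between the two readings.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        # Assumption: a drop means the counter restarted from zero
        # (for example after a process restart), so count from zero.
        delta = curr_total
    return delta / elapsed_seconds


if __name__ == "__main__":
    # 600 commits over a 60-second sampling interval.
    print(commits_per_second(1000, 1600, 60))
```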
1 vote -
Allow custom date range to be submitted in Query Profiler
Currently the Query Profiler can plot queries that were logged up to 24 hours in the past.
It would be helpful to allow for visualization of a custom date range older than 24 hours ago, rather than only queries logged within the past 24 hours. This could help with RCAs for events that occurred more than 1 day ago, and also help teams who collaborate to investigate queries over a time period longer than 1 day.
3 votes -
Extend sub-hourly metrics retention to 72 hours
Right now the 1- and 5-minute metric data is lost after 48 hours (when it is combined into the hourly data). This makes it impossible to take a close look, on a weekday, at an event that occurred over the weekend.
It would be nice to be able to look at a problem on the weekend and say "I'll look at this more closely on Monday", and then have the ability to actually investigate it on Monday.
2 votes -
Allow to resend alert to a PagerDuty service
As with other types of alert targets (Send to), we would like to be able to resend an alert to PagerDuty.
The reason is that an alert sent to PagerDuty can be mistakenly resolved while the real issue is still there and the alert is still firing in Atlas.
Because it is already firing, it won't fire again, so from that moment on there is no way to get a notification for that alert, and we might miss a real issue.
2 votes -
Publish statistics in Atlas to analyze what is filling oplog
It would be very useful to be able to see metrics/statistics about the contents of oplog. There are open-source tools like oplog analyzer (https://github.com/mhelmstetter/oplog-analyzer) that can be used, but it's a hassle to have to install it and run it in the same datacenter where the database is running (for performance).
The statistics I'm most interested in are which collections have the most oplog documents, what kinds of operations they were, and the total size that each collection currently occupies in the oplog. This would help in improving code to use less oplog.
We've seen cases where bad…
2 votes -
Precautionary Change/Recommendations for Shard Drops
If a user submits a configuration change to drop a shard, Atlas would sift through the cluster metrics and advise any changes needed to complete the drop without errors or delays. For example, it would share the correct number of IOPS or the cluster tier that should be used, since draining a shard increases the workload, as well as other precautionary measures that should happen prior to dropping the shard (e.g. the balancer needs to be turned on).
2 votes -
The page fault metric is not available in the Datadog integration
The page fault metric is not available in the Datadog integration
1 vote -
Better message for "Removed Indexes" in Activity Feed during Atlas rolling index build
During a rolling index build, it is expected to have 2 "Deployment configuration published" notifications: 1 for Added Indexes and another for Removed Indexes. This is because an entry for the desired index is added to the automation config, and the automation agent then builds it accordingly. Once the agent is done building it, we remove the entry for that index from the automation config. Note that this does not drop the index.
However, the term "Removed Indexes" can cause confusion by suggesting that the index was dropped. Hence, this feature request is filed for a clearer message so…
1 vote -
Identification and labelling of MongoDB connections
It'd be REALLY useful if there was an ability to provide a label (basically a string) when creating a new connection to MongoDB from an application, to say where the connection is coming from.
For example we use a bunch of microservices and each API sets up a new connection. It'd be very useful to be able to see how many connections each API has at any given point, as it'd allow us to determine if the microservice is misbehaving with the database etc.
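One existing mechanism that may partly cover this is the `appName` connection string option, which lets a client attach a name to its connections. The sketch below only shows how each microservice could add its own label to a URI; the host name, service name, and helper function are hypothetical, and how Atlas surfaces the label per connection is not assumed here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_app_name(uri, app_name):
    """Return `uri` with its appName query option set to `app_name`.

    Any existing appName option is replaced; other options are kept in order.
    """
    parts = urlsplit(uri)
    options = [(k, v) for k, v in parse_qsl(parts.query) if k != "appName"]
    options.append(("appName", app_name))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(options), parts.fragment)
    )


if __name__ == "__main__":
    # Hypothetical cluster host and service name, for illustration only.
    base = "mongodb+srv://cluster0.example.mongodb.net/test?retryWrites=true"
    print(add_app_name(base, "orders-api"))
```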
2 votes -
Add alerts on disk IOPS and CPU IOWAIT %
To help catch heavy disk workloads before they become problematic, it would be great to have alerts on:
- disk IOPS percentage utilization for disks without burstable IOPS
- burst credit utilization for disks with burstable IOPS
- CPU IOWAIT %
27 votes -
Add Elastic integration to Atlas
Add support to send cluster metrics to Elastic. Elastic has recently been putting a lot of work into their APM and logging solution, which is now a viable alternative to NewRelic and Datadog and is gaining traction. It would be great if Atlas supported sending metrics to Elastic. Please see below for references:
https://www.elastic.co/apm
https://www.elastic.co/siem
The use case for this is the same as NewRelic or Datadog. I wish to monitor my Atlas clusters from inside Elastic, create dashboards, and set up alerting.
13 votes -
Ability to search for filter conditions in Atlas Alert builder
It would be helpful to be able to search for a trigger condition in the Alert builder when creating/editing an alert.
Currently, you need to scroll to find specific conditions, but the list is long so it can take some time to find the particular one needed.
2 votes -
Show driver/application/user metadata in Query Profiler query summaries
It would be helpful if the Query Profiler showed the driver and user metadata associated with the connection over which a particular query was run.
After clicking on a particular plotted operation in the Query Profiler, the sidebar pops up and shows execution statistics and structure of the plotted op, but doesn't show anything about the driver/application/user that issued the command.
This query summary would be more actionable if the driver, driver version, application name, and user were also displayed in the query sidebar (these details may be recorded in a separate log entry from the command itself).
This information…
5 votes -
Add horizontal scroll bar to metrics tab
I can't see metrics for all of my nodes while in the metrics tabs unless I make the window very large. It would be much easier to analyze the health of my cluster if I could review metrics for all the nodes by scrolling horizontally, rather than having to switch between a grouping of nodes.
5 votes