Ops Tools
38 results found
-
Allow conditions for all alerts based on DB/cluster name
We need to route alerts based on which database/cluster is impacted, instead of having them apply to any cluster in the current project. I know some alerts can be conditioned on host name, which uses the cluster name, but most alert types have no similar option.
Examples would be:
Search Index Build Complete
Search Index Build Failed
These are sent without conditions, and even the message does not indicate which cluster they are for (although you can guess based on the collection name provided).
We do not have the option to route these ourselves with a webhook, as the…
7 votes -
1 vote
-
disable auth on metrics
OpenTelemetry Collector does not support secrets for ServiceMonitor/PodMonitor resources, which causes authorization issues when trying to scrape the MongoDB metrics endpoint.
I'm looking for a way to disable basic_auth on the MongoDB metrics endpoint. I have already tried many approaches, including an empty username/password, but nothing worked. Any help would be highly appreciated.
1 vote -
Change the "Replica set has a late snapshot" Alert from Global to Project level
Currently the "Replica set has a late snapshot" Alert is a Global Alert. It would be useful to have this changed to a Project level Alert so that the Alert can be tuned for each specific deployment to provide better customization of the Alert.
2 votes -
Make all metrics used in the Atlas dashboard available for the Prometheus integration
Make all metrics used in the Atlas dashboard available for the Prometheus integration (https://www.mongodb.com/docs/cloud-manager/tutorial/prometheus-integration/#mongodb-metric-labels).
Also describe how the current Atlas dashboard metrics are built from those.
I'm looking especially for the metrics:
- Max Disk IOPS
- Queues
3 votes -
mongod startupWarnings
Create an "alert" to send notifications when a mongod process has, for any reason, startup warnings.
e.g.
1) The configured WiredTiger cache size is more than 80% of available RAM. See http://dochub.mongodb.org/core/faq-memory-diagnostics-wt
2) /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
3) Others.
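Until such an alert exists, a check along these lines could be run from outside. This is a minimal sketch, assuming the `pymongo` driver is installed and the connecting user may run the `getLog` command; the function names and the `uri` argument are hypothetical, and what to do with the returned lines (email, webhook, and so on) is left open.

```python
def extract_startup_warnings(getlog_response):
    """Return the non-empty lines from a `getLog: "startupWarnings"` reply."""
    # The reply carries the logged lines in its "log" array.
    return [line for line in getlog_response.get("log", []) if line.strip()]


def fetch_startup_warnings(uri):
    """Ask one mongod process for its startup warnings (requires pymongo)."""
    # Imported lazily so the helper above works without the driver installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        reply = client.admin.command("getLog", "startupWarnings")
    finally:
        client.close()
    return extract_startup_warnings(reply)
```

A scheduler could call `fetch_startup_warnings` per process and notify whenever the returned list is not empty.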
1 vote -
Replica Set size Alert
Have an Alert in Ops Manager to notify that a Replica Set is approaching the maximum recommended size (i.e., 2 TB) and that it should be converted into a Sharded Cluster.
1 vote -
Allow drag-and-drop of metric graphs from different replica set members
Our use case is we have a replica set, but the east nodes and west nodes are on disk mounts with different names, so they won't appear on the same line in the Metrics tab. We should be able to drag and drop on a replica set member level, not just the metrics level. This allows more customization of metric graph layout.
2 votes -
Use different method for Slack notifications
At the moment, the Slack integration in the integration manager offers only the obsolete webhook method, which sends notifications to the single Slack channel configured for that webhook. There is a (not that) new API method, https://api.slack.com/methods/chat.postMessage, which allows sending notifications to multiple channels. This is extremely useful if, for example, you want to differentiate alerts based on their kind or severity. There is also the option to use the Webhook method in MongoDB, but it doesn't support Slack. So please either add support for the new API method or make the Webhook method support Slack so at least two Slack channels will be…
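For illustration, routing by severity through `chat.postMessage` might look like the sketch below. It assumes a Slack bot token allowed to post in the target channels; the channel names, the severity mapping, and the function names are hypothetical and are not part of any existing MongoDB integration.

```python
import json
import urllib.request

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"

# Hypothetical severity-to-channel routing table.
CHANNELS_BY_SEVERITY = {
    "critical": ["#db-oncall", "#db-alerts"],
    "warning": ["#db-alerts"],
}


def build_requests(token, severity, text):
    """Build one chat.postMessage request per channel mapped to the severity."""
    requests = []
    for channel in CHANNELS_BY_SEVERITY.get(severity, []):
        body = json.dumps({"channel": channel, "text": text}).encode("utf-8")
        requests.append(urllib.request.Request(
            SLACK_POST_MESSAGE_URL,
            data=body,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json; charset=utf-8",
            },
            method="POST",
        ))
    return requests


def send_alert(token, severity, text):
    """Send the alert to every mapped channel and return Slack's JSON replies."""
    replies = []
    for request in build_requests(token, severity, text):
        with urllib.request.urlopen(request) as response:
            replies.append(json.load(response))
    return replies
```

Keeping request construction separate from sending makes the routing table easy to check without contacting Slack.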
2 votes -
Providing a Grafana dashboard for an on-premise cluster
It would be useful to provide a Grafana dashboard when integrating with Prometheus.
The documentation on metrics is very limited for an on-premise MongoDB cluster.
1 vote -
Send Alerts When Network Access is Updated
Create an alert when IP Addresses are added or removed from a cluster network access whitelist.
1 vote -
Add "Cluster Tier" and Provisioned "IOPS" as options in MongoDB Metrics Charts in Atlas
If these charts were available, they would enable the user to visualize the Tier and IOPS of the cluster during specific time ranges and compare them to other metrics such as CPU, iowait, etc.
In my team's experience, we use Atlas auto-scaling to allow a cluster to scale up/down based on load, but when looking at Metrics it is not clear which Tier the cluster was in (e.g. "M30") when evaluating other metrics like CPU utilization. We are able to manually track Cluster Tier by viewing the Project Activity Feed, but if this data was integrated into Metrics it would…
1 vote -
Do not trigger spurious COLLSCAN alerts for getmore commands during watch
Here's what MongoDB support summarizes about the current behavior: "Upon consulting with the team, they have confirmed that sometimes change streams can trigger collection scans, but these alerts are an artifact of how we calculate the metric today. Unfortunately at the moment, there is no fix for these alerts"
My suggestion is to fix how the metric is calculated. What is happening in the getmore is not a real COLLSCAN, and it should not be reported (and alerted) as such.
(Bonus points for including context information in the COLLSCAN threshold alert showing the collection and operation. Just knowing there has…
1 vote -
Export Reports and Graphs
Options to export the reports and graphs to a PDF or an office tool would help the Incident Management process to a great extent.
1 vote -
Include "Fetch Time" in Profiler timings
When we run the Profiler, the runtime of "select" type queries is dramatically understated. This is because the Profiler only counts the time of the "query", and doesn't include how long it took to "fetch" the result set.
In one of our test cases, we "tuned" the query so it only shows as running 82 ms in the Profiler. However, when we actually run this same query in JavaScript, the runtime is 10 seconds. This is a very slow query which our end users experience many times a day.
Is there a way to configure Profiler to be more realistic,…
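The gap described above can be measured on the client side. The sketch below is a hedged illustration, not something the Profiler does: `time_query_and_fetch` is a hypothetical helper that accepts any zero-argument callable returning an iterable of results (for example a driver cursor), and reports the time to the first result separately from the time to fetch everything.

```python
import time

_MISSING = object()  # sentinel, so a result that is None still counts


def time_query_and_fetch(run_query):
    """Time a query in two parts: until the first result, and until the last.

    `run_query` is any zero-argument callable returning an iterable of
    results, for example `lambda: collection.find(query_filter)`.
    """
    start = time.perf_counter()
    results = iter(run_query())
    first = next(results, _MISSING)
    first_result_at = time.perf_counter()
    count = 0 if first is _MISSING else 1
    for _ in results:  # drain the rest of the result set
        count += 1
    done_at = time.perf_counter()
    return {
        "documents": count,
        "seconds_to_first_result": first_result_at - start,
        "seconds_to_fetch_all": done_at - start,
    }
```

Comparing `seconds_to_fetch_all` with the duration shown in the Profiler gives a rough idea of how much of the end-user wait is spent fetching results.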
1 vote -
connection
Ops Manager only shows the number of connections. Most DB monitoring tools show where each connection is coming from and what is being run in the session. This needs to be a part of Ops Manager.
1 vote -
export & import alerts settings from one project to another project
export & import alerts settings from one project to another project
4 votes -
Update MongoDB driver in Elastic metricbeat and validate operability with Atlas
The metricbeat data collection agent from Elastic currently has a MongoDB module for capturing low-level metrics from a MongoDB instance. It uses a very old MongoDB driver that doesn't work with recent versions or with Atlas.
I realize metricbeat isn't a MongoDB product or supported integration, but it seems like it would be low effort for an experienced Golang developer (which I am not) and would remove an impediment from potential requirements around MongoDB metrics having to be captured using company-standard observability solutions. Thanks!
1 vote -
Add memory monitoring metrics
Atlas: please make buffers, cached, and MEM shared available under system memory metrics for end users so we can calculate the criteria for auto-scaling.
Currently only MongoDB Support can see these three metrics.
9 votes -
disk iops
Can you please revert the change to the metrics view for disk IOPS? It is completely unreadable and meaningless now.
There used to be 2 lines that made sense. Now there are 4... but looks like a bar chart. Anyway. Can't read it.
I would suggest:
1. Have one view with three lines: one for the average read, one for the average write, and one for the average of BOTH.
2. Have a second view showing the "burst" performance. Draw this as a LINE or scatter plot. Whatever you are drawing now is inscrutable.
thanks!
1 vote