Ops Tools

How can we improve the Operational Tooling MongoDB provides?

Send `Monitoring is down` and `Backup is down` alerts for each MongoDB Agent (Monitoring/Backup Module) and include hostname information in

What is the problem that needs to be solved? `Monitoring is down` and `Backup is down` alerts need to be sent for each individual MongoDB Agent (Monitoring/Backup Module) that goes down, and each alert should include the hostname of the affected agent.

Why is it a problem? (the pain) Without hostname information in the alert, the customer can't easily identify which MongoDB Agent (Monitoring/Backup Module) went down. In a multi-project environment this becomes an operational pain for the customer.

12 votes

0 comments · Monitoring / Alerts

Add memory monitoring metrics

Atlas: please make the buffers, cached, and shared memory metrics available to end users under the system memory metrics, so we can calculate our criteria for auto scaling.

Currently, only MongoDB Support can see these three metrics.

9 votes

0 comments · Monitoring / Alerts

Collect hardware metrics even if there's no managed mongo process

Have Automation Agent collect hardware metrics on unmanaged mongo hosts.

The Automation Agent doesn't collect hardware metrics unless there's a managed mongo process on the host. This means we can't provide centralized system monitoring for a heterogeneous environment, where some clusters are running on their own and others are under automation, or on any non-managed host.

8 votes

0 comments · Monitoring / Alerts

Add `serverStatus.uptime` counter info into Metrics

What is the problem that needs to be solved? The serverStatus.uptime counter is already collected from every MongoDB Server process, so it only needs to be added to Metrics to make it possible to track serverStatus.uptime changes over time.

Why is it a problem? (the pain) To calculate MongoDB Server process availability (how long your MongoDB Server processes were up and running), you currently need to analyze the MongoDB Server process logs, if they are even available for the required period, to find when the process was last stopped or started. With serverStatus.uptime available in Metrics on the Ops Manager side, it would be very easy (also programmatically via a Public API call) to identify downtime of a MongoDB Server process, since the serverStatus.uptime counter resets every time the process restarts, so a restart is always easy to spot.
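To illustrate how a reset of the uptime counter reveals a restart, here is a minimal sketch. It assumes the uptime samples are already available as (timestamp, uptime) pairs in seconds; the function name, the sample format, and the tolerance value are illustrative assumptions and not part of any Ops Manager API.

```python
from typing import List, Tuple


def find_restarts(
    samples: List[Tuple[int, int]], tolerance_s: int = 5
) -> List[Tuple[int, int]]:
    """Detect process restarts from (timestamp_s, uptime_s) samples.

    Each sample implies a process start time of timestamp - uptime.
    If that implied start time moves forward between two consecutive
    samples, the process was restarted somewhere in between.

    Returns (timestamp_of_detecting_sample, max_downtime_s) pairs, where
    max_downtime_s is an upper bound: the time between the last sample
    of the old process and the start of the new one.
    """
    restarts = []
    for (prev_ts, prev_uptime), (ts, uptime) in zip(samples, samples[1:]):
        prev_start = prev_ts - prev_uptime
        start = ts - uptime
        # tolerance_s absorbs small clock and sampling jitter (assumed value).
        if start > prev_start + tolerance_s:
            restarts.append((ts, max(0, start - prev_ts)))
    return restarts


# Made-up samples taken every 60 seconds; the counter drops at t=120.
samples = [(0, 100), (60, 160), (120, 20), (180, 80)]
print(find_restarts(samples))
```

With these made-up samples, the restart is detected at the third sample, and the process could have been down for at most 40 seconds (last seen at t=60, new process started at t=100).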

7 votes

0 comments · Monitoring / Alerts

Allow conditions for all alerts based on DB/cluster name

We need to route alerts based on the database or cluster impacted, instead of having them apply to every cluster in the current project. I know some alert types can be conditioned on host name, which uses the cluster name, but most alert types have no similar option.

Examples would be:
Search Index Build Complete
Search Index Build Failed

These are sent without conditions and even the message does not indicate what cluster they are for (although you can guess based on the collection name provided).

We do not have the option to route these ourselves with a webhook, as the relevant information is not provided there either.

6 votes

0 comments · Monitoring / Alerts

Allow to configure `maxTimeMS` for commands executed from Ops Manager's Data Explorer

What is the problem that needs to be solved? Allow maxTimeMS to be configured for MongoDB commands that are executed from Ops Manager's Data Explorer.

Why is it a problem? (the pain) Ops Manager's Data Explorer cannot work with a view that takes more than 15000 ms to load, or with a find operation that takes more than 15000 ms to complete.

5 votes

0 comments · Monitoring / Alerts

Export and import alert settings from one project to another project

4 votes

0 comments · Monitoring / Alerts

Grant permission to access Real Time tab to Project Read Only users

Accessing the Real Time metrics tab requires at least the Project Monitoring Admin role but this role has other privileges to administer alerts and manage hosts as well.

It is more appropriate to enable the read-only access user (Project Read Only role) to access the Real Time metrics tab.

4 votes

1 comment · Monitoring / Alerts

Send alert when RECOVERING node has failed due to being too stale to sync from any available node

Ops Manager users with hundreds or even thousands of replica set members (hosts/nodes) need an alert that indicates a node is in the RECOVERING state and is too far behind the oplog to recover without manual intervention. This information is present in the mongod log file. However, Ops Manager should generate a separate alert for this unique and important state. Without this alert, it is not immediately clear when a user needs to take action to bring a replica set back to a healthy state.

4 votes

0 comments · Monitoring / Alerts

Enable Ops Manager alerts on any FATAL or ERROR lines in the mongod/mongos logs

What is the problem that needs to be solved?

Not all error states and failures reported in the mongod and mongos log files are raised as alerts in the Ops Manager alerting system. This prevents users from configuring alerts on important events in MongoDB deployments.

Why is it a problem? (the pain)

For some users, specific errors such as FATAL or ERROR lines in the mongod log need to be alerted on and addressed with urgency. Since the specific high-priority event differs from user to user, a configurable, regex-driven (string matching) alerting system that constantly monitors the mongod and mongos log files is needed for maximum flexibility.
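As a rough illustration of the regex-driven matching requested here, the sketch below applies user-defined patterns to log lines and reports every match. The rule names, the patterns, and the sample log lines are made up for the example; it makes no assumption about the real mongod log format or about how Ops Manager would read the files.

```python
import re
from typing import Dict, Iterable, List, Tuple


def match_alerts(
    lines: Iterable[str], rules: Dict[str, str]
) -> List[Tuple[str, int, str]]:
    """Return (rule_name, line_number, line) for every rule that matches.

    rules maps a user-chosen rule name to a regular expression.
    Line numbers are 1-based.
    """
    compiled = {name: re.compile(pattern) for name, pattern in rules.items()}
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for name, pattern in compiled.items():
            if pattern.search(line):
                hits.append((name, lineno, line.rstrip("\n")))
    return hits


# Hypothetical rules and made-up log lines, for illustration only.
rules = {"fatal": r"\bFATAL\b", "error": r"\bERROR\b"}
log_lines = [
    "INFO listener started",
    "ERROR sync source unreachable",
    "FATAL data files missing",
]
for hit in match_alerts(log_lines, rules):
    print(hit)
```

Because the rules are plain data, each user could define their own high-priority patterns without any change to the matching logic.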

4 votes

0 comments · Monitoring / Alerts

Allow enable/disable for agent alerts

When we do server patching, we end up receiving agent down alerts for the automation, monitoring and backup agents. Those create unnecessary noise and a real risk of us missing a genuine alert. We should have the ability to disable the agent alerts as part of server shutdown and re-enable them as part of server startup.

3 votes

0 comments · Monitoring / Alerts

Ability to configure all destinations for SNMPv2c Alert Traps in a single place

What is the problem that needs to be solved? Ops Manager needs the ability to configure all destinations for SNMPv2c Alert Traps in a single place, so that only that one place needs to be updated instead of dozens of individual Ops Manager Alerts.

Why is it a problem? (the pain) When an SNMPv2c Alert Trap destination changes, the respective hosts have to be changed in each alert. This takes time when the number of configured Ops Manager Alerts is high (unless the customer scripts it via the Ops Manager API), and the process itself is prone to human error (typos, etc.).

3 votes

0 comments · Monitoring / Alerts

Filter/Sort Process List View

In previous versions of Ops Manager, you could filter the process list page. It would be nice to bring that back, so we could quickly identify processes which do not have recent pings, etc.

3 votes

0 comments · Monitoring / Alerts

List shards in Deployment > Metrics' shard list in alphabetical order

List shards in Deployment > Metrics' shard list in alphabetical order in Cloud Manager UI.

3 votes

0 comments · Monitoring / Alerts

Change the "Replica set has a late snapshot" Alert from Global to Project level

Currently the "Replica set has a late snapshot" Alert is a Global Alert. It would be useful to have this changed to a Project level Alert so that the Alert can be tuned for each specific deployment to provide better customization of the Alert.

2 votes

0 comments · Monitoring / Alerts

Make all metrics used in the Atlas dashboard available for the Prometheus integration

Make all metrics used in the Atlas dashboard available for the Prometheus integration (https://www.mongodb.com/docs/cloud-manager/tutorial/prometheus-integration/#mongodb-metric-labels).
Also describe how the current Atlas dashboard metrics are built from those.
I'm looking especially for these metrics:
- Max Disk IOPS
- Queues

2 votes

0 comments · Monitoring / Alerts

Use different method for Slack notifications

At the moment, the integration manager for Slack only offers the obsolete webhook method, which sends notifications to the single Slack channel configured for that webhook. There is a (not that) new API method, https://api.slack.com/methods/chat.postMessage, which allows sending notifications to multiple channels. This is extremely useful if, for example, you want to differentiate alerts by kind or severity. MongoDB also offers a generic Webhook method, but it doesn't support Slack. So please either add support for the new API method or make the Webhook method support Slack, so that at least two Slack channels can be used.
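To make the routing idea concrete, here is a minimal sketch that builds one chat.postMessage payload per target channel based on alert severity. The `channel` and `text` fields are the ones chat.postMessage accepts; the alert fields, the routing table, and the channel names are hypothetical, and the sketch only builds payloads without calling Slack.

```python
from typing import Dict, List


def build_slack_messages(
    alert: Dict[str, str],
    routes: Dict[str, List[str]],
    default_channel: str,
) -> List[Dict[str, str]]:
    """Build one chat.postMessage payload per channel for an alert.

    alert is assumed to carry "severity" and "summary" keys (hypothetical
    shape). routes maps a severity to the channels that should receive it;
    severities without a route go to default_channel.
    """
    channels = routes.get(alert["severity"], [default_channel])
    text = f'[{alert["severity"]}] {alert["summary"]}'
    return [{"channel": channel, "text": text} for channel in channels]


# Hypothetical routing table and alert, for illustration only.
routes = {"critical": ["#db-oncall", "#db-alerts"]}
alert = {"severity": "critical", "summary": "Host is down"}
for payload in build_slack_messages(alert, routes, "#db-alerts"):
    print(payload)
```

Each payload would then be sent to the chat.postMessage endpoint with a bot token, which is the part a single-channel webhook cannot do.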

2 votes

0 comments · Monitoring / Alerts

Document list of Alert conditions that post to the Activity Feed but not Alerts page

It seems like some "non-actionable" alert/event conditions (example: Host has Restarted) post to the Activity Feed but not the Alerts Page.
Notifications are sent.
I cannot find a list of these that post only to the Activity Feed so it would be nice to have them documented.

2 votes

0 comments · Monitoring / Alerts

Avoid generating an alert with an error message when one oplog node in a replica set is rebooted

Currently, if one of the nodes in the appdb/oplogdb goes down for any reason (for example, a Linux patch rebooting the node), Ops Manager generates this alert:

"Ops Manager was unable to connect to this database and run the ping command. The database could be down, unreachable, or running with authentication and Ops Manager does not have adequate permissions."

There are still 2 other nodes running in the replica set, so this alert is misleading and generates false alarms.

2 votes

0 comments · Monitoring / Alerts

Option to clear deleted alerts

Deleted alert definitions pile up in the "deleted alerts" tab of Ops Manager.

This information may be useful for auditing purposes, but in the long run the number of deleted alerts may grow too large, especially in our use case, where alert configurations are deployed through a script that deletes and recreates all alerts.

Feature suggestion: add an action to clear all deleted alerts (or better: clear all deleted alerts older than N days).

2 votes

0 comments · Monitoring / Alerts
