Ops Tools
-
Show time it took to complete a snapshot
To better troubleshoot snapshot performance issues, or to track how snapshots are performing and whether one is approaching a threshold where it won't be able to keep up, it's a good idea to expose how long each snapshot took on the View All Snapshots page. This would also help quickly identify which snapshots are full versus incremental.
41 votes -
OpsManager Application DataBase Backup and Restore
Ops Manager needs a way to back up and restore the AppDB without Ops Manager downtime, in order to recover from infrastructure failures or fatal errors.
39 votes -
Support migrations to different snapshot stores
Currently it is not possible to transition between snapshot store types.
There are two options currently when transitioning to the new one.
- Terminate backups (deleting all previous snapshots)
- Create a new project and abandon the previous one to allow automated restores at a later time
Both of these options are difficult to manage for large deployments. The first option requires you to store the snapshots elsewhere and disallows automated restores. The second option requires many operations and clutters the Ops Manager project list.
Ideally we should be able to transition from any store location/type to any other location/type. One of…
31 votes -
Configure Ops Manager LDAP Auth via an API call
Currently, there is no way to enable LDAP Auth for the Ops Manager Users via an API call.
This essentially means that one cannot use LDAP and CI/CD simultaneously with Ops Manager.
MongoDB Enterprise support has confirmed that, in the event of disaster recovery or the deployment of a new cluster, LDAP must be enabled through manual steps during a CI/CD deployment.
An enterprise solution should not require signing in and manually changing anything in a web GUI. It is simply not scalable.
22 votes -
Ops Manager: Test Failover
The ability to Test Failover was added to Atlas (https://docs.atlas.mongodb.com/tutorial/test-failover).
Please add this functionality to Ops Manager in order to facilitate failover testing. This is especially useful in multi-tenant Ops Manager setups.
19 votes -
Add the ability for the backup daemon to download and validate snapshots
In order to test snapshots automatically, create a new job type that allows the Backup Daemon to download the snapshot from the snapshot store, then run validate on each collection.
If any collection fails validation, send an alert to the Backup Admin with the list of corrupted data.
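The validation step requested above could be sketched roughly as follows. This is only an illustration, not how the Backup Daemon works: it assumes the snapshot has already been downloaded and restored into a running mongod, and that `db` is a PyMongo-style database handle exposing `list_collection_names()` and `command()`. The function name `find_invalid_collections` is hypothetical.

```python
from typing import Any, List


def find_invalid_collections(db: Any) -> List[str]:
    """Run the `validate` command on every collection and return the failures.

    `db` is assumed to behave like a pymongo Database: it must provide
    list_collection_names() and command(name, value).
    """
    invalid = []
    for name in db.list_collection_names():
        # The server's `validate` command reports a boolean `valid` field.
        result = db.command("validate", name)
        if not result.get("valid", False):
            invalid.append(name)
    return invalid
```

A non-empty return value is the list of collections that would go into the alert sent to the Backup Admin.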
18 votes -
Ops Manager API call for Host Mappings
This feature request is for a new Group/Project API call for managing the host mappings that Ops Manager creates from monitoring information. Whether it is an endpoint for "Reset Duplicates" or full-featured create/delete of individual host mappings, either would alleviate issues where overlapping mappings affect monitoring in Ops Manager.
Note: The issue of overlapping mappings may occur when a mongod process has moved/changed IP addresses multiple times. With enough cycling (as seen in Kubernetes clusters with frequent pod restarts), eventually a previously mapped IP address may now be associated with a different mongod process.
17 votes -
SNMP traps for `AUTOMATION_AGENT_DOWN`, `MONITORING_AGENT_DOWN`, `BACKUP_AGENT_DOWN` alert types do not contain hostname information
What is the problem that needs to be solved? SNMP traps for the `AUTOMATION_AGENT_DOWN`, `MONITORING_AGENT_DOWN`, and `BACKUP_AGENT_DOWN` alert types do not contain hostname information in the `.1.3.6.1.4.1.41138.1.1.1.4` (`.iso.org.dod.internet.private.enterprises.mms.server.serverMIBObjects.mmsAlertObject.mmsAlertHostAndPort`) OID.
Why is it a problem? (the pain) The user cannot act quickly on the alert and identify the host where Ops Manager's Automation/Monitoring/Backup Agent is in the `DOWN` state. Because the `<HOSTNAME>:<PORT>` information is missing at the `.1.3.6.1.4.1.41138.1.1.1.4` SNMP OID, the user cannot map `AUTOMATION_AGENT_DOWN`, `MONITORING_AGENT_DOWN`, and `BACKUP_AGENT_DOWN` alerts to a particular hostname.
17 votes -
Deploy Changes without restarting mongod/mongos instance immediately.
Whenever we want to make changes, e.g. set a new parameter or add a new parameter to the configuration (advanced configuration options), automation starts applying the change and performs a rolling restart immediately after we save, review, and deploy.
We need flexibility in when the restart happens: there should be an option either to perform the rolling restart immediately or to defer it to a later time. We may apply multiple changes at different times and would prefer to set one window in which to restart the instance instead of doing multiple restarts.
16 votes -
Do not download EOL releases
Currently our Ops Manager instances keep re-downloading EOL releases of MongoDB, e.g. mongodb-linux-x86_64-2.6.12, mongodb-linux-x86_64-3.0.15, etc. Ops Manager should automatically exclude EOL releases, since they take up disk space unnecessarily.
15 votes -
Apply Project Maintenance Windows to Global Alerts
Make the existing Project-level maintenance windows also silence Global alerts associated with the Project.
We have many projects that share the Global Alerts and do not want to turn off all global alerts as we still need that alerting for the projects that are not being worked on.
In combination with the existing API for Project Maintenance Windows, this really allows full control.
14 votes -
Ops Manager API - System Alerts
The Ops Manager API documentation specifies endpoints for polling group alerts and global alerts but is missing system alerts. We should be able to poll for system alerts via the Ops Manager API as well.
12 votes -
Add Timezone support to Ops Manager Application logs
All of our hosts are in the same time zone. We would like to be able to set the timestamps in Ops Manager-related logs to our local time zone.
11 votes -
Add Ops Manager's Org ID/Name into all SNMP Alert Traps
What is the problem that needs to be solved? Ops Manager's Org ID/Name is not included in any of the SNMP Alert Traps sent from Ops Manager's Application Server.
Why is it a problem? (the pain) Operators who watch the Monitoring System (the one that receives SNMP Alert Traps from Ops Manager) need to see Ops Manager's Organization ID/Name in order to quickly understand what an Ops Manager Alert relates to. The Monitoring System needs to do additional work for each SNMP Alert Trap received (via `GET /groups/{PROJECT-ID}` / `GET /orgs/{ORG-ID}`…
11 votes -
Make snapshot retention policy more customisable
Make the retention policy of Ops Manager snapshots customisable so we can choose custom values (like 21) to be more flexible with the settings.
11 votes -
Allow multiple authentication sources simultaneously
Currently Ops Manager authentication supports either the Application Database, LDAP, or SAML, but these methods cannot be combined. Ideally we would like to move to LDAP, but we are stuck with the local authentication method as we depend on a local admin user which is used when first deploying and configuring the Ops Manager ecosystem. We also do not want to depend solely on the availability of the LDAP servers regarding an admin user. The MongoDB cluster deployments do support multiple authentication methods at the same time (we have local admin and monitoring accounts while users are authenticating via LDAP),…
10 votes -
Allow scheduling grooms
Add the ability to schedule groom jobs at a specific point in time. Also expose this functionality through the API for easy modifications through configuration management tools.
10 votes -
Alerts: Webhook integration authentication with basic auth
In secure environments, webhook endpoints must be secured with basic authentication at a minimum. Currently, webhook alerts only provide an HMAC-SHA-1 signature.
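For context, checking an HMAC-SHA-1 signature on the receiving side might look roughly like the sketch below. It uses only the Python standard library. The base64 encoding of the signature and the way the signature reaches the receiver are assumptions here, as are the function names; consult the Ops Manager webhook documentation for the actual details.

```python
import base64
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Return the base64-encoded HMAC-SHA-1 of the request body."""
    digest = hmac.new(secret, body, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Compare the received signature to the expected one in constant time."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected, received)
```

A signature of this kind proves the payload was produced by a holder of the shared secret, but it does not satisfy endpoints that reject any request lacking basic authentication credentials, which is the gap this request describes.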
9 votes -
Disable Query Targeting: Scanned Objects / Returned alerts on specific, recurring aggregations.
We run routine, recurring aggregation pipelines (essentially, summing up the values of different categories of transactions) on a 5-minute interval. These aggregation pipelines scan for all objects that match a certain type, then sum the cumulative value of certain values of those objects based on category. This means that we regularly have queries that scan >500,000 objects and consolidate them down to ~12 or so objects that are returned.
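To put numbers on the ratio described above, here is a small arithmetic sketch using the figures from this post (roughly 500,000 scanned, roughly 12 returned). The function name and the `ALERT_THRESHOLD` value are hypothetical and only illustrate how far such a query sits above any plausible threshold.

```python
def scan_return_ratio(scanned: int, returned: int) -> float:
    """Scanned objects divided by returned objects for one query."""
    if returned == 0:
        return float("inf")  # nothing returned: treat the ratio as unbounded
    return scanned / returned


ALERT_THRESHOLD = 1000  # hypothetical threshold, for illustration only

ratio = scan_return_ratio(500_000, 12)
print(round(ratio), ratio > ALERT_THRESHOLD)  # prints: 41667 True
```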
In this specific case, I'm alright with the scan/return ratio being very high, and I don't want to be spammed with alerts every five minutes. However, I don't want to disable…
9 votes -
Ops Manager Prometheus metrics
MongoDB Ops Manager would need to expose endpoints for Prometheus for MongoDB Clusters. There are a number of metrics that popular MongoDB Exporters do not provide, for example:
1. Enterprise Backup status
2. Replication Alarms
3. Agent statuses
4. A number of Cluster statuses
etc.
There is no way to get this data into Prometheus at the moment.
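As an illustration of what such an endpoint would need to serve, the sketch below renders a gauge in the Prometheus text exposition format. It is not an Ops Manager feature: the function name, the metric name `opsmanager_agent_up`, and the label names are all hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]


def render_gauge(name: str, help_text: str, samples: List[Sample]) -> str:
    """Render one gauge metric in the Prometheus text exposition format.

    Label values are written as-is; escaping of quotes and backslashes
    is omitted to keep the sketch short.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric: 1.0 means the agent is up, 0.0 means it is down.
print(render_gauge(
    "opsmanager_agent_up",
    "Whether an Ops Manager agent is reporting.",
    [({"agent": "backup", "host": "db1"}, 1.0)],
))
```

A Prometheus server could scrape text like this from an HTTP endpoint, which would cover items such as backup status and agent status from the list above.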
9 votes