"Chaos testing" for Atlas - simulate node(s) down

The current "Test Failover" feature supports testing application/driver resiliency in case of elections. For additional testing, we want to be able to shut down and restart one or more nodes in a cluster. It should be possible to choose whether the entire node or only the mongod or mongos process is shut down and restarted.

33 votes

Spencer shared this idea · Apr 9, 2020

completed · May 30, 2023

Zuhair (Admin, MongoDB) commented · May 30, 2023 8:47 PM

Hi Spencer, we have support now for Simulate Regional Outage in MongoDB Atlas! See https://www.mongodb.com/docs/atlas/tutorial/test-resilience/simulate-regional-outage/ for more details.

Andrew Davidson (Admin; VP, Cloud Products, MongoDB) commented · May 6, 2022 9:51 AM

We're starting to scope out the ability to test region level outages: expect an update later this year!

Rob Powell commented · September 29, 2021 3:47 AM

We have had customer requests for this during POCs, to test these scenarios.

Geoffrey commented · July 21, 2021 3:21 PM

It would be good to be able to test a DR scenario such as losing a cloud region.

Charlie commented · October 19, 2020 9:07 PM

This may also be useful for 'stuck' secondaries that need rebooting or to trigger the bouncing of a mongod, which can only be done by a Cloud TS at this time.

Christian commented · October 2, 2020 6:45 PM

Along this line of thinking, it would be nice to be able to trigger the standard processes used when deploying maintenance to Atlas clusters, so we can test that applications are resilient to more than just the election that occurs during Atlas maintenance.