Release Cadence to allow pegging Minor Version Upgrades prior to GA of next Major Release
Issue:
The current requirement is to enable 'auto upgrades' in order to test and take advantage of minor version updates (example: 6.x.x >> 6.y.x). As a larger company with a 2-4 week release cycle for our products, we have seen a need to start testing minor version upgrades to the current major GA release (recent example below).
Summary of a scenario where this could be a problem (see the detailed real-world story further down):
- Customer sees a new feature they wish to try.
- Customer is pegged to the last major version (currently 6.0.x), because pegging the version in production is only possible at a major release.
- Customer cannot allow 'auto upgrades' in production due to internal Release Management requirements, and must wait out the full yearly cycle for features added in later minor versions of the current major release.
The Ask: We respectfully request the ability to peg any of our clusters to a specific minor version (i.e. 6.1.x, or 6.2.x...) and to allow auto upgrades only for the patch/bug-fix releases rolled out by Atlas. Thank you for your time and response to this critical improvement in scheduling and control over minor version upgrades.
More detailed summary of the need for control over minor release update cycles/scheduling:
As we have our own release management requirements, any change to a Production environment must go through UAT and performance testing and be approved before release. Because moving up to an available minor release currently requires enabling 'auto upgrades', we are hamstrung until late in the major release cycle before we can begin testing and releasing a minor version in lower environments and then implement it in Production. This is due to our internal Release Management requirement to peg the MongoDB version and driver in Production to a fully tested release, and to have lower environments testing and validating on the same IaaS release version as Production. Pegging a version is only allowed at the major version going to GA. We ask that minor versions be left to live for 12-15 months to allow for pegging a minor version upgrade during the year. This would allow our teams to implement a minor version upgrade early and peg to that version pending the next annual GA release.
Below I highlight the real-world example that caught our team in a catch-22: v5.0 did not meet the requirements of our application in a sharded configuration, which required us to enable 'auto upgrades' to get to 5.1.x or later.
Resulting Request:
We would like the ability to peg our clusters to specific minor releases. Example: peg to 5.3.x, where patch releases are acceptable but minor version updates to 5.4.x are not allowed until we have tested and approved them within our own CI/CD pipeline and process.
Allowing 'auto upgrades' in our clusters based on Atlas rollout schedules poses unwarranted risk to the production assets of the end users.
We also request the ability to be alerted to available minor updates, with the option to choose when to begin our own testing and release and to schedule the upgrade ourselves.
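To make the requested behavior concrete, here is a minimal sketch of the pinning rule we have in mind, written in Python. It is not an Atlas API: the function names and the pin format ('5.3' meaning 5.3.x) are our own illustration. It only shows the decision we would like applied before any automatic rollout.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.patch' string such as '5.3.2'."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def upgrade_allowed(pinned_minor: str, current: str, candidate: str) -> bool:
    """Return True if candidate is in the pinned minor series and newer than current.

    pinned_minor is a hypothetical pin such as '5.3', meaning '5.3.x'.
    Anything outside that series (5.4.0, 6.0.0, ...) is rejected.
    """
    pin_major, pin_minor = (int(part) for part in pinned_minor.split("."))
    cur = parse_version(current)
    new = parse_version(candidate)
    in_pinned_series = (new.major, new.minor) == (pin_major, pin_minor)
    # NamedTuple comparison is element-wise, so this orders by major, minor, patch.
    return in_pinned_series and new > cur


if __name__ == "__main__":
    print(upgrade_allowed("5.3", "5.3.1", "5.3.2"))  # patch release within the pin
    print(upgrade_allowed("5.3", "5.3.2", "5.4.0"))  # next minor, needs our approval first
```

With a rule like this, patch releases would keep flowing automatically, while a move to the next minor series would wait until we change the pin ourselves.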
Recent situation/scenario:
With the requirement for 'auto upgrades', we were in a catch-22 between 5.0.x and 5.3.x, where we needed graph lookup with a sharded configuration in production.
- We are rapidly reaching critical resource issues in Production with a Primary/Replica configuration.
- We must proceed with the upgrade to a sharded configuration to 1) allow for horizontal scaling to relieve pressure on primary nodes for Read Preference Primary, used for mission-critical operations that require high idempotency and cannot allow for eventual consistency within the SLAs, and 2) relieve pressure on oplogs due to high throughput and rapid data changes during normal operations. Our operations currently require the oplog on a primary/replica set to be 1.2TB to prevent the oplog window from dropping below 12hrs.
- With horizontal scaling, we can spread the load on read primary across multiple nodes, which will alleviate the need for such a large oplog size, leading to better management and availability of storage.
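For a rough sense of the oplog numbers above, the arithmetic can be sketched as follows. The assumptions are ours and are simplifications: a steady write rate, 1.2TB taken as 1200GB, writes distributing evenly across shards, and a shard count of 3 picked purely for illustration.

```python
def implied_write_rate_gb_per_hour(oplog_size_gb: float, window_hours: float) -> float:
    """Oplog churn implied by an oplog of a given size covering a given window."""
    return oplog_size_gb / window_hours


def per_shard_oplog_gb(total_oplog_gb: float, shard_count: int) -> float:
    """Oplog size each shard needs for the same window, assuming an even write split."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return total_oplog_gb / shard_count


if __name__ == "__main__":
    oplog_gb = 1200.0    # 1.2TB oplog on the current replica set
    window_hours = 12.0  # minimum oplog window we need to hold

    print(implied_write_rate_gb_per_hour(oplog_gb, window_hours))  # 100.0 GB per hour
    print(per_shard_oplog_gb(oplog_gb, 3))  # 400.0 GB per shard, hypothetical 3 shards
```

Real write distribution depends on the shard key, so these figures are only an estimate of how sharding would reduce the oplog each node has to carry.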
We needed the 5.3.2 upgrade in order to change our current primary/replica configuration to a sharded configuration, and we could not proceed with these much-needed upgrades without accepting the release management risks associated with enabling 'auto upgrades'. Although the recent scenario is resolved now that 6.0 is GA, had this happened earlier in the GA release cycle, we would have been either 1) blocked from the above critical upgrades, or 2) forced to accept the risk of an 'auto upgrade' being implemented in Production ahead of our NonProd clusters, which goes against best practices for Release Management.