Aggregate $accumulator 'Combine' stage for sharded collections
After the ‘accumulate’ stage has run against each document on a shard, it would be extremely useful to run a script on that shard to reduce the accumulated state to something smaller before the final state is sent over the network to mongos for the ‘merge’. (In the Elastic camp this is called the ‘combine_script’.)
Scenario:
I have a customer orders database sharded by Customer ID (meaning all data relating to any specific customer is kept on the same shard, so customers don’t need to be ‘merged’ between shards). I can write an accumulator that processes all customer orders on the shard, and I then wish to perform further processing on the shard to reduce the state (containing data spanning various customer orders) down to a minimal state (mostly aggregated customer stats) that mongos then aggregates. I wish to avoid copying multiple large state objects (one from each shard) across the network to the ‘merge’ script on mongos.
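To make the request concrete, here is a minimal sketch of the intended flow, written as a plain Python simulation rather than as real $accumulator code. The `combine` step is the hypothetical stage being requested; it is not an existing $accumulator option. The field names `customer_id` and `amount`, the shape of the state, and the helper `run` are all assumptions made up for illustration.

```python
from collections import defaultdict


def init():
    # Per-shard state: every order amount seen so far, grouped by customer.
    return {"orders": defaultdict(list)}


def accumulate(state, doc):
    # Runs once per document on the shard; the state grows with every order.
    state["orders"][doc["customer_id"]].append(doc["amount"])
    return state


def combine(state):
    # HYPOTHETICAL stage: runs once per shard, after accumulate has seen
    # every document, and shrinks the state to per-customer stats before
    # anything is sent over the network.
    return {
        "stats": {
            customer: {
                "count": len(amounts),
                "total": sum(amounts),
                "max": max(amounts),
            }
            for customer, amounts in state["orders"].items()
        }
    }


def merge(left, right):
    # Runs on mongos. Customers never span shards in this scenario,
    # so merging is a plain union of the already-reduced stats.
    merged = dict(left["stats"])
    merged.update(right["stats"])
    return {"stats": merged}


def run(shards):
    # Simulates the whole pipeline: accumulate and combine on each shard,
    # then merge the small combined states on mongos.
    reduced = []
    for docs in shards:
        state = init()
        for doc in docs:
            state = accumulate(state, doc)
        reduced.append(combine(state))
    result = {"stats": {}}
    for shard_state in reduced:
        result = merge(result, shard_state)
    return result
```

In this sketch only the output of `combine` would cross the network, so the size of what each shard sends depends on the number of customers on that shard rather than on the number of orders.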