Unique Indexes and Bulk Upserts for Time Series Collections
We would like to bulk insert data into time series collections and identify which records were newly inserted, with no possibility of duplicates being inserted.
For regular collections this is achievable by adding a unique index and performing a bulk upsert (as any duplicates will be rejected due to the unique index).
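For reference, the regular-collection approach might look like the following minimal sketch using PyMongo. The helper names `build_upsert_specs` and `bulk_upsert_new`, the `collection` handle, and the `key_fields` parameter are illustrative assumptions, and the key fields are assumed to be top-level fields.

```python
from typing import Any, Dict, List, Sequence, Tuple

Doc = Dict[str, Any]


def build_upsert_specs(docs: Sequence[Doc], key_fields: Sequence[str]) -> List[Tuple[Doc, Doc]]:
    """Return one (filter, update) pair per document.

    The filter matches on the fields covered by the unique index; the update
    uses $setOnInsert so that an existing document is left untouched.
    """
    specs = []
    for doc in docs:
        key_filter = {field: doc[field] for field in key_fields}
        specs.append((key_filter, {"$setOnInsert": doc}))
    return specs


def bulk_upsert_new(collection, docs: Sequence[Doc], key_fields: Sequence[str]) -> List[Doc]:
    """Bulk upsert into a REGULAR collection and return the newly inserted docs."""
    # Imported here so the spec builder above can be used without PyMongo installed.
    from pymongo import ASCENDING, UpdateOne

    # Unique index on the key fields (assumed not to conflict with an existing index).
    collection.create_index([(f, ASCENDING) for f in key_fields], unique=True)

    ops = [UpdateOne(flt, upd, upsert=True) for flt, upd in build_upsert_specs(docs, key_fields)]
    if not ops:
        return []
    result = collection.bulk_write(ops, ordered=False)
    # upserted_ids maps operation index -> _id for the operations that inserted.
    return [docs[i] for i in sorted(result.upserted_ids)]
```

With this shape, the documents returned by `bulk_upsert_new` are exactly the ones that did not already exist, which is the behaviour we would like to have for time series collections.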
For time series collections, however, unique indexes are not currently supported.
In addition, performing an upsert with the $setOnInsert operator, which should apply only when the upsert results in an insert, is also not currently supported for time series collections.
At the moment the only options appear to be:
(1) to process records one at a time, checking for a matching record by performing a find operation and inserting the record if no match is found.
(2) to allow duplicates to be inserted and to create an aggregation pipeline with a $group stage that filters out the duplicates before users access the data.
(3) to use regular collections so that unique indexes and upserts are supported.
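Options (1) and (2) could be sketched roughly as follows. This is only an illustration under assumptions: `collection` is any object exposing PyMongo-style `find_one` and `insert_one` methods, `key_fields` names the top-level fields that together identify a record, and the helper names `insert_if_absent` and `dedup_pipeline` are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Sequence

Doc = Dict[str, Any]


def insert_if_absent(collection, docs: Iterable[Doc], key_fields: Sequence[str]) -> List[Doc]:
    """Option (1): check each record with a find and insert only when no match exists."""
    inserted = []
    for doc in docs:
        key_filter = {field: doc[field] for field in key_fields}
        # One find and (possibly) one insert per record.
        if collection.find_one(key_filter) is None:
            collection.insert_one(doc)
            inserted.append(doc)
    return inserted


def dedup_pipeline(key_fields: Sequence[str]) -> List[Doc]:
    """Option (2): a read-time pipeline that keeps one document per key."""
    return [
        # Group duplicates by the key fields; without a preceding $sort,
        # $first keeps an arbitrary one of the duplicates.
        {"$group": {
            "_id": {field: f"${field}" for field in key_fields},
            "doc": {"$first": "$$ROOT"},
        }},
        # Restore the original document shape.
        {"$replaceRoot": {"newRoot": "$doc"}},
    ]
```

The pipeline would be passed to `collection.aggregate(dedup_pipeline(["sensor", "ts"]))`, where the field names are placeholders. Note that the find-then-insert check in option (1) is not atomic, so two concurrent writers could still both insert the same record.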
None of the options available at the moment are ideal.
- Option (1) isn't particularly efficient, as every record requires its own find operation before it can be inserted.
- Option (2) doesn't allow duplicates to be identified as part of the ingestion.
- Option (3) doesn't allow us to take advantage of the benefits and optimizations that come with time series collections, such as optimized internal storage and improved query efficiency.
Our preference would be for MongoDB to support unique indexes and bulk upserts for time series collections.