Online archive delete possibility
To free up space in the Online Archive (OA) and to comply with GDPR, it would be good to be able to delete data in an OA not only by "age" but also by a custom query, similar to the option that selects which data is transferred from a collection to the OA. (From the docs: "A custom query. Atlas runs the query specified in the archiving rule to select the documents to archive." - https://www.mongodb.com/docs/atlas/online-archive/manage-online-archive/)
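To make the request concrete, here is a rough Python sketch that builds a custom-query archiving rule and, next to it, what a matching deletion rule could look like. The `criteria`, `type`, and `query` fields reflect my understanding of the Atlas Admin API for Online Archive and may not match it exactly. The `deleteCriteria` key, the `finance.transactions` namespace, and the `customerId` field are made up for illustration; no such delete option exists today.

```python
import json


def custom_rule(db_name, coll_name, query):
    """Build an archiving rule body that selects documents with a custom query."""
    return {
        "dbName": db_name,
        "collName": coll_name,
        # Assumption: the query is sent as a JSON string, not a nested object.
        "criteria": {"type": "CUSTOM", "query": json.dumps(query)},
    }


# Archive everything that belongs to one customer (hypothetical field name).
archive_rule = custom_rule("finance", "transactions", {"customerId": "c-123"})

# HYPOTHETICAL: the requested feature. It reuses the same criteria shape
# to say which already-archived documents should be deleted.
delete_rule = {"deleteCriteria": archive_rule["criteria"]}

print(json.dumps(archive_rule, indent=2))
print(json.dumps(delete_rule, indent=2))
```

The point is only that the selection logic for deleting could mirror the selection logic for archiving, so a GDPR erasure request for one customer would be a single query.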
I want to add a use case other than GDPR that we are facing right now:
We are running ETL pipelines for financial transactions in which we transform and aggregate data. Once in a while, we notice a bug in the first stage of the pipeline and we need to re-run the aggregation stage of our pipeline. However, the data source, i.e. the raw/detailed data, is archived after 3 months.
To fix this, we want to do a full re-import of the data source for a particular timeframe into the cluster. That data would then be archived again after 3 months and cause duplicates inside the Online Archive. It would be nice to have a built-in solution for such a pipeline scenario.
One solution would be to tag the data somehow to differentiate the duplicate items inside the Online Archive. Another solution would be to allow delete queries for specific Online Archive paths in S3.
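As a minimal sketch of the tagging idea, assuming we control the import code: every (re-)import stamps its documents with a run number, and readers of the archive keep only the newest copy per transaction. The field names `txId` and `importRunId` are hypothetical, and this is plain Python over dictionaries rather than anything Online Archive provides.

```python
def tag_import(docs, run_id):
    """Return copies of the documents stamped with the import run number."""
    return [{**doc, "importRunId": run_id} for doc in docs]


def latest_per_key(docs, key="txId", run_field="importRunId"):
    """For each key, keep the document from the highest import run."""
    best = {}
    for doc in docs:
        k = doc[key]
        # Untagged documents count as run 0, so any tagged copy wins.
        if k not in best or doc.get(run_field, 0) > best[k].get(run_field, 0):
            best[k] = doc
    return list(best.values())


# First import, later archived after 3 months.
first = tag_import([{"txId": 1, "amount": 10}, {"txId": 2, "amount": 20}], run_id=1)
# Re-import of the corrected transaction, archived again later.
second = tag_import([{"txId": 1, "amount": 11}], run_id=2)

archived = first + second  # what ends up in the archive: txId 1 appears twice
deduplicated = latest_per_key(archived)
print(deduplicated)
```

This only hides the duplicates at read time. The stale copies still take up space in the archive, which is why a delete query would be the cleaner fix.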