Database
7 results found
-
Unique Indexes and Bulk Upserts for Time Series Collections
We would like to insert data in bulk into time series collections and identify the new data that has been inserted without the possibility of duplicates being inserted.
For regular collections this is achievable by adding a unique index and performing a bulk upsert (as any duplicates will be rejected due to the unique index).
For time series collections, however, unique indexes are not currently supported.
In addition, performing an upsert with the $setOnInsert option, which should only act on insert operations, is also not currently supported for time series collections.
At the moment the only options appear to be:
(1) to…
11 votes -
Nanosecond timestamp support
I've put this under the time series category since that's where it's most applicable, but it's really a data model / BSON issue.
The topic of higher resolution timestamps has surfaced from time to time for at least a decade (https://jira.mongodb.org/browse/SERVER-1460), and usually prompts a response like "just use integers". With the addition of time series collections, however, where the concept of time is integral to database functionality, I think it's time to reconsider adding a type with at least nanosecond precision timestamp support. Date's millisecond resolution is woefully inadequate for a number of relevant use cases,…
9 votes -
Introduce a new field BucketLifeSpan Optional along with Granularity
An enhancement to MongoDB's management of time series collections could involve the introduction of a BucketLifeSpan attribute, in addition to the existing Granularity setting. This new, optional attribute would automate the duration a bucket can remain open, with the condition that Granularity should be less than or equal to BucketLifeSpan.
Consider a use case involving a time series collection for tracking data from 70,000 socket devices daily, with DeviceId as the metafield. Assume data is organized into daily collections and granularity is set to minutes, so that buckets fill optimally unless they reach their size limit.
For a collection named…
3 votes -
Allow to decrease time series granularity and custom bucketing values
In our IoT use case, we are leveraging MongoDB’s time series functionality. Due to high write volume, we need to adjust the timeseries.granularity and bucketMaxSpanSeconds parameters to manage the write load. However, after increasing the bucketMaxSpanSeconds, we need to run the system for several days to observe stability. If the value is set too high, MongoDB does not support decreasing the bucketing value, and we are forced to create a new collection instead.
It would greatly simplify our testing process and increase flexibility in adapting to business changes if MongoDB allowed the decrease of bucketMaxSpanSeconds after it has been increased.
2 votes -
Change Timeseries Bucket Memory Limit
Currently, there's no way to set a memory threshold for bucket allocation in MongoDB's time series collections. As the bucket size increases and more collections are opened day after day, a limiting mechanism triggers for open buckets, leading to cache pressure and the premature closing of buckets under high load. It would be beneficial for users to have the ability to set a memory threshold for the time series bucket memory limit (I guess the limit is around 3GB). This would enable us to prevent early bucket closure in production environments. Alternatively, providing the option to manually close buckets could help manage…
2 votes -
Add pipeline stage for "downsampling" data
Downsampling is an extremely common operation when plotting time series data, used when there is too much data to produce a good-looking, meaningful graph. Such a stage would pick "important" data points based on an algorithm such as "Largest-Triangle-Three-Buckets" (https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf) instead of returning the entire data set.
Not only would this make for prettier graphs, but it would also reduce the overall payload returned, thus reducing network-related latency.
This would be an awesome addition to timeseries!
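For anyone unfamiliar with the algorithm named above, here is a minimal client-side sketch of Largest-Triangle-Three-Buckets in Python. It is not a MongoDB feature or API; the function name lttb and the threshold parameter are made up for illustration, and the input is assumed to be (x, y) pairs already sorted by x.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def lttb(points: Sequence[Point], threshold: int) -> List[Point]:
    """Downsample points (sorted by x) to `threshold` points using LTTB."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)  # nothing to do, or too few output slots

    def edge(k: int) -> int:
        # Start index of bucket k; integer math avoids float rounding.
        return k * (n - 2) // (threshold - 2) + 1

    sampled = [points[0]]  # the first point is always kept
    prev = 0               # index of the most recently selected point

    for i in range(threshold - 2):
        # Average of the next bucket serves as the third triangle corner.
        nxt = points[edge(i + 1):min(edge(i + 2), n)]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        ax, ay = points[prev]
        best_idx, best_area = edge(i), -1.0
        for j in range(edge(i), edge(i + 1)):
            px, py = points[j]
            # Triangle area (times two) spanned by prev, candidate, average.
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay))
            if area > best_area:
                best_idx, best_area = j, area

        sampled.append(points[best_idx])
        prev = best_idx

    sampled.append(points[-1])  # the last point is always kept
    return sampled


# Ten flat points with one spike at x = 5, reduced to four points.
data = [(float(x), 10.0 if x == 5 else 0.0) for x in range(10)]
reduced = lttb(data, 4)
```

In this small example the spike survives the reduction, which is the property that makes the algorithm attractive for charts compared with taking every Nth point.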
2 votes -
Tiered TTL for time series collection based on granularity
Currently time series collections have a single TTL across all inherent granularities. It would be great to specify a TTL for each granularity. For example:
For seconds: 1 week
For hours: 1 month
Others: never
Coarse information should be held longer than finer information in some cases; currently it all falls under the single TTL specified.
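To make the example above concrete, the requested behaviour could be pictured roughly as follows. This is only a sketch of the idea in plain Python, not an existing MongoDB option; the RETENTION table and is_expired helper are hypothetical, and "1 month" is assumed to mean 30 days.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

# Hypothetical per-granularity retention, mirroring the example above.
# Granularities not listed here are treated as "never expire".
RETENTION: Dict[str, Optional[timedelta]] = {
    "seconds": timedelta(weeks=1),
    "hours": timedelta(days=30),  # assumption: "1 month" = 30 days
}


def is_expired(granularity: str, timestamp: datetime, now: datetime) -> bool:
    """Return True if data at this granularity has outlived its TTL."""
    ttl = RETENTION.get(granularity)  # None means keep forever
    return ttl is not None and now - timestamp > ttl


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
ten_days_old = now - timedelta(days=10)
```

With this table, ten-day-old data would be dropped at seconds granularity but kept at hours granularity and at any other granularity.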
2 votes