Database
Unique Indexes and Bulk Upserts for Time Series Collections
We would like to insert data in bulk into time series collections and identify which documents are newly inserted, without any possibility of duplicates being created.
For regular collections this is achievable by adding a unique index and performing a bulk upsert, as any duplicates are rejected by the unique index (see the sketch after this entry).
For time series collections, however, unique indexes are not currently supported.
In addition, performing an upsert with the $setOnInsert option, which should only take effect on insert, is also not currently supported for time series collections.
At the moment the only options appear to be:
(1) to…
10 votes
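A minimal sketch of the regular-collection approach described above, using PyMongo; the collection name and key fields (sensorId, ts) are made up for illustration. For a time series collection, the unique index and the upsert are exactly the parts that are not currently supported:

```python
from datetime import datetime

from pymongo import ASCENDING, MongoClient, UpdateOne

client = MongoClient()  # assumes a local mongod; adjust the URI as needed
coll = client.demo.readings  # hypothetical regular (non-time-series) collection

# Unique compound index on the fields that identify a reading, so the same
# (sensorId, ts) pair can never be inserted twice.
coll.create_index([("sensorId", ASCENDING), ("ts", ASCENDING)], unique=True)

batch = [
    {"sensorId": 1, "ts": datetime(2024, 1, 1, 0, 0), "value": 3.2},
    {"sensorId": 1, "ts": datetime(2024, 1, 1, 0, 1), "value": 3.4},
]

# Bulk upsert: $setOnInsert only writes fields when a new document is
# created, so re-running the same batch never modifies existing documents.
result = coll.bulk_write(
    [
        UpdateOne(
            {"sensorId": d["sensorId"], "ts": d["ts"]},
            {"$setOnInsert": d},
            upsert=True,
        )
        for d in batch
    ],
    ordered=False,
)
print(result.upserted_count)  # how many documents were genuinely new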
Nanosecond timestamp support
I've put this under the time series category since that's where it's most applicable, but it's really a data model / BSON issue.
The topic of higher-resolution timestamps has been raised from time to time for at least a decade (https://jira.mongodb.org/browse/SERVER-1460), and usually prompts a response like "just use integers" (that workaround is sketched after this entry). With the addition of time series collections, however, where the concept of time is integral to database functionality, I think it's time to reconsider adding a type with at least nanosecond-precision timestamp support. Date's millisecond resolution is woefully inadequate for a number of relevant use cases,…
9 votes
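For reference, the "just use integers" workaround might look like the following PyMongo sketch (the collection and field names are invented for illustration): a millisecond-resolution BSON Date is kept for use as the time series timeField, while the full nanosecond value rides along as a 64-bit integer.

```python
import time
from datetime import datetime, timezone

from bson.int64 import Int64
from pymongo import MongoClient

client = MongoClient()  # assumes a local mongod; adjust the URI as needed
coll = client.demo.hires_events  # hypothetical collection name

ns = time.time_ns()  # nanoseconds since the Unix epoch, as a Python int

coll.insert_one({
    # BSON Date (millisecond resolution) stays usable as a time series
    # timeField and for ordinary range queries...
    "ts": datetime.fromtimestamp(ns / 1e9, tz=timezone.utc),
    # ...while the full nanosecond value is carried separately as an int64.
    "ts_ns": Int64(ns),
    "value": 3.2,
})
```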
Introduce a new optional field BucketLifeSpan along with Granularity
An enhancement to MongoDB's management of time series collections could involve the introduction of a BucketLifeSpan attribute, in addition to the existing Granularity setting. This new, optional attribute would cap how long a bucket can remain open, with the condition that Granularity should be less than or equal to BucketLifeSpan (see the sketch after this entry).
Consider a use case involving a time series collection tracking data from 70,000 socket devices daily, with DeviceId as the metafield. Assume data is organized into daily collections and granularity is set to minutes so that buckets fill optimally unless they reach their size limit.
For a collection named…
3 votes
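One possible shape for the proposal, as a PyMongo sketch. The bucketLifeSpanSeconds option is hypothetical and therefore shown commented out, since current servers would reject an unknown time series option:

```python
from pymongo import MongoClient

client = MongoClient()  # assumes a local mongod; adjust the URI as needed
db = client.demo

db.create_collection(
    "socket_metrics",  # hypothetical collection for the 70,000-device example
    timeseries={
        "timeField": "ts",
        "metaField": "deviceId",
        "granularity": "minutes",
        # Proposed, hypothetical option from this idea (not supported today):
        # "bucketLifeSpanSeconds": 86400,  # close any bucket after at most one day
    },
)
```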
Allow decreasing time series granularity and custom bucketing values
In our IoT use case, we are leveraging MongoDB’s time series functionality. Due to high write volume, we need to adjust the timeseries.granularity and bucketMaxSpanSeconds parameters to manage the write load. However, after increasing bucketMaxSpanSeconds, we need to run the system for several days to observe stability. If the value turns out to be too high, MongoDB does not support decreasing it, and we are forced to create a new collection instead.
It would greatly simplify our testing process and increase flexibility in adapting to business changes if MongoDB allowed bucketMaxSpanSeconds to be decreased after it has been increased (see the sketch after this entry).
2 votes
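For context, increasing the custom bucketing values is already possible via collMod (assuming MongoDB 6.3 or newer, where the two custom bucketing parameters are supplied together with the same value); the collection name below is made up. What this idea asks for is the reverse direction, which the server currently rejects:

```python
from pymongo import MongoClient

client = MongoClient()  # assumes a local mongod; adjust the URI as needed
db = client.demo

# Raising the bucket span on an existing time series collection works today.
db.command({
    "collMod": "iot_readings",  # hypothetical time series collection
    "timeseries": {
        "bucketMaxSpanSeconds": 86400,
        "bucketRoundingSeconds": 86400,
    },
})

# Re-running the same command with a smaller value fails, so the only way
# back down is to create a new collection and migrate the data.
```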
Change Timeseries Bucket Memory Limit
Currently, there is no way to set a memory threshold for bucket allocation in MongoDB's time series collections. As bucket sizes grow and more collections are opened day after day, a limiting mechanism triggers for open buckets, leading to cache pressure and the premature closing of buckets under high load. It would be beneficial for users to be able to set the time series bucket memory limit themselves (the current limit appears to be around 3 GB), enabling us to prevent early bucket closure in production environments; a hypothetical shape for such a setting is sketched after this entry. Alternatively, providing the option to manually close buckets could help manage…
2 votes
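Purely to illustrate the request, a configurable limit could be exposed as a server parameter; the parameter name below is invented for this sketch and does not exist in any MongoDB release:

```python
from pymongo import MongoClient

client = MongoClient()  # assumes a local mongod; adjust the URI as needed

# Hypothetical parameter illustrating the proposal -- not a real setting.
client.admin.command({
    "setParameter": 1,
    "timeseriesBucketMemoryLimitBytes": 8 * 1024 * 1024 * 1024,  # e.g. 8 GB
})
```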
Add pipeline stage for "downsampling" data
Downsampling is an extremely common operation when plotting time series data on graphs where there is too much data to produce a good-looking, meaningful graph. Such a stage would pick out "important" data points based on an algorithm such as Largest-Triangle-Three-Buckets (https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf) instead of returning the entire data set (see the sketch after this entry).
Not only would this make for prettier graphs, but it would also reduce the overall payload returned, thus reducing network-related latency.
This would be an awesome addition to timeseries!
2 votes
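To make the referenced algorithm concrete, here is a minimal client-side Python sketch of Largest-Triangle-Three-Buckets over plain (x, y) tuples; a server-side pipeline stage would presumably do something equivalent inside the database:

```python
def lttb(points, threshold):
    """Largest-Triangle-Three-Buckets downsampling.

    points: list of (x, y) tuples sorted by x; threshold: number of points to keep.
    """
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)

    sampled = [points[0]]              # always keep the first point
    every = (n - 2) / (threshold - 2)  # size of each bucket
    a = 0                              # index of the previously selected point

    for i in range(threshold - 2):
        # Average point of the *next* bucket, used as the third triangle vertex.
        nxt_start = int((i + 1) * every) + 1
        nxt_end = min(int((i + 2) * every) + 1, n)
        avg_x = sum(p[0] for p in points[nxt_start:nxt_end]) / (nxt_end - nxt_start)
        avg_y = sum(p[1] for p in points[nxt_start:nxt_end]) / (nxt_end - nxt_start)

        # Candidates in the *current* bucket: keep the one forming the largest
        # triangle with the previously kept point and the next bucket's average.
        start, end = int(i * every) + 1, int((i + 1) * every) + 1
        ax, ay = points[a]
        best_area, best_idx = -1.0, start
        for j in range(start, end):
            bx, by = points[j]
            area = abs((bx - ax) * (avg_y - ay) - (avg_x - ax) * (by - ay)) / 2
            if area > best_area:
                best_area, best_idx = area, j

        sampled.append(points[best_idx])
        a = best_idx

    sampled.append(points[-1])         # always keep the last point
    return sampled
```

For example, lttb(points, 500) keeps the first and last points plus 498 "important" points in between, which is usually enough to preserve the visual shape of the series.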
Tiered TTL for time series collection based on granularity
Currently, time series collections have a single TTL across all granularities. It would be great to be able to specify a TTL per granularity, for example:
For seconds: 1 week
For hours: 1 month
Others: never
Coarser information should be retained longer than finer-grained information in some cases; currently it all falls under the single main TTL specified (see the sketch after this entry).
2 votes
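Today only a single collection-wide TTL can be set. A PyMongo sketch of the status quo, with the tiered form this idea proposes shown only as a hypothetical comment (the collection and field names are made up):

```python
from pymongo import MongoClient

client = MongoClient()  # assumes a local mongod; adjust the URI as needed
db = client.demo

# Status quo: one TTL applies to every document in the collection.
db.create_collection(
    "metrics",
    timeseries={"timeField": "ts", "metaField": "source", "granularity": "seconds"},
    expireAfterSeconds=7 * 24 * 3600,  # a single TTL of one week for everything
)

# Hypothetical tiered form proposed by this idea (not supported today), e.g.:
# expireAfterSeconds={"seconds": 604800, "hours": 2592000, "default": None}
```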