Database
324 results found
-
Implement $bucket and $group on indexed values with sub-linear runtime
We noticed that some $bucket and $group aggregations, such as $min, $max, and $count, are unexpectedly slow even when fully covered by an index, (partially) because the DB scans through the entire index rather than using optimizations such as binary search.
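As a rough illustration of the sub-linear lookup requested here, the sketch below uses a sorted in-memory array as a stand-in for an index on status and modify time. The array contents and the helper names `boundary` and `statsForStatus` are hypothetical and say nothing about how the server stores or walks its indexes; the point is only that count, min, and max for one status can be read from two binary searches.

```javascript
// Hypothetical stand-in for an index on { status: 1, modify_time: 1 }:
// entries are [status, modify_time] pairs sorted by status, then by modify_time.
const indexEntries = [
  ["CANCELLED", 5],
  ["DELIVERED", 3],
  ["DELIVERED", 8],
  ["DELIVERED", 13],
  ["PENDING", 1],
  ["PENDING", 21],
];

// Binary search for the first position whose status is >= the target,
// or strictly > the target when `strict` is true.
function boundary(entries, status, strict) {
  let lo = 0;
  let hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    const s = entries[mid][0];
    if (s < status || (strict && s === status)) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Count, min, and max for one status using O(log n) comparisons.
function statsForStatus(entries, status) {
  const first = boundary(entries, status, false);
  const end = boundary(entries, status, true);
  if (first === end) {
    return { count: 0, min: null, max: null };
  }
  return {
    count: end - first,
    min: entries[first][1],   // smallest modify_time sits at the start of the range
    max: entries[end - 1][1], // largest modify_time sits at the end of the range
  };
}

const delivered = statsForStatus(indexEntries, "DELIVERED");
console.log(delivered);
```

Because the entries for one status form a contiguous, sorted run, the first and last entries of that run already hold the minimum and maximum, so no scan of the run is needed.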
An example pipeline that should return instantaneously but scans through the entire index (confirmed on v4.4 and v5):
[
  {
    $match: {
      status: "DELIVERED",
    },
  },
  {
    $group: {
      _id: {
        status: "$status",
      },
      min: {
        $min: "$modify_time",
      },
    },
  },
]
with an index { status: 1, modify_time: 1 }.
Another example is $bucket (same index):
[
{…
6 votes -
Support compound/multiple grouping keys in $bucket
We often need to compute statistical/summarizing aggregations grouped by more than one field where all fields are of a $bucket-able type.
An example would be to count all orders grouped by their status and some custom time ranges of their creation date.
This can be achieved by using $group in combination with a $switch expression (sometimes simplified with $trunc). However, that is cumbersome and prevents efficient grouping since, for example, no binary search can be used to identify the bucket boundaries.
The query syntax of $bucket would not need to change much. It would simply need to allow for nested…
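For reference, the $group plus $switch workaround mentioned above might look roughly like the pipeline below. This is only a sketch: the field names `status` and `createdAt`, the range boundaries, and the range labels are assumptions made for illustration, not taken from the request.

```javascript
// Hypothetical range boundaries for the creation date (illustration only).
const q1 = new Date("2024-01-01T00:00:00Z");
const q2 = new Date("2024-04-01T00:00:00Z");
const q3 = new Date("2024-07-01T00:00:00Z");

// Group by status AND by a custom creation-date range.
// `status` and `createdAt` are assumed field names.
const pipeline = [
  {
    $group: {
      _id: {
        status: "$status",
        // $switch evaluates branches in order and uses the first match,
        // which assigns each document to one custom range.
        createdRange: {
          $switch: {
            branches: [
              { case: { $lt: ["$createdAt", q1] }, then: "before Q1" },
              { case: { $lt: ["$createdAt", q2] }, then: "Q1" },
              { case: { $lt: ["$createdAt", q3] }, then: "Q2" },
            ],
            default: "Q3 or later",
          },
        },
      },
      count: { $sum: 1 },
    },
  },
];

console.log(JSON.stringify(pipeline, null, 2));
```

In the shell this could be passed to something like `db.orders.aggregate(pipeline)`, where `orders` is again a hypothetical collection name. Every document has to be run through the branch list, which is the inefficiency the request describes.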
6 votes -
sharding error shardsvr
Make it clear which node is causing the "shardsvr" error.
Spawned from support case 01042995
Our error occurred when the user tried to connect using Compass. The failure was to list the collection names on one database.
The error presented back to the user was merely
Cannot accept sharding commands if not started with --shardsvr
We eventually found that the primary changed on one of the shards, and that primary did not have the appropriate clusterRole in the mongod.conf file. My concerns are that this took too long to track down and would be impossible in a 100-shard environment.
- Nothing…
6 votes -
Add pipeline stage for "downsampling" data
Downsampling is an extremely common operation when plotting time-series data, used when there is too much data to get a good-looking or meaningful graph. This would pick "important" data points based on an algorithm such as "Largest-Triangle-Three-Buckets" (https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf) instead of returning the entire data set.
Not only would this make prettier graphs, it would also reduce the overall payload returned from the database, thus reducing network-related latency.
This would be an awesome addition to timeseries!
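To make the request concrete, here is a minimal client-side sketch of Largest-Triangle-Three-Buckets in plain JavaScript. It is not an existing pipeline stage; the function name `lttb`, the `[x, y]` point format, and the `threshold` parameter (the number of points to keep) are assumptions chosen for this illustration.

```javascript
// Largest-Triangle-Three-Buckets sketch.
// `data` is an array of [x, y] points sorted by x; `threshold` is how many points to keep.
function lttb(data, threshold) {
  const n = data.length;
  if (threshold >= n || threshold < 3) return data.slice();

  const sampled = [data[0]];                 // always keep the first point
  const every = (n - 2) / (threshold - 2);   // bucket width over the interior points
  let a = 0;                                 // index of the previously selected point

  for (let i = 0; i < threshold - 2; i++) {
    // Average of the NEXT bucket serves as the third corner of the triangle.
    const avgStart = Math.floor((i + 1) * every) + 1;
    const avgEnd = Math.min(Math.floor((i + 2) * every) + 1, n);
    let avgX = 0;
    let avgY = 0;
    for (let j = avgStart; j < avgEnd; j++) {
      avgX += data[j][0];
      avgY += data[j][1];
    }
    avgX /= avgEnd - avgStart;
    avgY /= avgEnd - avgStart;

    // In the CURRENT bucket, pick the point forming the largest triangle
    // with the previously selected point and the next bucket's average.
    const start = Math.floor(i * every) + 1;
    const end = Math.floor((i + 1) * every) + 1;
    let maxArea = -1;
    let next = start;
    for (let j = start; j < end; j++) {
      const area = Math.abs(
        (data[a][0] - avgX) * (data[j][1] - data[a][1]) -
        (data[a][0] - data[j][0]) * (avgY - data[a][1])
      ) / 2;
      if (area > maxArea) {
        maxArea = area;
        next = j;
      }
    }
    sampled.push(data[next]);
    a = next;
  }

  sampled.push(data[n - 1]);                 // always keep the last point
  return sampled;
}

// Ten flat points with a single spike at x = 3, reduced to four points.
const series = [0, 0, 0, 10, 0, 0, 0, 0, 0, 0].map((y, x) => [x, y]);
const reduced = lttb(series, 4);
console.log(reduced);
```

In this small example the spike at x = 3 survives the reduction, which is the kind of "important" point the request wants the server to keep.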
6 votes -
`$getField` to work with a dynamic `field`
Currently `$getField` works only when `field` resolves at query-compile-time to a string. It would be nice if it also worked when `field` resolves to a string at runtime.
See this Jira ticket - https://jira.mongodb.org/browse/SERVER-67030
6 votes -
Unique index in sharded cluster
For enforcing uniqueness in a sharded cluster, the officially recommended approach provided here https://docs.mongodb.com/manual/tutorial/unique-constraints-on-arbitrary-fields/#std-label-shard-key-arbitrary-uniqueness is simplistic, and in a production environment it brings a non-trivial amount of work. Some considerations:
- Ephemeral issues might cause inconsistencies between the two collections (for example, the unique index collection update succeeded but the main collection update did not) and make some unique keys unusable.
- Many changes are needed to enforce this universally (we're using the Mongoose ORM, and there are many hooks to change).
What we ended up doing is to use distributed ephemeral locks (a TTLed MongoDB collection) to lock on the unique keys before adding…
6 votes -
NoTableScan at the collection level
NoTableScan at the collection level instead of mongod level.
6 votes -
Ability to see historical `serverStatus.uptime` counter info on MongoDB Server process
What is the problem that needs to be solved? Store (historically) `serverStatus.uptime` counter info on the MongoDB Server process, so that it will be possible to track `serverStatus.uptime` changes over time.
Why is it a problem? (the pain) As of now (2020-02-25) there's no way to see historical info of MongoDB Server process restarts, since the `serverStatus.uptime` counter is reset every time the MongoDB Server process is restarted. There's no other way (other than going into the MongoDB Server process logs) to know if the process was restarted and when it was restarted. If you'd like to calculate MongoDB Server process availability, you'll…
6 votes -
Get metadata about source client connection that submitted a given change
Currently with change streams it is impossible to know who or what connection initiated the changes.
It would be useful to be able to receive some data about the source client connection that initiated a change.
My particular use case is the following:
I have an app that connects to Atlas. (source client connection)
I can subscribe to change streams and then execute some logic when it applies.
That app can scale to multiple instances.
Each instance subscribes to the change streams.
But I only want each instance to execute the logic that applies to only…
5 votes -
Add expression indexes
An expression index is one where the value being indexed is the result of an expression, such as lowercasing a string.
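The idea can be pictured with a small in-memory sketch in plain JavaScript. The `ExpressionIndex` class and the sample documents are hypothetical and exist only to show that the index key is the computed value rather than the stored field; this is not a MongoDB API.

```javascript
// Hypothetical in-memory illustration: keys are the RESULT of an expression
// applied to each document, not the raw field value.
class ExpressionIndex {
  constructor(expression) {
    this.expression = expression; // function computing the indexed value
    this.entries = new Map();     // computed value -> matching documents
  }

  add(doc) {
    const key = this.expression(doc);
    if (!this.entries.has(key)) {
      this.entries.set(key, []);
    }
    this.entries.get(key).push(doc);
  }

  // Lookup by the computed value, with no scan over all documents.
  find(value) {
    return this.entries.get(value) ?? [];
  }
}

// Index on the lowercased name, so lookups become case-insensitive.
const byLowerName = new ExpressionIndex((doc) => doc.name.toLowerCase());
byLowerName.add({ _id: 1, name: "Huey" });
byLowerName.add({ _id: 2, name: "HUEY" });
byLowerName.add({ _id: 3, name: "Dewey" });

const matches = byLowerName.find("huey");
console.log(matches.map((doc) => doc._id));
```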
http://en.wikipedia.org/wiki/Expression_index
http://www.postgresql.org/docs/8.1/static/indexes-expressional.html
5 votes -
Support for Ubuntu 20.04 in MongoDB Server version 4.2
Per the Server Support Matrix https://docs.mongodb.com/manual/installation/ support for Ubuntu 20 is in MongoDB Server version 4.4+ but not 4.2.
We would like to see the currently supported MongoDB Server version 4.2 available on the Ubuntu 20.04 LTS distribution.
5 votes -
Log connection string used by application to connect
There are multiple options to connect to mongo: you can connect to a specific node, or you can connect to the whole replica set, etc.
If a DBA does not have access to the source code, it's not possible to validate whether the application is properly configured and connects to the replica set.
It would be nice to let MongoDB write the connection string used, and/or details of how exactly client sessions are connected to mongo, to mongod.log.
5 votes -
Deny Privilege
Provide the ability to explicitly deny a privilege on a specific resource.
Example: Grant the privilege to perform the find action on all collections in the test database except "test.secrets".
5 votes -
Reduce the minimum value for watchdogPeriodSeconds
The storage watchdog attempts to create, write, and read a test file in critical directories every 10 seconds.
The watchdogPeriodSeconds parameter controls how often a thread checks to ensure at least one check has succeeded since the last check.
The minimum value for watchdogPeriodSeconds is 60 seconds. This means that in the worst case, the mongod could be unable to write for up to 2 minutes before the watchdog asserts and kills the stalled node. That is a very long time for a primary node to be stalled in a busy cluster.
It does make sense that watchdogPeriodSeconds must…
5 votes -
Kafka audit event streaming
Provide Kafka Topic as a write target for database auditing and database message logging.
https://docs.mongodb.com/manual/core/auditing/
Auditing is currently limited to a local and editable JSON/BSON file or the system console log.
The SYSLOG is not recommended by MongoDB. "The syslog message limit can result in the truncation of the audit messages. The auditing system will neither detect the truncation nor error upon its occurrence."
5 votes -
Collection Comments
I would like the ability to attach comments to a collection so that other people using the data can get some understanding of the context, or an important Readme/FAQ that I need to share.
5 votes -
Include the _ids of existing documents in BulkWriteResult when performing upserts
When performing a bulk operation, it is possible to obtain the _ids of upserted documents via BulkWriteResult. For example:
db.getCollection("test").find({})
db.test.drop()
var bulk = db.test.initializeUnorderedBulkOp();
bulk.find({name: "huey"}).upsert().updateOne({name: "huey"});
bulk.execute();
The BulkWriteResult contains the upserted _id:
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 0,
"nUpserted" : 1,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [
{
"index" : 0,
"_id" : ObjectId("5ec77b5cc4a955ce03a4cd2e")
}
]
})
However, when a document already exists, the _id is not returned:
db.test.find()
var bulk = db.test.initializeUnorderedBulkOp();
bulk.find({name: "huey"}).upsert().updateOne({name: "huey", outfit: "red"});
bulk.find({name: "luey"}).upsert().updateOne({name: "luey", outfit:…
5 votes -
Allow Pinning Query Plan Cache Key to a Fixed Plan for a Given Query Shape Hash
In MongoDB 8.0, the new setQuerySettings command allows administrators to enforce index hints and other behavior based on the query shape hash. This gives users partial control over query plan selection.
However, the current implementation still allows the optimizer to re-evaluate multiple plans under certain conditions (e.g., plan cache eviction, plan ranking strategy). Additionally, the planCacheKey, which is the actual determinant for plan reuse, is generated based on more granular factors than the query shape hash (e.g., sort, collation, or exact index hint details).
…
4 votes -
Maintain database ID/Password profile inside database
The requirement is to maintain database IDs/passwords with standard details inside the database (such as last login date and password change date), along with password controls such as failedloginattempts, passwordlifetime, passwordcomplexity, passwordlocktime, passwordreusemax, etc.
4 votes -