Database
42 results found
-
Allow configuration of 100mb memory limit per aggregation pipeline stage
In this old thread from 2016 (https://groups.google.com/forum/#!topic/mongodb-user/LCeFZZRz5EY) it was asked whether there was a way to increase the 100mb in memory limit of each stage of an aggregation pipeline. The responses centered around two points:
- If too much memory is used per aggregation pipeline stage then it will reduce performance for the overall MongoDB database, impacting other queries negatively.
- You can set allowDiskUse: true and revert to performing these pipeline stages on disk when they exceed 100mb.
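As a rough illustration of the second point, the option can be passed alongside the pipeline. This is only a sketch and is not taken from the thread: the helper name `aggregateWithDiskUse` is hypothetical, and it assumes `collection` exposes an `aggregate(pipeline, options)` method that accepts `allowDiskUse`, as collection objects in mongosh and the Node.js driver do.

```javascript
// Hypothetical helper: runs a pipeline with allowDiskUse enabled, so that
// stages exceeding the in-memory limit may write temporary files to disk.
function aggregateWithDiskUse(collection, pipeline) {
  return collection.aggregate(pipeline, { allowDiskUse: true });
}

// Usage (collection and field names are assumptions):
//   aggregateWithDiskUse(db.orders, [{ $sort: { total: -1 } }]);
```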
I believe this subject needs to be revisited for the following reasons:
- “Too much memory” is very subjective, and the 100mb…
26 votes -
Add operator that would calculate distance between 2 geolocation points
It would be great to have an operator that calculates the distance between 2 geolocation points, instead of doing it manually with big aggregate queries.
I suggest adding 2 new operators that would calculate the distance in two different ways, as discussed in this Community Post: https://www.mongodb.com/community/forums/t/how-to-calculate-distance-between-two-geolocation-points/173045
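For reference, one common way to compute such a distance on the client side is the haversine formula. The sketch below is not one of the two methods from the linked post (they are not reproduced here). It assumes a spherical Earth with a mean radius of 6371008.8 m and points given in GeoJSON order, `[longitude, latitude]`, in degrees. The function name is hypothetical.

```javascript
// Mean Earth radius in metres (assumption: spherical Earth model).
const EARTH_RADIUS_M = 6371008.8;

// Great-circle distance in metres between two [longitude, latitude] pairs,
// using the haversine formula.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}
```

A built-in operator would avoid shipping both points to the client or spelling out the same trigonometry with `$sin`, `$cos`, and `$sqrt` inside a pipeline.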
19 votes -
geoContain
Dear all,
according to the attached image, I have some documents (in blue, with id from 1 to 5) having a geographic extent and a search area (in yellow).
I need to find all documents where the search area is completely inside the document's geometry.
In other words, I need to find all documents whose geometry completely covers the given search area.
In my sample, the geo query should return the document with id 1.
This kind of query has the opposite logic to $geoWithin. Could you provide a $geoContain functionality in the near future?
10 votes -
Handle Daylight Saving Time when $densify is used on a date field
When using "day" as "unit" for a $densify pipeline stage on a date field, the date is always advanced by 24 hours. This is however not always the expected result in time zones in which the year has one 23-hour and one 25-hour long day, because of Daylight Saving Time.
It would be useful to have the possibility to pass an optional timezone parameter in the $densify stage and, when present, have the stage account for these exceptions when appropriate.
Here follows an example.
Assume we have a collection containing the following documents:
…db.densifyDateExample.insertMany([ {_id: "a", d: ISODate("2022-10-28T22:00:00Z")}, {_id: "b", d:
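The drift described above can be reproduced outside the database. The sketch below steps the first date from the example forward in fixed 24-hour increments and prints the local wall-clock time at each step. The time zone `Europe/Rome` is an assumption (the truncated excerpt does not name one), and the variable and helper names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Formatter for local wall-clock time in the assumed time zone.
const formatter = new Intl.DateTimeFormat("en-GB", {
  timeZone: "Europe/Rome", // assumption, not stated in the request
  year: "numeric",
  month: "2-digit",
  day: "2-digit",
  hour: "2-digit",
  minute: "2-digit",
  hourCycle: "h23",
});

// Renders a Date as "YYYY-MM-DD HH:mm" in the assumed time zone.
function toLocalString(date) {
  const parts = {};
  for (const { type, value } of formatter.formatToParts(date)) {
    parts[type] = value;
  }
  return `${parts.year}-${parts.month}-${parts.day} ${parts.hour}:${parts.minute}`;
}

// Step forward by exactly 24 hours twice, as a fixed-length "day" unit does.
const start = new Date("2022-10-28T22:00:00Z");
const steps = [0, 1, 2].map((n) => toLocalString(new Date(start.getTime() + n * DAY_MS)));

// The clocks in this zone go back one hour on 30 October 2022, so the third
// step lands at 23:00 local time on 30 October instead of midnight on the 31st.
console.log(steps);
```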
7 votes -
Raise the limit of 16 MB JSON between aggregation stages
When doing analytics, the 16 MB JSON limit between aggregation stages restricts the ability to process large amounts of data. allowDiskUse does not help with all the various $operators and stages that we use. See ticket 00774514 for details.
7 votes -
Support compound/multiple grouping keys in $bucket
We often need to compute statistical/summarizing aggregations grouped by more than one field where all fields are of a $bucket-able type.
An example would be to count all orders grouped by their status and some custom time ranges of their creation date.
This can be achieved by using $group in combination with a $switch expression (sometimes simplified with $trunc). However, that is cumbersome and prevents efficient grouping since, e.g., no binary search can be employed to identify the bucket boundaries efficiently.
The query syntax of $bucket would not need to change much. It would simply need to allow for nested…
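The $group plus $switch workaround mentioned above might look roughly like the following sketch. The field names `status` and `createdAt`, the boundary dates, and the helper `bucketExpr` are all assumptions made for illustration. The helper emits one branch per range, which is exactly the linear scan the request describes as inefficient.

```javascript
// Hypothetical helper: builds a $switch expression that maps a field value to
// the lower boundary of the half-open range [lo, hi) containing it.
// Needs at least two boundaries, since $switch requires at least one branch.
function bucketExpr(fieldPath, boundaries, defaultLabel) {
  const branches = [];
  for (let i = 0; i + 1 < boundaries.length; i++) {
    const lo = boundaries[i];
    const hi = boundaries[i + 1];
    branches.push({
      case: { $and: [{ $gte: [fieldPath, lo] }, { $lt: [fieldPath, hi] }] },
      then: lo,
    });
  }
  return { $switch: { branches, default: defaultLabel } };
}

// Example boundaries (assumed): two quarter-long ranges.
const boundaries = [
  new Date("2022-01-01T00:00:00Z"),
  new Date("2022-04-01T00:00:00Z"),
  new Date("2022-07-01T00:00:00Z"),
];

// Count orders per (status, creation-date range) pair.
const groupStage = {
  $group: {
    _id: {
      status: "$status",
      createdRange: bucketExpr("$createdAt", boundaries, "other"),
    },
    count: { $sum: 1 },
  },
};
```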
6 votes -
`$getField` to work with a dynamic `field`
Currently `$getField` works only when `field` resolves at query-compile-time to a string. It would be nice if it worked also when `field` resolves to a string at runtime.
See this Jira ticket - https://jira.mongodb.org/browse/SERVER-67030
6 votes -
Support parallel query execution, including find() and aggregation()
To make use of multi-core environments and improve query performance with a large number of documents, parallel query execution is needed.
Sharding or microsharding is not an alternative in this case.
3 votes -
$merge
Report number of docs matched, merged, skipped, etc. from a $merge stage. Alternatively, return the merged doc results as a pipeline result to pass to additional stages.
3 votes -
Document scoped RBAC - Permission for collection document fields
Roles and accesses can be defined on the basis of collections that define roles for users.
It would be nice if these access permissions could be made within the scope of the fields under the collection and the query results would be returned accordingly.
Current:
privileges: [
{ resource: { db: "users", collection: "user" }, actions: [ "find"] },
]
Expected:
{ resource: { db: "users", collection: "user", field: "email" }, actions: [ "find"] },
3 votes -
Support expressions in $densify range bounds
The $densify aggregation pipeline stage seems unable to evaluate range bounds expressions, requiring the range bounds to be constant.
See the following example (the collection testcoll contains a single document with only the _id field):
…sometestdb> db.testcoll.aggregate([{$addFields: {a: 1}}, {$densify: {field: "a", range: {bounds: [0, 5], step: 1}}}])
[ { a: 0 }, { _id: ObjectId("6284a16d64553eaf74b1e189"), a: 1 }, { a: 2 }, { a: 3 }, { a: 4 } ]
sometestdb> db.testcoll.aggregate([{$addFields: {a: 1}}, {$densify: {field: "a", range: {bounds: [{$toInt: "0"}, 5], step: 1}}}])
MongoServerError: A bounding array must be an ascending array of either two dates or
3 votes -
define the random seed manually, for $rand and $sample
It would be great if an additional parameter to define the seed for $rand and $sample could be used.
2 votes -
Flatten arrays in group stage
Have group operators to flatten document arrays into a single one with or without repeated elements.
So ->
doc1 = {arr: [1,2,3,4], gr: "group"}, doc2 = {arr: [5, 6, 7, 8], gr: "group"}
{$group: {id: "$gr", arrays: {$***: "$arr"} } }
=>
{id: "gr", arrays: [1, 2, 3, 4, 5, 6, 7, 8]}
2 votes -
geo
It would be nice to get the length of a LineString of a GeoJSON object or the possibility to write an aggregation to calculate it.
2 votes -
hint support for $graphLookup
Currently you can supply a `hint` to the `aggregation` call in order to tell MongoDB to use a specific index for the initial `$match`. But there is currently no way to specify which index to use for a `$graphLookup` later in the pipeline.
I would like an optional `hint` property on the `$graphLookup` stage.
2 votes -
support $lookup for update aggregation
We frequently denormalise either full documents or subsets to different documents in order to speed up reading, create indexes or paginate/sort on fields.
Consider a user collection and a task collection. If a task can be assigned to a user, it makes sense to just put the user document on the task it is assigned to. But an update to a user now requires you to update the user both in the user collection and in all tasks with that user in the tasks collection.
This can be achieved but does introduce some complexity, however the introduction of updates using aggregation pipelines…
2 votes -
$sort should allow 0 as argument meaning "no sort"
The $sort operator and aggregation stage should allow 0 (possibly also null) as an argument for the field sort order, meaning, "Don't sort."
This allows a variable to be passed in that can conditionally skip the sort operation, in addition to the present specification, in which a variable can only choose ascending or descending sort.
1 vote -
Feature to perform Machine Learning predictive analysis and classification in MongoDB
I want to bring machine learning compute and predictive analysis into MongoDB Atlas. Instead of moving my data out of Atlas through ETL to achieve this, I would reduce my architectural complexity by having an aggregation operator that does this on my documents stored in Atlas.
1 vote -
Add Relaxed mode support for the $out operator
Add Relaxed mode support for the $out operator.
*and include as an option in the existing drivers
1 vote -
Aggregations should allow an empty sort stage instead of returning an error
When you run an aggregation pipeline that contains an empty sort stage (like `{"$sort": {}}`) MongoDB returns the error message "$sort stage must have at least one sort key". It would be really helpful if such a stage would work and simply not apply any sorting at all.
For one this would be more consistent with a find operation (e.g. `db.runCommand({"find": "test", "sort": {}})` or `db.test.find({}, {}, {"sort": {}})`) which does not return an error but simply does not sort the results. More importantly it would also make it easier for developers and frameworks to dynamically generate the…
1 vote