Support expressions in $densify range bounds
The $densify aggregation pipeline stage seems unable to evaluate expressions in its range bounds; the bounds must be literal constants.
See the following example (the collection testcoll contains a single document with only the _id field):
sometestdb> db.testcoll.aggregate([{$addFields: {a: 1}}, {$densify: {field: "a", range: {bounds: [0, 5], step: 1}}}])
[
{ a: 0 },
{ _id: ObjectId("6284a16d64553eaf74b1e189"), a: 1 },
{ a: 2 },
{ a: 3 },
{ a: 4 }
]
sometestdb> db.testcoll.aggregate([{$addFields: {a: 1}}, {$densify: {field: "a", range: {bounds: [{$toInt: "0"}, 5], step: 1}}}])
MongoServerError: A bounding array must be an ascending array of either two dates or two numbers
The first aggregate() call gives the correct result, and I'd expect the second one to evaluate the $toInt and yield the same result. Of course $toInt isn't a particularly useful case, but there are genuinely useful computations that a user may want to perform on the range bounds. For example, when using dates, one may want to truncate the start and end dates to the hour or to the day before they are used for the densify operation.
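As a possible workaround for the date case, the truncation can be done in application code before the pipeline is built, so that $densify only ever sees constant bounds. The following is a minimal sketch, not a definitive implementation: truncateUtc and densifyStage are hypothetical helper names, the field name "ts" and the sample dates are made up for illustration, and truncation is assumed to happen in UTC with only the "hour" and "day" units handled.

```javascript
// Hypothetical helper: return a copy of `date` truncated to the start of
// its UTC hour or UTC day. Other units are not handled in this sketch.
function truncateUtc(date, unit) {
  const d = new Date(date.getTime());
  if (unit === "hour") {
    d.setUTCMinutes(0, 0, 0);
  } else if (unit === "day") {
    d.setUTCHours(0, 0, 0, 0);
  } else {
    throw new Error("unsupported unit: " + unit);
  }
  return d;
}

// Hypothetical helper: build a $densify stage whose bounds are plain Date
// constants, truncated on the client instead of inside the pipeline.
function densifyStage(field, start, end, unit) {
  return {
    $densify: {
      field: field,
      range: {
        bounds: [truncateUtc(start, unit), truncateUtc(end, unit)],
        step: 1,
        unit: unit, // $densify needs a unit when the field holds dates
      },
    },
  };
}

// Example with made-up dates; "ts" is an assumed field name.
const stage = densifyStage(
  "ts",
  new Date("2022-05-13T07:42:10.000Z"),
  new Date("2022-05-20T07:15:00.000Z"),
  "hour"
);
```

The resulting stage can then be passed to aggregate() like any other stage, since its bounds are ordinary dates by the time the server sees them.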
-
Daniele commented
I would need it too!
In my case I'm using Prisma (prisma.io), and for raw aggregations the only way to pass dates is with $date or $dateFromString.
The absence of expression support in the bounds field prevents me from using the $densify stage.
-
Gianluca Nitti commented
Thanks for the clarification.
Yes, support for constant expressions would be enough for my use case.
In fact, what I need is exactly to $dateTrunc a date that comes from user input. I could do it in my application code, but the application stores various pipelines as JSON with "placeholders" that are replaced with end-user parameters right before the pipeline is sent to MongoDB for execution. These parameters are not always dates, and when they are, they don't always need to be truncated. Being able to truncate them directly in the pipeline would keep the application code cleaner, since I wouldn't have to record somewhere else which parameters must be "preprocessed" and handle them separately.
So, when the pipeline is sent to MongoDB it would contain an expression exactly like the one you posted, with two $dateTrunc operations with constant dates as input.
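Until constant expressions are accepted in the bounds, one generic stopgap would be a single pass over the final pipeline (after placeholder substitution) that evaluates constant $dateTrunc expressions on the client. That avoids tracking which parameters need preprocessing. This is only a rough sketch under stated assumptions: foldDateTrunc and truncateUtc are hypothetical names, dates are assumed to already be JavaScript Date objects, and only the simple case is folded (unit "hour" or "day", binSize absent or 1, no timezone, i.e. UTC). Anything else is left untouched.

```javascript
// Hypothetical helper: truncate a Date to the start of its UTC hour or day.
function truncateUtc(date, unit) {
  const d = new Date(date.getTime());
  if (unit === "hour") {
    d.setUTCMinutes(0, 0, 0);
  } else {
    d.setUTCHours(0, 0, 0, 0); // unit === "day"
  }
  return d;
}

// Hypothetical helper: walk a pipeline (or any nested value) and replace
// {$dateTrunc: {...}} nodes whose input is a constant Date with the
// truncated Date. Nodes that reference fields are returned unchanged.
function foldDateTrunc(node) {
  if (Array.isArray(node)) {
    return node.map(foldDateTrunc);
  }
  if (node === null || typeof node !== "object" || node instanceof Date) {
    return node;
  }
  const keys = Object.keys(node);
  if (keys.length === 1 && keys[0] === "$dateTrunc") {
    const args = node.$dateTrunc;
    const foldable =
      args !== null &&
      typeof args === "object" &&
      args.date instanceof Date &&
      (args.unit === "hour" || args.unit === "day") &&
      (args.binSize === undefined || args.binSize === 1) &&
      args.timezone === undefined;
    if (foldable) {
      return truncateUtc(args.date, args.unit);
    }
  }
  const out = {};
  for (const key of keys) {
    out[key] = foldDateTrunc(node[key]);
  }
  return out;
}

// Example with made-up values; "ts" is an assumed field name.
const pipeline = [
  {
    $densify: {
      field: "ts",
      range: {
        bounds: [
          { $dateTrunc: { date: new Date("2022-05-13T07:42:10.000Z"), unit: "hour", binSize: 1 } },
          new Date("2022-05-20T07:00:00.000Z"),
        ],
        step: 1,
        unit: "hour",
      },
    },
  },
];
const folded = foldDateTrunc(pipeline);
```

After this pass, the bounds in folded are plain dates, so the stored JSON pipelines can keep the $dateTrunc expression while the server still receives constants.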
-
Expressions are not allowed because the bounds have to be constant for the entire densification process. Support for constant expressions could potentially be added, e.g. range: {bounds: [{$dateTrunc: { date: ISODate("2022-05-13T07:00:00.000Z"), unit: "hour", binSize: 1 }}, ISODate("2022-05-20T07:00:00.000Z")], step: 1}. But something like this, where the bounds refer to document fields, would still not be supported: range: {bounds: [{$dateTrunc: { date: '$dateFieldOne', unit: "hour", binSize: 1 }}, '$dateFieldTwo'], step: 1}
Would that work for your use case?