Handle Daylight Saving Time when $densify is used on a date field
When using "day" as "unit" for a $densify pipeline stage on a date field, the date is always advanced of 24 hours. This is however not always the expected result in timezones in which the year has one 23-hour and one 25-hour long day, because of Daylight Saving Time.
It would be useful to have the possibility to pass an optional timezone parameter in the $densify stage and, when present, have the stage account for these exceptions when appropriate.
Here follows an example.
Assume we have a collection containing the following documents:
db.densifyDateExample.insertMany([
{_id: "a", d: ISODate("2022-10-28T22:00:00Z")},
{_id: "b", d: ISODate("2022-10-29T22:00:00Z")},
{_id: "c", d: ISODate("2022-10-30T23:00:00Z")},
{_id: "d", d: ISODate("2022-10-31T23:00:00Z")},
{_id: "e", d: ISODate("2022-11-01T23:00:00Z")},
{_id: "f", d: ISODate("2022-11-02T23:00:00Z")}
])
Keep in mind that on 2022-10-30, in EU countires, the time switched from CEST to CET. So those are the UTC representations of the beginning of the days from Oct 29th to Nov 3rd.
Now, if we run the following pipeline:
db.densifyDateExample.aggregate([{
$densify: {
"field": "d",
"range": {
"bounds": [ISODate("2022-10-25T22:00:00.000Z"), ISODate("2022-11-06T23:00:00.000Z")],
"step": 1,
"unit": "day"
}
}
}])
We get the following result:
{ "d" : ISODate("2022-10-25T22:00:00Z") }
{ "d" : ISODate("2022-10-26T22:00:00Z") }
{ "d" : ISODate("2022-10-27T22:00:00Z") }
{ "_id" : "a", "d" : ISODate("2022-10-28T22:00:00Z") }
{ "_id" : "b", "d" : ISODate("2022-10-29T22:00:00Z") }
{ "d" : ISODate("2022-10-30T22:00:00Z") }
{ "_id" : "c", "d" : ISODate("2022-10-30T23:00:00Z") }
{ "d" : ISODate("2022-10-31T22:00:00Z") }
{ "_id" : "d", "d" : ISODate("2022-10-31T23:00:00Z") }
{ "d" : ISODate("2022-11-01T22:00:00Z") }
{ "_id" : "e", "d" : ISODate("2022-11-01T23:00:00Z") }
{ "d" : ISODate("2022-11-02T22:00:00Z") }
{ "_id" : "f", "d" : ISODate("2022-11-02T23:00:00Z") }
{ "d" : ISODate("2022-11-03T22:00:00Z") }
{ "d" : ISODate("2022-11-04T22:00:00Z") }
{ "d" : ISODate("2022-11-05T22:00:00Z") }
{ "d" : ISODate("2022-11-06T22:00:00Z") }
While - after somehow telling $densify in which timezone we are working - the following would be way more useful:
{ "d" : ISODate("2022-10-25T22:00:00Z") }
{ "d" : ISODate("2022-10-26T22:00:00Z") }
{ "d" : ISODate("2022-10-27T22:00:00Z") }
{ "_id" : "a", "d" : ISODate("2022-10-28T22:00:00Z") }
{ "_id" : "b", "d" : ISODate("2022-10-29T22:00:00Z") }
{ "_id" : "c", "d" : ISODate("2022-10-30T23:00:00Z") }
{ "_id" : "d", "d" : ISODate("2022-10-31T23:00:00Z") }
{ "_id" : "e", "d" : ISODate("2022-11-01T23:00:00Z") }
{ "_id" : "f", "d" : ISODate("2022-11-02T23:00:00Z") }
{ "d" : ISODate("2022-11-03T23:00:00Z") }
{ "d" : ISODate("2022-11-04T23:00:00Z") }
{ "d" : ISODate("2022-11-05T23:00:00Z") }
{ "d" : ISODate("2022-11-06T23:00:00Z") }
-
Stephen commented
Is there a fix for this? Important...