Implement a $hash operator for queries
Current State:
MongoDB currently lacks a native $hash operator in the aggregation/query language. While certain helpers like convertShardKeyToHashed or $toHashedIndexKey exist, they are limited in scope:
- Not available as true query operators (e.g., inaccessible from MongoDB drivers).
- Return Long values, which aren’t suitable for use cases like generating ObjectId-compatible hashes.
- Not guaranteed to be stable across versions, limiting their use in persisted or version-sensitive scenarios.
- Offer no flexibility in algorithm choice (e.g., MD5 only).
The deprecation of $function further limits users’ ability to implement custom hashing via JS.
Impact:
Developers working with large datasets (dozens of GB) are unable to:
- Perform deterministic hashing inside MongoDB queries/aggregations.
- Use hash values as _id or compact document identifiers for deduplication, fingerprinting, or efficient lookups.
- Build predictable, cross-version-stable hashes within MQL.
- Avoid extra client-side computation or complex workarounds involving bitwise operations, slow modulo arithmetic, or full client-side aggregation to generate a hash.
Additionally, non-driver-accessible helpers lead to more application logic overhead and additional round-trips to the database, hurting latency and performance.
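To illustrate the kind of client-side computation this forces today, here is a minimal Node.js sketch. The function name fingerprintId, the choice of SHA-256, the JSON-based canonical form, and the truncation to 12 bytes are all assumptions made for illustration; none of them is a MongoDB API.

```javascript
// Hypothetical client-side workaround: derive a compact, deterministic
// identifier from a document's key fields before inserting it.
const crypto = require("crypto");

function fingerprintId(fields) {
  // Assumption: `fields` is an array of strings/numbers; JSON.stringify on an
  // array gives a stable, unambiguous serialization for this sketch.
  const canonical = JSON.stringify(fields);
  const digest = crypto.createHash("sha256").update(canonical, "utf8").digest();
  // Keep the first 12 bytes (24 hex characters), the same size as an ObjectId.
  return digest.subarray(0, 12).toString("hex");
}

const id = fingerprintId(["alice@example.com", 2024]);
console.log(id.length); // 24
```

Every document has to pass through the application for this to work, which is exactly the extra round-trip and logic a server-side operator would remove.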
Potential Future State:
Introduce a first-class $hash operator in the MongoDB query and aggregation language:
    {
      $hash: {
        algorithm: "SHA384", // e.g., SHA256, SHA512, etc.
        data: <expression>,
        // Optional: key (for HMAC support in the future)
      }
    }
- Returns a hexadecimal string (or binary, or ObjectId-compatible format).
- Deterministic and version-stable.
- Accepts MQL expressions for the input.
- Could include support for standard NIST algorithms (see: https://csrc.nist.gov/projects/hash-functions).
Initial implementation could focus on non-HMAC hashes only to reduce scope.
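As a rough reference for the semantics requested above, the following Node.js sketch models what such an operator might compute for string input. It is not an existing MongoDB feature: the function name hashSketch, the algorithm-name mapping, the restriction to strings, and the UTF-8 encoding of the input are all assumptions.

```javascript
// Hypothetical model of the proposed $hash operator (non-HMAC only).
const crypto = require("crypto");

// Assumed mapping from the proposal's algorithm names to Node.js digest names.
const ALGORITHMS = { SHA256: "sha256", SHA384: "sha384", SHA512: "sha512" };

function hashSketch({ algorithm, data }) {
  const digestName = ALGORITHMS[algorithm];
  if (!digestName) {
    throw new Error(`Unsupported algorithm: ${algorithm}`);
  }
  // Assumption: only strings are handled; how other BSON types would be
  // serialized before hashing is left open by the proposal.
  if (typeof data !== "string") {
    throw new TypeError("this sketch only hashes string input");
  }
  // Hash the UTF-8 bytes and return a lowercase hexadecimal string.
  return crypto.createHash(digestName).update(data, "utf8").digest("hex");
}

console.log(hashSketch({ algorithm: "SHA256", data: "abc" }));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

Because the output depends only on the algorithm and the input bytes, the result would be stable across server versions as long as the serialization of the input is itself fixed.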
Additional Info or Links:
- Use case: compact storage of hash-based _ids for large datasets.
- Common workaround (not ideal): https://stackoverflow.com/questions/67426169/how-do-you-convert-a-hexadecimal-string-into-a-number-in-mongodb
- NIST-approved algorithms list: https://csrc.nist.gov/projects/hash-functions
- Workarounds using convertShardKeyToHashed introduce compatibility risks, limited hash size (64-bit), and type conversion issues (e.g., can’t easily convert Long to hex string/ObjectId).
- Customer is open to discussing use cases in more depth.
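The Long-to-hex conversion issue mentioned above can be sketched client-side as follows. This assumes the signed 64-bit value has already been read into a JavaScript BigInt; the helper name longToHex is hypothetical.

```javascript
// Hypothetical helper: render a signed 64-bit integer (as a BigInt) as a
// fixed-width, 16-character hexadecimal string.
function longToHex(value) {
  // Reinterpret the signed value as unsigned 64-bit (two's complement),
  // then left-pad to 16 hex digits.
  return BigInt.asUintN(64, value).toString(16).padStart(16, "0");
}

console.log(longToHex(255n)); // 00000000000000ff
console.log(longToHex(-1n));  // ffffffffffffffff
```

Even after conversion the result is only 8 bytes (16 hex characters), which is shorter than the 12 bytes an ObjectId requires.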
