Allow aggregate pipeline on input documents before Atlas Search indexing
If Atlas Search allowed us to apply an aggregate pipeline to input documents before they are sent to Atlas Search for indexing, it would open up a ton of new possibilities that are not currently supported. We could do data transformations, type conversions, synthetic fields, etc. I've opened a number of other suggestions for these things, but an input aggregate pipeline could solve all of them in one "easy" solution. Since aggregate pipelines already exist and are deeply embedded in MongoDB, I'm hoping it would be trivial to implement as well: simply pick up one or more new documents to be indexed, run them through the pipeline, and then send them to the Atlas Search engine for indexing.
Here's what that might look like:
{
  "mappings": {
    "dynamic": false,
    "aggregate": [
      { <stage> },
      ...,
      {
        "$addFields": {
          "my-synthetic-field": <expression>
        }
      }
    ],
    "fields": {
      "my-synthetic-field": {
        "type": "string"
      }
    }
  }
}
Here are some of the other suggestions I (or others) have made that could be solved with this approach:
https://feedback.mongodb.com/forums/924868-atlas-search/suggestions/47049898-allow-indexing-of-synthetic-fields
https://feedback.mongodb.com/forums/924868-atlas-search/suggestions/47049757-data-type-coercion-when-document-field-type-doesn
https://feedback.mongodb.com/forums/924868-atlas-search/suggestions/47049775-allow-sort-by-boolean-fields
https://feedback.mongodb.com/forums/924868-atlas-search/suggestions/40734700-support-decimal128

This feature is now Generally Available. Read the docs here.
-
David commented
It's also worth noting that the programmatic search index creation function does not work with a view, but does with a collection:
db["my-search-view"].createSearchIndex()
MongoServerError[NamespaceNotFound]: Collection 'test.my-search-view' does not exist.
-
David commented
The documentation for using views does not list this as a limitation, but when I tried to create a view that used projection, I got this error:
Error creating search index: View my-search-view is not valid for search indexing because $project stages are not supported.
I can use $addFields to add or modify fields without projection, but that's a bit of a bummer. It would be good if this limitation were called out in the docs.
I was hoping to limit fields in the view. Yes, I know they won't be indexed unless I ask for them to be, but I thought it would make my view cleaner if I could remove unnecessary fields.
I tried $unset as well, but that gives a different error:
Error creating search index: View my-search-view is not valid for search indexing because stage 1 is not configured with a document.
So it seems there are some significant, undocumented limitations on what a view being indexed with Atlas Search can and can't do! Be sure to test your use case before relying on this as your solution.
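A minimal sketch of a pre-flight check based on the two errors quoted above. It only flags the stages this comment reports as rejected ($project and $unset), so the list is almost certainly incomplete. The helper name findReportedProblems and the field names firstName, lastName, and fullName are hypothetical, made up for illustration.

```javascript
// Stages this thread reports as rejected when a view is indexed by Atlas Search.
// Assumption: this list is incomplete; it reflects only the two errors quoted above.
const REPORTED_REJECTED_STAGES = ["$project", "$unset"];

// Hypothetical helper: returns the position and operator of every stage
// in a view pipeline that matches one of the reported-rejected stages.
function findReportedProblems(pipeline) {
  const problems = [];
  pipeline.forEach((stage, index) => {
    const operator = Object.keys(stage)[0]; // each stage has a single operator key
    if (REPORTED_REJECTED_STAGES.includes(operator)) {
      problems.push({ index, operator });
    }
  });
  return problems;
}

// A pipeline like the one that failed: $addFields followed by $project.
const projectingPipeline = [
  { $addFields: { fullName: { $concat: ["$firstName", " ", "$lastName"] } } },
  { $project: { fullName: 1 } },
];

// The workaround reported above: $addFields only, no projection.
const addFieldsOnlyPipeline = [
  { $addFields: { fullName: { $concat: ["$firstName", " ", "$lastName"] } } },
];

console.log(findReportedProblems(projectingPipeline));
console.log(findReportedProblems(addFieldsOnlyPipeline));
```

Passing this check does not guarantee that the index will build; it only catches the two cases already seen here.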
-
David commented
Hmm, that document is specifically mentioning Atlas Vector Search. Does this also work with standard Atlas Search?
EDIT: Never mind, it looks like it does. I just had to go look for the corresponding doc page: https://www.mongodb.com/docs/atlas/atlas-search/transform-documents-collections/
Also, I like the idea of using a view and indexing that. That's cleaner than my suggestion.
-
David commented
That documentation link appears to be protected, and my standard MongoDB account sign-in doesn't work for it. Is there some other documentation you could link?
-
Tom commented
I like this idea because it would also mean we don't have to add the entire collection to the search index, only the documents that meet a certain condition.
A typical scenario would be archived products that are still in the collection but do not necessarily have to be in the search index. This would reduce the size of the search index.
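With the view-based approach mentioned in the other comments, this kind of filtering might be expressed as a $match stage in the view. The sketch below rests on several assumptions: the collection name products, the boolean field archived, and the view name searchable-products are all made up, and this thread does not confirm that $match is accepted in a view that Atlas Search indexes (only $addFields is reported as working). The db.createView call runs only inside mongosh, where db is defined.

```javascript
// Hypothetical names, chosen for illustration only.
const SOURCE_COLLECTION = "products";
const VIEW_NAME = "searchable-products";

// Keep every document whose "archived" field is not true
// (documents without the field are kept as well).
const activeProductsPipeline = [
  { $match: { $expr: { $ne: ["$archived", true] } } },
];

// Local stand-in that mirrors the $match condition, so the filter
// can be checked without a database connection.
function wouldBeIndexed(doc) {
  return doc.archived !== true;
}

const sampleProducts = [
  { sku: "A1", archived: false },
  { sku: "B2", archived: true },
  { sku: "C3" },
];
console.log(sampleProducts.filter(wouldBeIndexed).map((p) => p.sku));

// In mongosh, create the view that the search index would then be built on.
if (typeof db !== "undefined") {
  db.createView(VIEW_NAME, SOURCE_COLLECTION, activeProductsPipeline);
}
```

Whether the resulting index actually excludes the archived documents should be verified against a real cluster before relying on it.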