Atlas Search Custom Analyzers
We want the ability to create our own custom analyzers in Atlas Search. The option appears to exist, since there is a Define Analyzers button, but it is not documented, and the syntax does not match what I would expect to be valid. The result is an index that never finishes building.

This feature has been completed and is available in the latest release of Atlas Search.
You can read the documentation here:
https://docs.atlas.mongodb.com/reference/atlas-search/analyzers/custom/
-
Richard commented
It's interesting, as I am looking at this at the moment. I have an element in my document that stores just the email address, but it can be in upper or lower case.
To solve the search problem I used the following index definition:
{
  "analyzer": "lucene.keyword",
  "searchAnalyzer": "lucene.keyword",
  "mappings": {
    "dynamic": false,
    "fields": {
      "enquiry": {
        "fields": {
          "customerId": {
            "analyzer": "lucene.standard",
            "ignoreAbove": 255,
            "searchAnalyzer": "lucene.standard",
            "type": "string"
          }
        },
        "type": "document"
      }
    }
  }
}
Then I do a search like this:
// Split the email into the part before and the part after the "@".
// Note: chunks[1] is undefined if the term contains no "@".
let chunks = term.split('@');
let result = await collection.aggregate([
  {
    $search: {
      compound: {
        // Both parts must match the same field.
        must: [
          {
            text: {
              query: chunks[0],
              path: 'enquiry.customerId'
            }
          },
          {
            text: {
              query: chunks[1],
              path: 'enquiry.customerId'
            }
          }
        ]
      }
    }
  },
  {
    $limit: 1000
  }
]).toArray();
The indexer seems to break the email address around the @ sign, and I do a must match on each chunk. It seems to work well, but I will be load testing it.
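The query above can be sketched a little more defensively as a helper that builds the $search stage. The function name buildEmailSearchStage is hypothetical and not part of any driver; it assumes the same enquiry.customerId mapping as the index definition above, and it simply skips empty chunks so that a term without an "@" does not produce a clause with an undefined query.

```javascript
// Hypothetical helper (not part of the MongoDB driver): builds the $search
// stage from an email-like term. Assumes enquiry.customerId is indexed with
// lucene.standard, as in the index definition above.
function buildEmailSearchStage(term, path = 'enquiry.customerId') {
  // Split on "@" and drop empty pieces, so input such as "alice" or
  // "@example.com" yields one clause instead of an undefined query.
  const chunks = String(term)
    .split('@')
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0);

  if (chunks.length === 0) {
    throw new Error('search term must not be empty');
  }

  return {
    $search: {
      compound: {
        // One text clause per chunk; every clause must match.
        must: chunks.map((chunk) => ({ text: { query: chunk, path } })),
      },
    },
  };
}

// Print the stage that would be passed to aggregate().
console.log(JSON.stringify(buildEmailSearchStage('Alice@Example.com'), null, 2));
```

It would be used as collection.aggregate([buildEmailSearchStage(term), { $limit: 1000 }]).toArray().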
-
Simon commented
This would be helpful for our situation: we want to use Atlas Search to find email addresses with a case-insensitive search that matches the email address as a whole only, rather than dividing it into searchable terms.
-
Yurii commented
This feature will definitely help developers who leverage Atlas Search to build great features. For example, it is currently almost impossible to reproduce the behaviour of the $regex operator (https://docs.mongodb.com/manual/reference/operator/query/regex/) for multiword search strings. The keyword analyzer does not support case insensitivity; it would be great to be able to create our own analyzer based on keyword + "ignoreCase": true.
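With custom analyzers available, the "keyword plus ignore case" idea might look roughly like the sketch below. It assumes the keyword tokenizer and lowercase token filter described in the custom analyzer documentation linked above; the analyzer name keywordLowercase and the field name email are hypothetical, and the small function is only a local approximation of the expected tokens, not Atlas Search itself.

```javascript
// Sketch of an index definition with a custom analyzer.
// The names "keywordLowercase" and "email" are made up for illustration.
const emailIndexDefinition = {
  analyzers: [
    {
      name: 'keywordLowercase',
      tokenizer: { type: 'keyword' },         // keep the whole value as one token
      tokenFilters: [{ type: 'lowercase' }],  // fold case for indexing and querying
    },
  ],
  mappings: {
    dynamic: false,
    fields: {
      email: {
        type: 'string',
        analyzer: 'keywordLowercase',
        searchAnalyzer: 'keywordLowercase',
      },
    },
  },
};

// Local approximation of what such an analyzer is expected to emit:
// a single token containing the lowercased input.
function approximateKeywordLowercase(value) {
  return [String(value).toLowerCase()];
}

console.log(JSON.stringify(emailIndexDefinition, null, 2));
console.log(approximateKeywordLowercase('John.Doe@Example.COM'));
```

Because the same analyzer is applied at index and query time, "John.Doe@Example.COM" and "john.doe@example.com" should reduce to the same single token, so the address matches as a whole regardless of case.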
-
Mordechai commented
Allow creating a custom analyzer.
For example, for a field that contains a Linux file path, I want to create an analyzer that uses the character "/" as the token delimiter.
If I have the following path: /a/b/7/c
It will be split into a, b, 7, c...
Thanks
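A path-splitting analyzer of this kind might be sketched as follows, assuming the regexSplit tokenizer from the custom analyzer documentation linked above is used with "/" as its pattern. The analyzer name pathSegments and the field name filePath are hypothetical, and the helper only approximates the expected tokens locally (it assumes empty segments are dropped), so the result should be checked against a real index.

```javascript
// Sketch of an index definition that splits a path on "/".
// The names "pathSegments" and "filePath" are made up for illustration.
const pathIndexDefinition = {
  analyzers: [
    {
      name: 'pathSegments',
      // Assumption: a regex-based split tokenizer with "/" as the delimiter.
      tokenizer: { type: 'regexSplit', pattern: '/' },
    },
  ],
  mappings: {
    dynamic: false,
    fields: {
      filePath: {
        type: 'string',
        analyzer: 'pathSegments',
        searchAnalyzer: 'pathSegments',
      },
    },
  },
};

// Local approximation of the expected tokens: split on "/" and drop the
// empty segment produced by the leading slash.
function approximatePathTokens(path) {
  return String(path)
    .split('/')
    .filter((segment) => segment.length > 0);
}

console.log(JSON.stringify(pathIndexDefinition, null, 2));
console.log(approximatePathTokens('/a/b/7/c'));
```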