Welcome to the new MongoDB Feedback Portal!
What problem are you trying to solve? Focus on the what and why of the need you have, not the how you'd like it solved.
I want to tokenize an email address. For example, the address jack.ma@gail-abc.ws.com could be tokenized as follows: jack, jack.ma, jack.m, ma, jack.ma@gail, jack.ma@gail-abc.ws.com, gail-abc.ws.com, abc.ws.com, ws.com, com, abc.ws.co. With tokens like these, I could search for the address using any of these forms of input.

What would you like to see happen? Describe the desired outcome or enhancement.

Why is this important to you or your team? Explain how the request adds value or solves a business need.

What steps, if any, are you taking today to manage this problem?

Hello,
Thank you for the detailed examples! Achieving that level of flexibility for email search, where you need to match partials like jack.m as well as specific parts like ws.com, is a very common requirement.

To support all the token examples you listed, the best approach is to use a Multi-Field mapping in Atlas Search.

This simply means we will index the email field in two different ways simultaneously:
As Text: To capture whole words (handling jack, ma, ws, com).
As Autocomplete: To capture partial characters as the user types (handling jack.m, abc.ws.co).

Here is the JSON configuration to paste into your Atlas Search Index editor. This maps the field emailAddress to use both the Standard Analyzer and the Autocomplete type.
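
The multi-field mapping described above might look roughly like the following. This is a minimal sketch, not the original configuration: the field name emailAddress comes from the reply, while the minGrams and maxGrams values and the edgeGrams helper are assumptions added purely for illustration. The script prints JSON that can be pasted into the index editor, and the helper only approximates what edge n-grams of a single token look like.

```javascript
// Sketch of an Atlas Search index definition that maps one field two ways.
// Assumptions: the field is named "emailAddress" (as in the reply above);
// the minGrams/maxGrams values are illustrative and should be tuned.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      emailAddress: [
        // Whole-word matching through the standard analyzer.
        { type: "string", analyzer: "lucene.standard" },
        // Partial (prefix) matching through edge n-grams.
        { type: "autocomplete", tokenization: "edgeGram", minGrams: 2, maxGrams: 15 },
      ],
    },
  },
};

// Hypothetical helper (not part of any MongoDB API): lists the prefixes of
// one token whose lengths fall between minGrams and maxGrams.
function edgeGrams(token, minGrams, maxGrams) {
  const grams = [];
  const longest = Math.min(token.length, maxGrams);
  for (let len = minGrams; len <= longest; len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

// Print the definition as JSON for the index editor.
console.log(JSON.stringify(indexDefinition, null, 2));
console.log(edgeGrams("jack", 2, 15)); // [ 'ja', 'jac', 'jack' ]
```

Raising maxGrams lets longer partial inputs match, at the cost of a larger index.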
To search across both of these strategies at once, you can use the compound operator in your aggregation pipeline.
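
A query along these lines could combine the two strategies. This is a sketch under stated assumptions rather than the original snippet: the function name buildEmailSearchPipeline, the index name "default", the result limit, and the projection are all hypothetical choices. It only builds and prints the pipeline, so it runs without a database connection.

```javascript
// Builds an aggregation pipeline that searches emailAddress both as
// analyzed text and as autocomplete, matching if either clause hits.
// Assumptions: index name "default" and field name "emailAddress".
function buildEmailSearchPipeline(userInput, indexName = "default") {
  return [
    {
      $search: {
        index: indexName,
        compound: {
          should: [
            // Whole-token match against the string mapping.
            { text: { query: userInput, path: "emailAddress" } },
            // Prefix match against the autocomplete mapping.
            { autocomplete: { query: userInput, path: "emailAddress" } },
          ],
          // Require at least one of the two clauses to match.
          minimumShouldMatch: 1,
        },
      },
    },
    { $limit: 10 },
    { $project: { _id: 0, emailAddress: 1 } },
  ];
}

const pipeline = buildEmailSearchPipeline("jack.m");
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the result could be passed to an aggregate call such as db.users.aggregate(buildEmailSearchPipeline("jack.m")), where users is a placeholder collection name.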
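Input jack, ma, ws, or com: The Standard Analyzer splits emails on punctuation and treats ws and com as separate tokens, so searching for them works immediately. If a whole-word search such as ws or com returns nothing, check which tokens the analyzer actually produced for your data, since a period between letters is not always treated as a word boundary.
Input jack.m or abc.ws: The Autocomplete definition creates "edge n-grams" (partial text), allowing these partial inputs to succeed.
Input jack.ma@gail...: The search will match the tokens found within the full string.

Helpful Resources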
For a deeper dive into these configurations, here is the relevant documentation:
Defining Field Mappings: How to map one field multiple ways (Multi-field).
Autocomplete Operator: Details on nGrams and partial matching.
Standard Analyzer: How text is split into tokens.
I hope this unblocks you! Please let me know if you run into any issues applying this JSON to your index.