Process Data with Analyzers
You can control how Atlas Search turns a string
field's contents into searchable
terms using analyzers. Analyzers are policies that combine a tokenizer, which
extracts tokens from text, with filters that you define. Atlas Search
applies your filters to the tokens to create indexable terms that correct for differences
in punctuation, capitalization, filler words, and more.
To control how Atlas Search creates search terms, use an Atlas Search analyzer in the index definition. You may specify an analyzer when creating an index, executing a query, or both.
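For example, an index definition that sets both the index-time analyzer and the query-time search analyzer for a single string field might look like the following sketch (the field name `title` is illustrative):

```json
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": {
        "type": "string",
        "analyzer": "lucene.simple",
        "searchAnalyzer": "lucene.simple"
      }
    }
  }
}
```

Here `analyzer` controls how the field is tokenized at index time, and `searchAnalyzer` controls how query text is tokenized; they usually match, but can differ.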
Analyzers
Atlas Search provides the following built-in analyzers:
| Analyzer | Description |
|---|---|
| Standard | Uses the default analyzer for all Atlas Search indexes and queries. |
| Simple | Divides text into searchable terms wherever it finds a non-letter character. |
| Whitespace | Divides text into searchable terms wherever it finds a whitespace character. |
| Language | Provides a set of language-specific text analyzers. |
| Keyword | Indexes text fields as single terms. |
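To illustrate how two of these analyzers differ, the following sketch approximates how the simple and whitespace analyzers tokenize the same string. This is an illustration of the tokenization rules only, not Atlas Search's actual implementation:

```javascript
// Approximation of two Atlas Search analyzers' tokenization behavior.

function simpleAnalyze(text) {
  // simple: split wherever a non-letter character appears, lowercase each token
  return text.split(/[^A-Za-z]+/).filter(Boolean).map(t => t.toLowerCase());
}

function whitespaceAnalyze(text) {
  // whitespace: split on whitespace only; case and punctuation are preserved
  return text.split(/\s+/).filter(Boolean);
}

const s = "It's a cold day.";
console.log(simpleAnalyze(s));     // [ 'it', 's', 'a', 'cold', 'day' ]
console.log(whitespaceAnalyze(s)); // [ "It's", 'a', 'cold', 'day.' ]
```

Note how the simple analyzer splits on the apostrophe and the period and lowercases everything, while the whitespace analyzer keeps `It's` and `day.` intact.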
You can also create your own custom analyzer, and you can specify alternate analyzers for the same field using the multi analyzer option.
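For example, a field definition that indexes a string with the standard analyzer while also making a keyword-analyzed alternate available for exact matching might look like the following sketch (the alternate analyzer name `keywordAnalyzer` is illustrative; you choose it yourself):

```json
{
  "type": "string",
  "analyzer": "lucene.standard",
  "multi": {
    "keywordAnalyzer": {
      "type": "string",
      "analyzer": "lucene.keyword"
    }
  }
}
```

A query can then target either the default analysis of the field or the alternate one by name.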
If you don't specify an analyzer, Atlas Search uses the standard analyzer by default.
To learn more about analyzers, see Analyzing Analyzers to Build The Right Search Index For Your App in the MongoDB Developer Center.
Normalizers
Normalizers produce only a single token at the end of analysis. You can configure normalizers only in the field definition for the Atlas Search token type. Atlas Search provides the following normalizers:
| Normalizer | Description |
|---|---|
| lowercase | Transforms text in string fields to lowercase and creates a single token for the whole string. |
| none | Doesn't perform any transformation, but still creates a single token. |
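For example, a field definition that indexes a field as the token type with the lowercase normalizer might look like the following sketch (the field name `status` is illustrative):

```json
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "status": {
        "type": "token",
        "normalizer": "lowercase"
      }
    }
  }
}
```

Because the normalizer emits a single token for the whole string, this configuration supports case-insensitive exact matching on the entire field value rather than matching on individual words.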