Use Atlas Search for Full-Text Regex Queries
If your queries rely on inefficient regex matching, create and run an Atlas Search query with the $search aggregation pipeline stage instead. Atlas Search improves the performance of text queries and offers more options for customizing your query parameters.
Avoid Inefficient Regex Matching
If you frequently run case-insensitive regex queries (using the i option), we recommend using Atlas Search queries that use the $search aggregation pipeline stage.
You can specify collation on an index to define language-specific rules for string comparison, such as rules for lettercase and accent marks. However, collation can cause some functionality loss compared to Atlas Search queries. In non-Atlas Search environments, case-insensitive indexes do not improve performance for regex queries. The $regex query operator is not collation-aware and cannot use case-insensitive indexes effectively.
Atlas Search indexes significantly improve the performance of case-insensitive queries and offer more options for customizing query parameters.
Example
Consider an employees collection with the following documents. This collection has no indexes besides the default _id index:
// employees collection
{ "_id": 1, "first_name": "Hannah", "last_name": "Simmons", "dept": "Engineering" },
{ "_id": 2, "first_name": "Michael", "last_name": "Hughes", "dept": "Security" },
{ "_id": 3, "first_name": "Wendy", "last_name": "Crawford", "dept": "Human Resources" },
{ "_id": 4, "first_name": "MICHAEL", "last_name": "FLORES", "dept": "Sales" }
If your application frequently queries the first_name field, you may want to run case-insensitive regex queries to find matching names more easily. Case-insensitive regex also matches values stored in differing formats, as in the example above, where the first_name values include both "Michael" and "MICHAEL". However, we recommend Atlas Search queries that use the $search aggregation pipeline stage.
If a user searches for the string "michael", the application may run the following query:
db.employees.find( { first_name: { $regex: /michael/i } } )
Since this query specifies the $regex option i, it is case-insensitive. The query returns the following documents:
{ "_id" : 2, "first_name" : "Michael", "last_name" : "Hughes", "dept" : "Security" }
{ "_id" : 4, "first_name" : "MICHAEL", "last_name" : "FLORES", "dept" : "Sales" }
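As a rough illustration of what the i option does, the following standalone JavaScript sketch applies the same pattern to an in-memory copy of the sample documents. It runs in Node.js without a database. The employees array and the matches variable are stand-ins introduced here, and a real find() call evaluates the pattern on the server rather than in application code.

```javascript
// In-memory copy of the sample documents (illustrative stand-in for the collection).
const employees = [
  { _id: 1, first_name: "Hannah", last_name: "Simmons", dept: "Engineering" },
  { _id: 2, first_name: "Michael", last_name: "Hughes", dept: "Security" },
  { _id: 3, first_name: "Wendy", last_name: "Crawford", dept: "Human Resources" },
  { _id: 4, first_name: "MICHAEL", last_name: "FLORES", dept: "Sales" },
];

// Same pattern as the query above: "michael" with the case-insensitive i flag.
const pattern = /michael/i;

// With no supporting index, every document is tested against the pattern.
const matches = employees.filter((doc) => pattern.test(doc.first_name));

console.log(matches.map((doc) => doc._id)); // [ 2, 4 ]
```

Every document is examined, including those that cannot match, which is the cost that an index is meant to avoid.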
Although this query returns the expected documents, a case-insensitive regex query with no index support performs poorly because MongoDB must scan every document in the collection. To improve performance, create an Atlas Search index:
{ "mappings": { "dynamic": true } }
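One way to create an index with that definition is from mongosh, assuming a deployment and shell version that support the createSearchIndex() helper. The following is a minimal sketch rather than the required procedure: the index name "default" and the createEmployeesSearchIndex function are hypothetical, and the db parameter stands for the shell's database object.

```javascript
// Index definition from the text above: dynamically map all supported fields.
const searchIndexDefinition = { mappings: { dynamic: true } };

// Hypothetical helper: creates the Atlas Search index on the employees collection.
// "default" is an assumed index name, not one given in the text.
function createEmployeesSearchIndex(db) {
  return db.employees.createSearchIndex("default", searchIndexDefinition);
}
```

In mongosh you would call createEmployeesSearchIndex(db) once. You can also create the same index through the Atlas UI.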
Alternatively, you can create a case-insensitive index by specifying a collation, but collation can cause some functionality loss. When the strength field of an index's collation document is 1 or 2, the index is case-insensitive. For a detailed description of the collation document and the different strength values, see Collation Document.
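To get a feel for what the strength values mean, the sketch below uses JavaScript's built-in Intl.Collator as an approximation. This is an analogy, not MongoDB's collation implementation: the mapping of sensitivity "base" to strength 1 and "accent" to strength 2 is an assumption made for illustration, and the sample strings "Jose" and "José" are invented.

```javascript
// Approximate collation strengths with Intl.Collator sensitivity levels.
// Assumed mapping (illustration only): "base" ~ strength 1, "accent" ~ strength 2.
const strength1 = new Intl.Collator("en", { sensitivity: "base" });
const strength2 = new Intl.Collator("en", { sensitivity: "accent" });

// compare() returns 0 when the collator treats the two strings as equal.
const sameAtStrength1 = (a, b) => strength1.compare(a, b) === 0;
const sameAtStrength2 = (a, b) => strength2.compare(a, b) === 0;

console.log(sameAtStrength2("Michael", "MICHAEL")); // true: case is ignored
console.log(sameAtStrength2("Jose", "José"));       // false: accents still matter
console.log(sameAtStrength1("Jose", "José"));       // true: case and accents are ignored
```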
For the application to use the case-insensitive index, you must also specify the same collation document from the index in the query. While you can remove the $regex operator from the previous find() method and use the case-insensitive index, we recommend that you use an Atlas Search query that uses the $search aggregation pipeline stage.
Case-insensitive Query | Atlas Search Query
---|---
 |
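The two approaches compared above might look roughly like the following mongosh sketch. It is an illustration under stated assumptions, not the original example: the index name "default", the en locale with strength 2, and the helper function names are all hypothetical, and the collation query assumes that an index on first_name was created with that same collation document.

```javascript
// Collation document; assumed to match the one used when the index was created.
const collation = { locale: "en", strength: 2 };

// Atlas Search pipeline using the text operator on first_name.
// "default" is an assumed index name.
const searchPipeline = [
  {
    $search: {
      index: "default",
      text: { query: "michael", path: "first_name" },
    },
  },
];

// Case-insensitive query: plain equality match (no $regex) plus the collation.
function findWithCollation(db) {
  return db.employees.find({ first_name: "michael" }).collation(collation);
}

// Atlas Search query: run the $search stage through aggregate().
function findWithSearch(db) {
  return db.employees.aggregate(searchPipeline);
}
```

In mongosh, pass the shell's db object to either helper, for example findWithSearch(db). Note that the collation version uses an equality match, so it no longer finds substrings the way the regex did.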
Important
Do not use the $regex operator when using a case-insensitive index for your query. The $regex implementation is not collation-aware and cannot utilize case-insensitive indexes. Instead, we recommend Atlas Search queries that use the $search aggregation pipeline stage.
Learn More
To learn more about Atlas Search queries, see Create and Run Atlas Search Queries.
To learn more about case-insensitive indexes with illustrative examples, see Case Insensitive Indexes.
To learn more about regex queries in MongoDB, see $regex.
MongoDB University offers a free course on optimizing MongoDB Performance. To learn more, see Monitoring and Insights.