Use Atlas Search for Full-Text Regex Queries

If your queries rely on inefficient regex matching, run an Atlas Search query with the $search aggregation pipeline stage instead. Atlas Search improves the performance of text queries and offers more options for customizing your query parameters.

If you frequently run case-insensitive regex queries (using the i option), we recommend Atlas Search queries that use the $search aggregation pipeline stage.

You can specify collation on an index to define language-specific rules for string comparison, such as rules for lettercase and accent marks. However, collation can cause some functionality loss compared to Atlas Search queries. In non-Atlas Search environments, case-insensitive indexes do not improve performance for regex queries: the $regex query operator is not collation-aware and cannot use case-insensitive indexes effectively. Atlas Search indexes significantly improve the performance of case-insensitive queries and offer more options for customizing query parameters.

Consider an employees collection with the following documents. This collection has no indexes besides the default _id index:

// employees collection
{
  "_id": 1,
  "first_name": "Hannah",
  "last_name": "Simmons",
  "dept": "Engineering"
},
{
  "_id": 2,
  "first_name": "Michael",
  "last_name": "Hughes",
  "dept": "Security"
},
{
  "_id": 3,
  "first_name": "Wendy",
  "last_name": "Crawford",
  "dept": "Human Resources"
},
{
  "_id": 4,
  "first_name": "MICHAEL",
  "last_name": "FLORES",
  "dept": "Sales"
}

If your application frequently queries the first_name field, you may want to run case-insensitive regex queries to find matching names more easily. Case-insensitive regex also matches values stored in differing formats, as in the example above, where the first_name values include both "Michael" and "MICHAEL". However, we recommend Atlas Search queries that use the $search aggregation pipeline stage instead.

If a user searches for the string "michael", the application may run the following query:

db.employees.find( { first_name: { $regex: /michael/i } } )

Since this query specifies the $regex option i, it is case-insensitive. The query returns the following documents:

{ "_id" : 2, "first_name" : "Michael", "last_name" : "Hughes", "dept" : "Security" }
{ "_id" : 4, "first_name" : "MICHAEL", "last_name" : "FLORES", "dept" : "Sales" }
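To illustrate the matching semantics only, here is a minimal sketch in plain JavaScript, outside the database, that applies the same pattern to the sample first_name values. It does not reflect how the server executes the query; the employees array and matchedIds variable are illustrative names, not part of any MongoDB API.

```javascript
// Sample first_name values copied from the employees collection above.
const employees = [
  { _id: 1, first_name: "Hannah" },
  { _id: 2, first_name: "Michael" },
  { _id: 3, first_name: "Wendy" },
  { _id: 4, first_name: "MICHAEL" }
];

// Same pattern as the $regex query: unanchored and case-insensitive.
const pattern = /michael/i;

// Keep the _id of every document whose first_name matches the pattern.
const matchedIds = employees
  .filter((doc) => pattern.test(doc.first_name))
  .map((doc) => doc._id);

console.log(matchedIds); // [ 2, 4 ]
```

Because the pattern is unanchored, it also matches longer values that contain the string, such as "Michaela".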

Although this query returns the expected documents, case-insensitive regex queries with no index support perform poorly because the server must examine every document in the collection. To improve performance, create an Atlas Search index:

{
  "mappings": {
    "dynamic": true
  }
}
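One way to apply this definition is from mongosh, sketched below under the assumption that your deployment and mongosh version support the createSearchIndex() helper. The function name createDefaultSearchIndex is hypothetical, and the index name "default" is chosen to match the $search query used on this page.

```javascript
// Index definition from above: dynamically index all supported fields.
const searchIndexDefinition = {
  mappings: { dynamic: true }
};

// Hypothetical wrapper; `db` is the mongosh database object.
// Creates an Atlas Search index named "default" on the employees collection.
function createDefaultSearchIndex(db) {
  return db.employees.createSearchIndex("default", searchIndexDefinition);
}

// In mongosh: createDefaultSearchIndex(db)
```

You can also create the same index through the Atlas UI or the Atlas Administration API; the definition document is the same in each case.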

Collation can cause some functionality loss. When the strength field of an index's collation document is 1 or 2, the index is case-insensitive. For a detailed description of the collation document and the different strength values, see Collation Document.

For the application to use a case-insensitive index, you must also specify the same collation document from the index in the query. While you can remove the $regex operator from the previous find() method and rely on a case-insensitive index, we recommend that you use an Atlas Search query that uses the $search aggregation pipeline stage.
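For comparison, the collation-based approach described above might look like the following sketch in mongosh. The function names are hypothetical, and the collation values ('en', strength 2) are an assumption taken from the case-insensitive query on this page. The key point is that the index and the query share the same collation document.

```javascript
// Collation shared by the index and the query; strength 2 ignores case.
const caseInsensitiveCollation = { locale: "en", strength: 2 };

// Hypothetical helper: creates a case-insensitive index on first_name.
// `db` is the mongosh database object.
function createCaseInsensitiveIndex(db) {
  return db.employees.createIndex(
    { first_name: 1 },
    { collation: caseInsensitiveCollation }
  );
}

// Hypothetical helper: equality match (no $regex) using the same collation,
// so the query is eligible to use the index above.
function findByFirstName(db, name) {
  return db.employees
    .find({ first_name: name })
    .collation(caseInsensitiveCollation);
}
```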

Case-insensitive Query:

db.employees.find( { first_name: "michael" } ).collation( { locale: 'en', strength: 2 } )

Atlas Search Query:

db.employees.aggregate([
  {
    $search: {
      "index": "default",
      "text": {
        "path": "first_name",
        "query": "michael"
      }
    }
  }
])
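In an application, the search string usually comes from user input. The sketch below wraps the $search stage in a hypothetical helper, buildFirstNameSearchPipeline, that builds the pipeline for any input. The $project stage is an illustrative addition, not part of the original query.

```javascript
// Hypothetical helper: builds an aggregation pipeline that searches
// first_name with the Atlas Search text operator.
function buildFirstNameSearchPipeline(userInput, indexName = "default") {
  return [
    {
      $search: {
        index: indexName,
        text: { path: "first_name", query: userInput }
      }
    },
    // Illustrative: return only the fields the application displays.
    { $project: { _id: 0, first_name: 1, last_name: 1, dept: 1 } }
  ];
}

const pipeline = buildFirstNameSearchPipeline("michael");
// In mongosh: db.employees.aggregate(pipeline)
```

Assuming the index uses the default analyzer, which lowercases terms, this query should match both "Michael" and "MICHAEL" without any regex or collation.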

Important

Do not use the $regex operator when using a case-insensitive index for your query. The $regex implementation is not collation-aware and cannot utilize case-insensitive indexes. Instead, we recommend Atlas Search queries that use the $search aggregation pipeline stage.
