Document databases offer a variety of advantages, including:
Because of these advantages, document databases are general-purpose databases that can be used in a variety of use cases and industries.
Document databases are considered to be non-relational (or NoSQL) databases. Instead of storing data in fixed rows and columns, document databases use flexible documents. Document databases are the most popular alternative to tabular, relational databases. Learn more about NoSQL databases.
In this article, we'll explore answers to the following questions:
A document is a record in a document database. A document typically stores information about one object and any of its related metadata.
Documents store data in field-value pairs. The values can be a variety of types and structures, including strings, numbers, dates, arrays, or objects. Documents can be stored in formats like JSON, BSON, and XML.
Below is a JSON document that stores information about a user named Tom.
{
"_id": 1,
"first_name": "Tom",
"email": "tom@example.com",
"cell": "765-555-5555",
"likes": [
"fashion",
"spas",
"shopping"
],
"businesses": [
{
"name": "Entertainment 1080",
"partner": "Jean",
"status": "Bankrupt",
"date_founded": {
"$date": "2012-05-19T04:00:00Z"
}
},
{
"name": "Swag for Tweens",
"date_founded": {
"$date": "2012-11-01T04:00:00Z"
}
}
]
}
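To make the field-value structure concrete, here is a minimal Python sketch that parses an abbreviated copy of the document above with the standard `json` module and prints the type of each value. The `parse_dates` helper is hypothetical; it assumes the `{"$date": ...}` wrapper shown above always holds a UTC timestamp of the form `YYYY-MM-DDTHH:MM:SSZ`.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the Tom document from the article.
raw = """
{
  "_id": 1,
  "first_name": "Tom",
  "likes": ["fashion", "spas", "shopping"],
  "businesses": [
    {"name": "Swag for Tweens",
     "date_founded": {"$date": "2012-11-01T04:00:00Z"}}
  ]
}
"""

def parse_dates(obj):
    """object_hook: turn {"$date": "..."} wrappers into datetime values (assumed format)."""
    if set(obj) == {"$date"}:
        parsed = datetime.strptime(obj["$date"], "%Y-%m-%dT%H:%M:%SZ")
        return parsed.replace(tzinfo=timezone.utc)
    return obj

user = json.loads(raw, object_hook=parse_dates)

# Each field maps to a value that can be a scalar, an array, or nested objects.
for field, value in user.items():
    print(f"{field}: {type(value).__name__}")

founded = user["businesses"][0]["date_founded"]
print(founded.year)
```

The nested `businesses` array stays inside the user record, so the application reads it with ordinary indexing rather than a second lookup.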
A collection is a group of documents. Collections typically store documents that have similar contents.
Not all documents in a collection are required to have the same fields, because document databases have flexible schemas. Note that some document databases provide schema validation, so the schema can optionally be locked down when needed.
Continuing with the example above, the document with information about Tom could be stored in a collection named users. More documents could be added to the users collection to store information about other users. For example, the document below, which stores information about Donna, could be added to the users collection.
{
"_id": 2,
"first_name": "Donna",
"email": "donna@example.com",
"spouse": "Joe",
"likes": [
"spas",
"shopping",
"live tweeting"
],
"businesses": [
{
"name": "Castle Realty",
"status": "Thriving",
"date_founded": {
"$date": "2013-11-21T04:00:00Z"
}
}
]
}
Note that the document for Donna does not contain the same fields as the document for Tom. The users collection is leveraging a flexible schema to store the information that exists for each user.
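The difference between the two documents can be checked directly. The following is a small Python sketch, using abbreviated copies of the Tom and Donna documents and a plain list as a stand-in for the users collection (an assumption for illustration, not how a database stores it).

```python
tom = {"_id": 1, "first_name": "Tom", "email": "tom@example.com",
       "cell": "765-555-5555", "likes": ["fashion", "spas", "shopping"]}
donna = {"_id": 2, "first_name": "Donna", "email": "donna@example.com",
         "spouse": "Joe", "likes": ["spas", "shopping", "live tweeting"]}

users = [tom, donna]  # stand-in for the users collection

# Field names are just dictionary keys, so set operations show the overlap.
shared_fields = set(tom) & set(donna)
only_tom = set(tom) - set(donna)
only_donna = set(donna) - set(tom)

print(sorted(shared_fields))
print(sorted(only_tom))
print(sorted(only_donna))

# Code that reads an optional field should tolerate its absence.
spouses = [u.get("spouse") for u in users]
print(spouses)
```

Because fields can be missing, application code generally reads optional fields defensively, as the `get` call does here.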
Document databases typically have an API or query language that allows developers to execute the CRUD (create, read, update, and delete) operations.
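As a rough illustration of the four CRUD operations, here is a toy in-memory collection in Python. The class and its method names are hypothetical and chosen only for this sketch; it is not the API of any particular database, and it matches on top-level field equality only.

```python
from copy import deepcopy

class InMemoryCollection:
    """Toy stand-in for a collection; not any real database's API."""

    def __init__(self):
        self._docs = {}  # maps _id -> document

    def _matches(self, doc, query):
        # A document matches when every queried top-level field is equal.
        return all(doc.get(k) == v for k, v in query.items())

    def insert_one(self, doc):
        """Create: store a copy of the document under its _id."""
        if doc["_id"] in self._docs:
            raise KeyError(f"duplicate _id: {doc['_id']}")
        self._docs[doc["_id"]] = deepcopy(doc)

    def find(self, query):
        """Read: return copies of all matching documents."""
        return [deepcopy(d) for d in self._docs.values() if self._matches(d, query)]

    def update_one(self, query, changes):
        """Update: set fields on the first match; return True if one matched."""
        for d in self._docs.values():
            if self._matches(d, query):
                d.update(deepcopy(changes))
                return True
        return False

    def delete_one(self, query):
        """Delete: remove the first match; return True if one matched."""
        for _id, d in list(self._docs.items()):
            if self._matches(d, query):
                del self._docs[_id]
                return True
        return False

users = InMemoryCollection()
users.insert_one({"_id": 1, "first_name": "Tom", "email": "tom@example.com"})
users.insert_one({"_id": 2, "first_name": "Donna", "spouse": "Joe"})
users.update_one({"first_name": "Tom"}, {"cell": "765-555-5555"})
users.delete_one({"_id": 2})
print(users.find({"first_name": "Tom"}))
```

A real document database adds indexes, richer query operators, and durability on top of this basic shape.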
Document databases have the following key features:
Three key factors differentiate document databases from relational databases:
1. The intuitiveness of the data model: Documents map to the objects in code, so they are much more natural to work with. There is no need to decompose data across tables, run expensive joins, or integrate a separate Object Relational Mapping (ORM) layer. Data that is accessed together is stored together, so developers have less code to write and end users get higher performance.
2. The ubiquity of JSON documents: JSON has become an established standard for data interchange and storage. JSON documents are lightweight, language-independent, and human-readable. Documents are a superset of all other data models so developers can structure data in the way their applications need — rich objects, key-value pairs, tables, geospatial and time-series data, or the nodes and edges of a graph.
3. The flexibility of the schema: A document’s schema is dynamic and self-describing, so developers don’t need to first pre-define it in the database. Fields can vary from document to document. Developers can modify the structure at any time, avoiding disruptive schema migrations. Some document databases offer schema validation so you can optionally enforce rules governing document structures.
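Optional schema validation, as mentioned in the third point, can be pictured as a set of rules applied to a document before it is stored. The sketch below is a simplified assumption of how such rules might look; the `RULES` table and `validate` function are hypothetical and do not reflect any specific database's validation syntax.

```python
# field name -> (expected Python type, required?)
RULES = {
    "_id": (int, True),
    "first_name": (str, True),
    "email": (str, True),
    "cell": (str, False),
    "likes": (list, False),
}

def validate(doc, rules=RULES):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    for field, (expected_type, required) in rules.items():
        if field not in doc:
            if required:
                problems.append(f"missing required field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems

ok = validate({"_id": 1, "first_name": "Tom", "email": "tom@example.com",
               "spouse": "n/a"})
bad = validate({"_id": "1", "first_name": "Tom"})
print(ok)
print(bad)
```

Fields that the rules do not mention are still accepted, which keeps the schema flexible while enforcing the parts that matter.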
Developers commonly find working with data in documents to be easier and more intuitive than working with data in tables. Documents map to data structures in most popular programming languages. Developers don't have to worry about manually splitting related data across multiple tables when storing it or joining it back together when retrieving it. They also don't need to use an ORM to handle manipulating the data for them. Instead, they can easily work with the data directly in their applications.
Let's take another look at a document for a user named Tom.
Users
{
"_id": 1,
"first_name": "Tom",
"email": "tom@example.com",
"cell": "765-555-5555",
"likes": [
"fashion",
"spas",
"shopping"
],
"businesses": [
{
"name": "Entertainment 1080",
"partner": "Jean",
"status": "Bankrupt",
"date_founded": {
"$date": "2012-05-19T04:00:00Z"
}
},
{
"name": "Swag for Tweens",
"date_founded": {
"$date": "2012-11-01T04:00:00Z"
}
}
]
}
All of the information about Tom is stored in a single document.
Now let's consider how we can store that same information in a relational database. We'll begin by creating a table that stores the basic information about the user.
Users
ID | first_name | email | cell |
---|---|---|---|
1 | Tom | tom@example.com | 765-555-5555 |
A user can like many things (meaning there is a one-to-many relationship between a user and likes), so we will create a new table named "Likes" to store a user’s likes. The Likes table will have a foreign key that references the ID column in the Users table.
Likes
ID | user_id | like |
---|---|---|
10 | 1 | fashion |
11 | 1 | spas |
12 | 1 | shopping |
Similarly, a user can run many businesses, so we will create a new table named "Businesses" to store business information. The Businesses table will have a foreign key that references the ID column in the Users table.
Businesses
ID | user_id | name | partner | status | date_founded |
---|---|---|---|---|---|
20 | 1 | Entertainment 1080 | Jean | Bankrupt | 2012-05-19 |
21 | 1 | Swag for Tweens | NULL | NULL | 2012-11-01 |
In this simple example, we see that data about a user could be stored in a single document in a document database or spread across three tables in a relational database. When a developer wants to retrieve or update information about a user in the document database, they can write one query with zero joins. Interacting with the database is straightforward, and modeling the data in the database is intuitive.
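The comparison can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: the table and column names follow the tables above (with the date column left out for brevity), the `load_user` helper is hypothetical, and a plain dictionary stands in for the document database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, email TEXT, cell TEXT);
CREATE TABLE likes (id INTEGER PRIMARY KEY,
                    user_id INTEGER REFERENCES users(id), "like" TEXT);
CREATE TABLE businesses (id INTEGER PRIMARY KEY,
                         user_id INTEGER REFERENCES users(id),
                         name TEXT, partner TEXT, status TEXT);
INSERT INTO users VALUES (1, 'Tom', 'tom@example.com', '765-555-5555');
INSERT INTO likes VALUES (10, 1, 'fashion'), (11, 1, 'spas'), (12, 1, 'shopping');
INSERT INTO businesses VALUES
  (20, 1, 'Entertainment 1080', 'Jean', 'Bankrupt'),
  (21, 1, 'Swag for Tweens', NULL, NULL);
""")

def load_user(conn, user_id):
    """Rebuild one user record from the three tables."""
    row = conn.execute(
        "SELECT id, first_name, email, cell FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    user = dict(zip(("_id", "first_name", "email", "cell"), row))
    user["likes"] = [r[0] for r in conn.execute(
        'SELECT "like" FROM likes WHERE user_id = ? ORDER BY id', (user_id,))]
    user["businesses"] = [
        # Drop NULL columns so the result resembles the document form.
        {k: v for k, v in zip(("name", "partner", "status"), r) if v is not None}
        for r in conn.execute(
            "SELECT name, partner, status FROM businesses "
            "WHERE user_id = ? ORDER BY id", (user_id,))]
    return user

relational_user = load_user(conn, 1)

# Document side: the whole record is stored and fetched as one unit.
documents = {1: dict(relational_user)}
document_user = documents[1]

print(relational_user["likes"])
print(document_user["businesses"])
```

The relational path needs one query per table (or an equivalent join) plus reassembly code, while the document path is a single lookup by `_id`.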
Visit Mapping Terms and Concepts from SQL to MongoDB to learn more.
The document model is a superset of other data models, including key-value pairs, relational, objects, graph, and geospatial.
Due to their rich data modeling capabilities, document databases are general-purpose databases that can store data for a variety of use cases.
With document databases empowering developers to build faster, most relational databases have added support for JSON. However, simply adding a JSON data type does not bring the benefits of a database with native support for JSON. Why? Because the relational approach detracts from developer productivity, rather than improving it. These are some of the things developers have to deal with.
Working with documents means using custom, vendor-specific SQL functions, which are not familiar to most developers and don’t work with your favorite SQL tools. Add low-level JDBC/ODBC drivers and ORMs and you face complex development processes resulting in low productivity.
Presenting JSON data as simple strings and numbers rather than the rich data types supported by native document databases such as MongoDB makes computing, comparing, and sorting data complex and error prone.
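A short Python sketch shows why untyped values are error prone. The sample values are invented for illustration; the point is only that text compares character by character, so numbers and dates held as strings can sort in the wrong order until they are converted.

```python
from datetime import datetime

# Numbers held as text sort lexicographically, not numerically.
prices_as_text = ["9", "10", "100"]
text_order = sorted(prices_as_text)
numeric_order = sorted(prices_as_text, key=int)
print(text_order)
print(numeric_order)

# Dates held as text in month/day/year form sort by month first.
dates_as_text = ["11/01/2012", "05/19/2013"]
text_date_order = sorted(dates_as_text)
true_date_order = sorted(
    dates_as_text, key=lambda d: datetime.strptime(d, "%m/%d/%Y"))
print(text_date_order)
print(true_date_order)
```

With typed values, the conversion step disappears and the database itself can compare and sort correctly.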
Relational databases offer little to validate the schema of documents, so you have no way to apply quality controls against your JSON data. And you still need to define a schema for your regular tabular data, with all the extra overhead that's involved when you need to alter your tables as your application’s features evolve.
Most relational databases do not maintain statistics on JSON data, preventing the query planner from optimizing queries against documents, and you from tuning your queries.
Traditional relational databases offer no way for you to partition (shard) the database across multiple instances to scale as workloads grow. Instead you have to implement sharding yourself in the application layer, or rely on expensive scale-up systems.
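To give a sense of what implementing sharding in the application layer involves, here is a minimal hash-based routing sketch in Python. The shard names and the `shard_for` function are hypothetical, and this simple modulo scheme is only one of several possible approaches.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical instance names

def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key.

    A cryptographic digest is used instead of the built-in hash() so the
    mapping stays the same across processes and restarts.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]

# The application must route every read and write itself.
for user_id in (1, 2, 3):
    print(user_id, shard_for(user_id))
```

Even this small sketch hints at the burden: with modulo routing, changing the number of shards remaps most keys, and cross-shard queries have to be merged by the application.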
Document databases have many strengths:
These strengths make document databases an excellent choice for a general-purpose database.
A common weakness that people cite about document databases is that many do not support multi-document ACID transactions. We estimate that 80%-90% of applications that leverage the document model will not need to use multi-document transactions.
Note that some document databases like MongoDB support multi-document ACID transactions.
Visit What are ACID Transactions? to learn more about how the document model mostly eliminates the need for multi-document transactions and how MongoDB supports transactions in the rare cases where they are needed.
Document databases are general-purpose databases that serve a variety of use cases for both transactional and analytical applications:
Visit Use Case Guidance: Where to Use MongoDB to learn more about each of the applications listed above.
Document databases utilize the intuitive, flexible document data model to store data. Document databases are general-purpose databases that can be used for a variety of use cases across industries.
Get started with document databases by creating a database in MongoDB Atlas, MongoDB's developer data platform. Atlas has a generous forever-free tier you can use to experiment and explore the document model.
In MongoDB, the first field in every document is named _id. The _id field serves as a unique identifier for the document. See the official MongoDB documentation for more information.
Note that each document database management system has its own field requirements.
MongoDB stores data in BSON (Binary JSON) documents.
Yes, MongoDB has two free options:
The most obvious difference between a document database and a relational database is the way data is modeled. Document databases typically model data using flexible JSON-like documents with field-value pairs. Relational databases typically model data using rigid tables with fixed rows and columns.