Cloud Firestore vs. Cosmos DB: Which NoSQL database is right for you?

Ezfire

In what can only be described as the "Cloud Wars," cloud computing providers such as Amazon, Microsoft, and Google are competing for market share with unique, proprietary, managed database solutions designed to seamlessly handle the scale of today's fastest-growing businesses. In particular, each of these providers has its own take on the high-performance NoSQL document store: DynamoDB in AWS, Cloud Firestore in GCP, and Cosmos DB in Microsoft Azure. Previously, we analyzed the similarities and differences of DynamoDB and Cloud Firestore. Now you might be wondering, "How does Cosmos DB stack up against Cloud Firestore? When should I use one of these databases over the other?"

Not to worry, because in this article we are going to compare and contrast Cloud Firestore and Cosmos DB, analyze their similarities and differences, and zero in on the use cases where they excel.

As in the previous article, all Firestore queries included in this article can be run directly on Ezfire by signing up for our app.

Data Model

Cosmos DB is a truly multi-paradigm database. It provides a number of different APIs you can choose from to interact with the database, such as the Core (SQL) API, Cassandra API, MongoDB API, Gremlin API and Table API. Depending on the API you provision your database instance with, you can have Cosmos behave as a key-value store, a document store or even a graph database. However, for the purposes of this article, we will be looking at the Core (SQL) API. This is the recommended choice for most new use cases, and is also the API mode that receives updates and new features first.

With the Core (SQL) API, Cosmos DB behaves as a document database with a SQL API. Data in Cosmos is represented as containers full of items, where each item is a JSON document. These JSON documents are schemaless and can take any shape, including nested JSON objects. For each container in your database you specify a partition key, which Cosmos DB uses to logically partition the data set. Documents with equal partition key values are stored in the same logical partition. The logical partitions of Cosmos define the scope of database transactions: you can update items within a single logical partition using a transaction with snapshot isolation.
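To picture how a partition key carves a container into logical partitions, here is a minimal sketch in plain TypeScript. It does not call Cosmos DB at all; `groupByPartitionKey`, `ContainerItem` and the `country` field are hypothetical names chosen for illustration, and the real service performs this grouping internally.

```typescript
// Hypothetical item shape; "country" stands in for whatever partition key you choose.
type ContainerItem = { id: string; country: string; [field: string]: unknown };

// Group items the way a partition key conceptually groups them:
// items with equal key values land in the same logical partition.
function groupByPartitionKey(items: ContainerItem[]): Map<string, ContainerItem[]> {
  const logicalPartitions = new Map<string, ContainerItem[]>();
  for (const item of items) {
    const bucket = logicalPartitions.get(item.country) ?? [];
    bucket.push(item);
    logicalPartitions.set(item.country, bucket);
  }
  return logicalPartitions;
}

const samplePartitions = groupByPartitionKey([
  { id: "1", country: "GB" },
  { id: "2", country: "JP" },
  { id: "3", country: "GB" },
]);
```

In this sketch, items "1" and "3" share a logical partition and could be updated together in one transaction, while item "2" lives in a different partition and could not take part in it.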

In contrast, Cloud Firestore has a much simpler model and only one supported API. A Cloud Firestore database is made up of collections of JSON documents, each referred to by a REST-like resource path. Like Cosmos containers, Cloud Firestore collections are schemaless. Cloud Firestore also supports ACID transactions with serializable isolation, but without the logical-partition constraint that Cosmos imposes. In fact, Firestore does not have any concept like logical partitions.

What makes Firestore's model unique is that, in order to provide stable, predictable query performance for data sets of any size, Cloud Firestore requires that every query be backed by an index. In fact, when you write a new document to your database, Cloud Firestore will automatically create a single-field index for each field in your document. For composite queries involving more than one field, you need to manually create an appropriate composite index to perform the query. If an appropriate index does not exist at query time, the query will fail.

Another unique element of the Cloud Firestore data model is the specialized data types it supports. At the database level, Firestore has support for a timestamp data type, document reference data type and geopoint data type that allow for special handling and semantics.

Another particularly useful feature of Cloud Firestore is that it can be called directly from a client application and applications can even listen to specific documents for changes, making it easy to build features that require realtime functionality.
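As a rough sketch of what listening to a document looks like, the helper below wraps an `onSnapshot` call. To stay self-contained it uses small structural types (`SnapshotLike`, `DocRefLike`) that cover only the calls used here and are modelled on the document reference in @google-cloud/firestore; `watchDocument` itself is a hypothetical helper name, not part of any SDK.

```typescript
// Minimal structural types covering only the calls this sketch uses.
type SnapshotLike = { exists: boolean; data(): Record<string, unknown> | undefined };
type DocRefLike = {
  onSnapshot(
    onNext: (snapshot: SnapshotLike) => void,
    onError?: (error: Error) => void
  ): () => void; // returns an unsubscribe function
};

// Hypothetical helper: invoke `onChange` every time the document changes.
// A deleted or missing document is reported as `undefined`.
function watchDocument(
  ref: DocRefLike,
  onChange: (data: Record<string, unknown> | undefined) => void
): () => void {
  return ref.onSnapshot(
    (snapshot) => onChange(snapshot.exists ? snapshot.data() : undefined),
    (error) => console.error("Listener failed:", error.message)
  );
}

// With a real client this might be used as:
//   const unsubscribe = watchDocument(db.doc("users/john"), (data) => console.log(data));
//   ...later, when updates are no longer needed:
//   unsubscribe();
```

Returning the unsubscribe function to the caller keeps the listener's lifetime explicit, which matters because an open listener keeps receiving updates until it is detached.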

Reading Data

For both Cloud Firestore and Cosmos DB, data is queried from the database using the appropriate SDK. For Node.js, you can use the @google-cloud/firestore package for Cloud Firestore and the @azure/cosmos package for Cosmos DB. In each of the examples below, db will refer to the database client object from the corresponding SDK.

With the Cosmos DB Core (SQL) API, there are two primary ways of reading data from the database. If you have a document's id (and optionally the partition key of the partition it lives in), you can read that document directly from the database like this:

db.container(containerId).item(id, partitionKey).read();

Now, what if you want to query across the container? These types of queries are written using a simplified SQL syntax that lets you express filters, sorts, aggregates and joins on your data in a familiar way.

To make the following examples more concrete, suppose we have a container with id Users that we want to query in our database. The documents in this container have roughly the following shape:

type User = {
  type: "User";
  id: string;
  name: string;
  email: string;
  age: number;
  authProvider: string;
};

Now suppose we want to find a given user document by their email. To do this, we can execute the following query:

db.container("Users")
  .items.query({
    query: `SELECT * FROM Users u WHERE u.email = @email`,
    parameters: [{ name: "@email", value: "robert@speedwagon.foundation" }],
  })
  .fetchAll();

This query will return a list of documents whose email field matches the specified parameter. Cosmos DB SQL queries also support applying multiple filters to your data set. For example, if you wanted to find the first 10 users aged 25 or older who sign in with Google, you could do this:

db.container("Users")
  .items.query({
    query: `SELECT TOP 10 * FROM Users u WHERE u.authProvider = @provider AND u.age >= @age`,
    parameters: [
      { name: "@provider", value: "google" },
      { name: "@age", value: 25 },
    ],
  })
  .fetchAll();

To sort the above data, one can modify the query to include an ORDER BY clause, for example:

SELECT TOP 10 * FROM Users u WHERE u.authProvider = @provider AND u.age >= @age ORDER BY u.age

Finally, Cosmos DB supports aggregations such as COUNT, SUM, MAX, MIN and AVG across documents in a container, for example:

-- Returns the total number of users in the container.
SELECT VALUE COUNT(1) FROM Users

In Cloud Firestore, as in Cosmos DB, there are two ways to read data from the database. The first is to read a document directly by its resource path. If you recall, each document in Cloud Firestore is identified by a unique resource path like users/:userId. If you know the resource path of a given document, you can fetch that document directly from the database:

db.doc(`users/${userId}`).get();

// Works with nested collections as well.
db.doc(`users/${userId}/addresses/${addressId}`).get();

What if you don't know the id of the document you are looking for, or want to find all documents matching some set of conditions? As in Cosmos DB, you can query the collection. Cloud Firestore supports many of the same features as Cosmos DB, such as filters on multiple document fields, multiple sort orders, and cursor-based pagination. Below are some examples of such queries:

// All users who are 25 years old.
db.collection("users").where("age", "==", 25).get();

// All users in their 30s.
db.collection("users").orderBy("age", "asc").startAt(30).endBefore(40).get();

// First 10 users 30 and over
db.collection("users").orderBy("age", "asc").startAt(30).limit(10).get();

// Users who are not 25.
db.collection("users").where("age", "!=", 25).get();

// All users using google or facebook login.
db.collection("users")
  .where("authProvider", "in", ["google", "facebook"])
  .get();

// The first 10 users aged 25 or older who sign in with Google.
db.collection("users")
  .where("age", ">=", 25)
  .where("authProvider", "==", "google")
  .limit(10)
  .get();

One caveat of queries in Cloud Firestore is that they must always be backed by an index. This ensures that any query you can perform will always be fast. The drawback is that creating additional query-specific indexes becomes commonplace, and they can be a little hard to manage.
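To give a feel for what such an index looks like, the sketch below describes a composite index for the last query above (an equality filter on authProvider combined with a range filter on age). It assumes the index is managed through the Firebase CLI's firestore.indexes.json file and mirrors the entry shape used there as a TypeScript object; `CompositeIndex` and `usersByProviderAndAge` are names invented for this example, so check the result against the official index reference before deploying it.

```typescript
// Shape of one composite index entry, modelled on the Firebase CLI's firestore.indexes.json.
type CompositeIndex = {
  collectionGroup: string;
  queryScope: "COLLECTION" | "COLLECTION_GROUP";
  fields: { fieldPath: string; order: "ASCENDING" | "DESCENDING" }[];
};

// Index intended to back:
//   where("authProvider", "==", "google") combined with where("age", ">=", 25)
// The equality field is listed before the range field.
const usersByProviderAndAge: CompositeIndex = {
  collectionGroup: "users",
  queryScope: "COLLECTION",
  fields: [
    { fieldPath: "authProvider", order: "ASCENDING" },
    { fieldPath: "age", order: "ASCENDING" },
  ],
};

// Serialized form, suitable for pasting into the "indexes" array of the JSON file.
const usersIndexJson = JSON.stringify(usersByProviderAndAge, null, 2);
```

Keeping index definitions in a file under version control is one way to make the growing set of query-specific indexes easier to review and manage.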

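One notable difference between Cloud Firestore and Cosmos DB is that, at the time of writing, Cloud Firestore does not support database-level aggregations, so any required aggregates must be computed at the application level. This has both performance and cost implications, because every document that contributes to the aggregate must be read and transferred to your application.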
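Computing an aggregate at the application level boils down to fetching the matching documents and reducing them in your own code. The sketch below shows the reduction step for an average age; `UserAge` and `averageAge` are hypothetical names, and the commented-out fetch shows roughly where the input would come from with a real client.

```typescript
// Subset of the User shape used earlier in this article.
type UserAge = { age: number };

// Compute the average age in application code from documents that were already fetched.
// Returns undefined for an empty result set instead of dividing by zero.
function averageAge(users: UserAge[]): number | undefined {
  if (users.length === 0) return undefined;
  const total = users.reduce((sum, user) => sum + user.age, 0);
  return total / users.length;
}

// With a real client the input would come from something like:
//   const snapshot = await db.collection("users").get();
//   const users = snapshot.docs.map((doc) => doc.data() as UserAge);
const sampleAverageAge = averageAge([{ age: 20 }, { age: 30 }, { age: 40 }]);
```

Every document passed to `averageAge` had to be read first, which is where the cost and latency of this approach come from.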

Writing Data

What about writing data to the database? In this case, Cosmos DB and Cloud Firestore are much more aligned. To write a new document to a Cosmos DB container, you need only call the create method on the container's items and provide an object with at least an appropriate id:

db.container("Users").items.create({
  type: "User",
  id: "1",
  name: "Robert E. O. Speedwagon",
  email: "robert@speedwagon.foundation",
  age: 50,
  authProvider: "google",
});

To modify a document you must reference it by its id (and optionally its partition key). There are three supported ways of modifying an existing document: you can patch the data in the database with a partial update, you can replace the data completely, or you can delete the document. Below are some examples of this in action:

// Overwrite only the age field using a patch operation.
db.container("Users")
  .item("1")
  .patch([{ op: "replace", path: "/age", value: 55 }]);

// Completely overwrite the existing data.
db.container("Users").item("1").replace({
  type: "User",
  id: "1",
  name: "Joseph Joestar",
  email: "joseph@speedwagon.foundation",
  age: 18,
  authProvider: "facebook",
});

// Delete this document from the container.
db.container("Users").item("1").delete();

Writing data to Cloud Firestore is very similar with a few small differences. In the Firestore world, you only need to specify a document reference as described in the previous section and call the set method to set the document data for that reference:

// Write the document `users/john` to the database.
db.collection("users").doc("john").set({
  name: "John",
  email: "john@example.com",
  age: 25,
});

Calling set on a document reference creates the document if it does not already exist, or completely overwrites an existing document. To update an existing document you can use either the update method or the set method in merge mode. When using update, you provide a JavaScript object identifying the field paths in the document to update:

db.collection("users").doc("john").update({
  age: 26,
  "settings.useMetric": true, // Updates the nested value only
});

When using set in merge mode, Firestore will deep merge the incoming JavaScript object with the document already in the database:

// Performs the same update as above.
db.collection("users")
  .doc("john")
  .set(
    {
      age: 26,
      settings: {
        useMetric: true,
      },
    },
    { merge: true }
  );
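To make the difference between overwriting and merging concrete, here is a small in-memory approximation of a deep merge. It is not Firestore's implementation: `deepMerge`, `isPlainObject` and `DocData` are hypothetical names, and the sketch assumes that nested maps are merged field by field while every other value, including arrays, is replaced by the incoming one.

```typescript
type DocData = { [field: string]: unknown };

function isPlainObject(value: unknown): value is DocData {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Approximation of set(..., { merge: true }): nested maps merge field by field,
// everything else (including arrays) is replaced by the incoming value.
function deepMerge(existing: DocData, incoming: DocData): DocData {
  const result: DocData = { ...existing };
  for (const [key, value] of Object.entries(incoming)) {
    const current = result[key];
    result[key] =
      isPlainObject(current) && isPlainObject(value)
        ? deepMerge(current, value)
        : value;
  }
  return result;
}

const storedUser = { name: "John", age: 25, settings: { theme: "dark", useMetric: false } };
const mergedUser = deepMerge(storedUser, { age: 26, settings: { useMetric: true } });
// mergedUser keeps `name` and `settings.theme`; only `age` and `settings.useMetric` change.
// A plain set() without merge would instead leave only { age: 26, settings: { useMetric: true } }.
```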

To delete a document, you simply need to call the delete method:

// Delete the document.
db.collection("users").doc("john").delete();

Cost

Cloud Firestore has an easy-to-understand pay-per-use cost structure. The costs are $0.06 per 100,000 reads, $0.18 per 100,000 writes, and $0.02 per 100,000 deletes. This makes the cost of an application easy to estimate from its operation counts, but it also has the potential to become very expensive if app traffic grows rapidly.

Cosmos DB, in contrast, is a little more flexible when it comes to pricing. Cosmos DB offers a provisioned throughput pricing model, useful for predictable workloads, and a serverless pricing model, useful for unpredictable and spiky workloads. Both pricing models depend on the concept of request units (RU), where 1 request unit represents a single read of a 1 KB document in the database.

In provisioned throughput mode, you specify a number of request units per second (RU/s) to be available for your consumption. Provisioned capacity costs $0.008/hour per 100 RU/s.

In serverless mode, you pay for the exact number of RUs that you use during your billing period. The cost in serverless mode is $0.282 / 1M RU.

Considering the relative costs, Cosmos DB works out to be much cheaper than Cloud Firestore for the same serverless workloads. Furthermore, for predictable workloads, you can achieve even more cost savings in provisioned throughput mode for which there is no alternative in Cloud Firestore.
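The comparison can be checked with a little arithmetic. The sketch below uses only the prices quoted in this article and covers reads only; it assumes every read is a point read of a 1 KB document that consumes exactly 1 RU, and the function names are made up for this example. Writes and queries consume more RUs per operation in Cosmos DB, and real prices vary by region and over time.

```typescript
// Prices quoted in this article (USD).
const FIRESTORE_PRICE_PER_100K_READS = 0.06;
const COSMOS_SERVERLESS_PRICE_PER_1M_RU = 0.282;
const COSMOS_PROVISIONED_PRICE_PER_100_RUS_PER_HOUR = 0.008;

// Cost of `reads` document reads in Cloud Firestore.
function firestoreReadCost(reads: number): number {
  return (reads / 100_000) * FIRESTORE_PRICE_PER_100K_READS;
}

// Cost of `reads` point reads of 1 KB documents in Cosmos DB serverless,
// assuming each such read consumes exactly 1 RU.
function cosmosServerlessReadCost(reads: number): number {
  return (reads / 1_000_000) * COSMOS_SERVERLESS_PRICE_PER_1M_RU;
}

// Cost of keeping `ruPerSecond` provisioned for `hours` hours, whether or not it is used.
function cosmosProvisionedCost(ruPerSecond: number, hours: number): number {
  return (ruPerSecond / 100) * COSMOS_PROVISIONED_PRICE_PER_100_RUS_PER_HOUR * hours;
}

const oneMillionReads = 1_000_000;
console.log(firestoreReadCost(oneMillionReads).toFixed(3)); // "0.600"
console.log(cosmosServerlessReadCost(oneMillionReads).toFixed(3)); // "0.282"
console.log(cosmosProvisionedCost(400, 730).toFixed(2)); // "23.36" for 400 RU/s over a 730-hour month
```

Under these assumptions, one million small reads cost a little less than half as much in Cosmos DB serverless as in Cloud Firestore. Provisioned throughput is billed by the hour regardless of usage, so it pays off only when the reserved capacity is actually consumed.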

Conclusion

In this article, we took a look at Cloud Firestore by Google Cloud and Cosmos DB by Microsoft Azure and compared their features and use cases. We found that Cosmos DB is a very flexible NoSQL database, offering a variety of different APIs and compatibility modes, whereas Cloud Firestore has only one API. In its Core (SQL) API, Cosmos DB offers a simplified SQL-like query language that supports filters, sorts and aggregates. Cloud Firestore supports many of the same query patterns but cannot perform aggregations at the database level. We also found that Cosmos DB is much cheaper than Cloud Firestore for the same workload.

In general, both databases will work well for reading and writing document data of various shapes. However, if aggregating data across collections is a requirement, Cosmos DB will be the better choice. Cosmos is also better for cost savings. On the other hand, if realtime updates are a requirement for your app, Cloud Firestore is a good choice because it makes realtime apps easy to build. Thanks for reading. I hope you learned something useful.

If you are interested in getting more experience with Cloud Firestore, sign up for our app and try out some of the queries in this article on our sample database.
