Cloud Firestore vs. DynamoDB: Which BaaS is right for you?

Ezfire

Firebase is a popular backend-as-a-service and app development platform offered as part of Google Cloud Platform. Firebase provides mature, easy-to-integrate solutions for authentication, analytics and data storage that can be used directly from client apps. At the core of the Firebase platform is the Cloud Firestore database. Cloud Firestore is a NoSQL, JSON document store, built for applications at scale, that gives developers the tools to build reactive applications without the need for a custom backend.

Following in Google's path, Amazon has recently released AWS Amplify, a backend-as-a-service platform within AWS that is trying to compete for the same users as Firebase. To provide in-app data storage like Cloud Firestore, AWS has integrated its existing DynamoDB database into Amplify.

In this article we are going to compare and contrast Cloud Firestore and DynamoDB, analyze their similarities and differences, and zero in on the use cases where they excel.

As in the previous article, all Firestore queries included in this article can be run directly on Ezfire by signing up for our app.

Data Model

As mentioned earlier, Cloud Firestore is a schemaless, NoSQL JSON document store. In Cloud Firestore, data is organized into collections of JSON documents, each referred to by a REST-like resource path, such as users/:userId where userId is a token that is unique within the users collection. One interesting feature of this model is that collections can be nested under documents as subcollections, for example users/:userId/addresses/:addressId. This makes it easy to express the hierarchy of one's data model directly in the references to its documents. Creating new collections in Firestore is convenient, as they are created on demand as data is written to the database.
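
Because collections and documents alternate along a resource path, the number of segments tells you what a path points at. The following is a minimal sketch of that rule in plain JavaScript. The describePath helper is hypothetical and is not part of any Firebase SDK.

```javascript
// Classify a Firestore-style resource path by counting its segments.
// Collections and documents alternate, so an odd number of segments
// names a collection and an even number names a document.
// NOTE: describePath is an illustrative helper, not an SDK function.
function describePath(path) {
  const segments = path.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) {
    throw new Error("Path must contain at least one segment");
  }
  const isDocument = segments.length % 2 === 0;
  return {
    kind: isDocument ? "document" : "collection",
    // The ID is always the last segment of the path.
    id: segments[segments.length - 1],
    // The parent is everything before the last segment, or null at the root.
    parent: segments.length > 1 ? segments.slice(0, -1).join("/") : null,
  };
}

console.log(describePath("users/john"));
console.log(describePath("users/john/addresses"));
console.log(describePath("users/john/addresses/home"));
```

Here users/john/addresses is a subcollection whose parent is the document users/john, which is how the hierarchy ends up encoded in the path itself.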

DynamoDB is also a schemaless, NoSQL JSON document store like Cloud Firestore, with a few key differences. In DynamoDB, the core components are tables, items and attributes, where tables are collections of items and items are collections of attributes. Unlike Cloud Firestore, where collections are created on demand, tables and their associated primary keys in DynamoDB need to be created ahead of time, for example with the AWS SDK. A primary key can either be a single partition key on one attribute or a composite partition and sort key on two attributes. For example, the following code will create a simple "Movies" table with a composite primary key on year and title.

const AWS = require("aws-sdk");
const dynamodb = new AWS.DynamoDB();
dynamodb.createTable({
  TableName : "Movies",
  KeySchema: [
    { AttributeName: "year", KeyType: "HASH"},  //Partition key
    { AttributeName: "title", KeyType: "RANGE" }  //Sort key
  ],
  AttributeDefinitions: [
    { AttributeName: "year", AttributeType: "N" },
    { AttributeName: "title", AttributeType: "S" }
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 10,
    WriteCapacityUnits: 10
  }
}, err => {
  ...
});

This tells DynamoDB to store your JSON in a table where items are identified by the unique pair of their year and title attributes.

It should be noted here that both Cloud Firestore and DynamoDB support features such as ACID transactions and change data capture, with the ability to trigger serverless functions on database write events. Cloud Firestore additionally supports listening for updates in realtime directly from client apps using the various Firebase frontend SDKs.

Reading Data

For both Cloud Firestore and DynamoDB, data is queried from the database using the appropriate SDK. In Node.js, you can use the @google-cloud/firestore package for Cloud Firestore and the aws-sdk package for DynamoDB. In each of the examples below, db refers to the database client object from the relevant SDK. The DynamoDB examples use the AWS.DynamoDB.DocumentClient interface, which accepts and returns plain JavaScript values.

As mentioned earlier, each document in Cloud Firestore is identified by a unique resource path like users/:userId. To fetch a specific document from the database, you only need to know its resource path, also referred to as the document reference. With this reference, you can load the document like this:

db.doc(`users/${userId}`).get();

// Works with nested collections as well.
db.doc(`users/${userId}/addresses/${addressId}`).get();

With the SDK, you can also programmatically build the document reference from each of the segments:

db.collection("users").doc(userId).get();

db.collection("users").doc(userId).collection("addresses").doc(addressId).get();

What if you don't know the id of the document you are looking for, or want to find all documents matching some set of conditions? In that case you can simply query the collection. Cloud Firestore supports queries with filters on multiple document fields, multiple sort orders as well as cursor-based pagination. Below are some examples of such queries:

// All users who are 25 years old.
db.collection("users").where("age", "==", 25).get();

// All users in their 30s.
db.collection("users").orderBy("age", "asc").startAt(30).endBefore(40).get();

// First 10 users 30 and over
db.collection("users").orderBy("age", "asc").startAt(30).limit(10).get();

// Users who are not 25.
db.collection("users").where("age", "!=", 25).get();

// All users using google or facebook login.
db.collection("users")
  .where("authProvider", "in", ["google", "facebook"])
  .get();

// The first 10 users aged 25 or over who sign in with Google.
db.collection("users")
  .where("age", ">=", 25)
  .where("authProvider", "==", "google")
  .limit(10)
  .get();

One caveat of queries in Cloud Firestore is that they must always be backed by an index. This ensures that any query you can perform will always be fast. The drawback is that creating additional query-specific indexes becomes commonplace and a little hard to manage.

In DynamoDB, items are identified uniquely by their primary key. As an example, let's consider a users table in DynamoDB with the following schema:

{
  TableName: "Users",
  KeySchema: [
    { AttributeName: "email", KeyType: "HASH"},  //Partition key
  ],
  AttributeDefinitions: [
    { AttributeName: "email", AttributeType: "S" },
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 10,
    WriteCapacityUnits: 10
  }
}

If we know the primary key of an item, we can easily fetch that item in the following way:

db.get({
  TableName: "Users",
  Key: {
    email: userEmail,
  },
}).promise();

This will return the JSON data for the item with primary key userEmail.

If you have a composite primary key, like in the "Movies" example above, you can also fetch items by specifying the whole key:

db.get({
  TableName: "Movies",
  Key: {
    year: year,
    title: title,
  },
}).promise();

Also, when working with a composite primary key, you can query across the key. This is done by fixing a partition key value and selecting a range along the sort key:

db.query({
  TableName: "Movies",
  KeyConditionExpression: "#yr = :y and title between :l1 and :l2",
  ExpressionAttributeNames: {
    "#yr": "year"
  },
  ExpressionAttributeValues: {
    ":y": 1985,
    ":l1": "A",
    ":l2": "L",
  }
}).promise();

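The final method of querying for data in DynamoDB is called a scan. When you scan a table, the database reads every item in the table and applies an optional filter to build the result set. Because the filter is applied after the items are read, scanning a large table is slow and consumes read capacity for every item examined.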

// Get the movies from the 1990s.
db.scan({
  TableName: "Movies",
  FilterExpression: "#yr between :start and :end",
  ExpressionAttributeNames: {
    "#yr": "year"
  },
  ExpressionAttributeValues: {
    ":start": 1990,
    ":end": 1999,
  }
}).promise();
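
To make the difference between a query and a scan concrete, here is a small in-memory sketch in plain JavaScript. It does not talk to DynamoDB. The sample movies and the simulateQuery and simulateScan helpers are made up for illustration, and the model is deliberately simplified.

```javascript
// A tiny in-memory stand-in for the "Movies" table (sample data is made up).
const movies = [
  { year: 1985, title: "Back to the Future" },
  { year: 1985, title: "Brazil" },
  { year: 1985, title: "The Goonies" },
  { year: 1994, title: "Pulp Fiction" },
  { year: 1999, title: "The Matrix" },
];

// Group items by partition key and keep each partition sorted by sort key,
// which is roughly how a composite primary key organizes the data.
const partitions = new Map();
for (const movie of movies) {
  if (!partitions.has(movie.year)) partitions.set(movie.year, []);
  partitions.get(movie.year).push(movie);
}
for (const items of partitions.values()) {
  items.sort((a, b) => (a.title < b.title ? -1 : a.title > b.title ? 1 : 0));
}

// Query: go straight to one partition, then select an inclusive sort-key range.
// DynamoDB reads only the items inside the key range, so this sketch counts
// the matching items as the items read.
function simulateQuery(year, from, to) {
  const items = partitions.get(year) || [];
  const matches = items.filter((m) => m.title >= from && m.title <= to);
  return { items: matches, itemsRead: matches.length };
}

// Scan: read every item in the table, then apply the filter.
function simulateScan(predicate) {
  let itemsRead = 0;
  const matches = [];
  for (const items of partitions.values()) {
    for (const item of items) {
      itemsRead += 1;
      if (predicate(item)) matches.push(item);
    }
  }
  return { items: matches, itemsRead };
}

const queried = simulateQuery(1985, "A", "L");
const scanned = simulateScan((m) => m.year >= 1990 && m.year <= 1999);
console.log("query read", queried.itemsRead, "items");
console.log("scan read", scanned.itemsRead, "items");
```

Both calls return two movies, but the scan has to touch all five items to find them, and that gap grows with the size of the table.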

Writing Data

Writing data to Cloud Firestore is very simple. You only need to specify a document reference as described in the previous section and call the set method to set the document data for that reference:

// Write the document `users/john` to the database.
db.collection("users").doc("john").set({
  name: "John",
  email: "john@example.com",
  age: 25,
});

Calling set on a document reference creates a document if it does not already exist or completely overwrites an existing document. To update an existing document you can use either the update method or the set method in merge mode. When using update, you provide a JavaScript object identifying the field paths in the document to update:

db.collection("users").doc("john").update({
  age: 26,
  "settings.useMetric": true, // Updates the nested value only
});

When using set in merge mode, Firestore will deep merge the incoming JavaScript object with the document already in the database:

// Performs the same update as above.
db.collection("users")
  .doc("john")
  .set(
    {
      age: 26,
      settings: {
        useMetric: true,
      },
    },
    { merge: true }
  );
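
The difference between a plain set and a set in merge mode can be modeled with a few lines of plain JavaScript. This is only a simplified sketch of the behavior described above, not the Firestore implementation. The mergeSet and overwriteSet helpers and the settings.theme field are hypothetical, and the sketch ignores Firestore-specific details such as sentinel values.

```javascript
// True for plain nested maps; arrays and null are treated as simple values.
function isMap(value) {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Simplified model of set(data, { merge: true }): nested maps are merged
// field by field, everything else is replaced by the incoming value.
function mergeSet(existing, incoming) {
  const result = { ...existing };
  for (const [key, value] of Object.entries(incoming)) {
    result[key] =
      isMap(value) && isMap(existing[key])
        ? mergeSet(existing[key], value)
        : value;
  }
  return result;
}

// Simplified model of plain set(data): the stored document is discarded.
function overwriteSet(_existing, incoming) {
  return { ...incoming };
}

// A stored document and an incoming change (sample data is made up).
const stored = {
  name: "John",
  age: 25,
  settings: { useMetric: false, theme: "dark" },
};
const change = { age: 26, settings: { useMetric: true } };

console.log(mergeSet(stored, change));
console.log(overwriteSet(stored, change));
```

In the merged result, name and settings.theme survive alongside the new values, while the overwritten result contains only the fields in the incoming object.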

To delete a document, you simply need to call the delete method:

// Delete the document.
db.collection("users").doc("john").delete();

Writing new data to DynamoDB is just as simple as with Firestore. You just need to specify the table and the item data you would like to write and call put:

db.put({
  TableName: "Users",
  Item: {
    name: "John",
    email: "john@example.com",
    age: 25,
  },
}).promise();

To update these values, you need to use the update method and provide an appropriate update expression for the update you want to perform. You must also include the primary key of the item you want to update in your request:

db.update({
  TableName: "Users",
  Key: {
    email: "john@example.com",
  },
  UpdateExpression: "set age = :age",
  ExpressionAttributeValues: {
    ":age": 30,
  },
  ReturnValues: "UPDATED_NEW",
}).promise();

Finally, to delete items, you just call the delete method and specify the primary key you want to delete:

db.delete({
  TableName: "Users",
  Key: {
    email: "john@example.com",
  },
}).promise();

Deletes in DynamoDB can also be made conditional by passing an optional ConditionExpression that must evaluate to true for the item to be deleted.
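
As a rough sketch, a conditional delete only adds a few fields to the request parameters. The buildConditionalDelete helper below is hypothetical, and the condition itself (delete the user only if they are younger than a given age) is just an illustrative assumption.

```javascript
// Build the parameters for a conditional delete on the "Users" table.
// The item is removed only if its age attribute is below maxAge.
// NOTE: buildConditionalDelete is an illustrative helper, not an SDK function.
function buildConditionalDelete(email, maxAge) {
  return {
    TableName: "Users",
    Key: { email },
    // "#age" and ":maxAge" are placeholders resolved by the two maps below.
    ConditionExpression: "#age < :maxAge",
    ExpressionAttributeNames: { "#age": "age" },
    ExpressionAttributeValues: { ":maxAge": maxAge },
  };
}

const params = buildConditionalDelete("john@example.com", 30);
console.log(params);

// With a DocumentClient instance named db, as in the examples above:
// db.delete(params).promise();
```

If the condition evaluates to false, DynamoDB rejects the request with a ConditionalCheckFailedException and the item is left untouched.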

Cost

Cloud Firestore has an easy-to-understand pay-per-use cost structure: $0.06 per 100,000 reads, $0.18 per 100,000 writes, and $0.02 per 100,000 deletes. This makes the cost of an application easy to estimate, but it also has the potential to become very expensive if app traffic grows rapidly.

DynamoDB is a little more flexible when it comes to cost, offering both an on-demand pricing model similar to Cloud Firestore and a provisioned capacity mode where you prepay for a specific amount of capacity. On-demand pricing for DynamoDB is actually a bit cheaper than Firestore at $1.25 per million write request units and $0.25 per million read request units. Note however that transactional reads and writes to DynamoDB actually use 2 request units. On-demand pricing is good for unpredictable workloads where required capacity changes with time or cannot be known ahead of time.
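
To get a feel for these numbers, the following sketch estimates the monthly bill for a made-up workload using only the prices quoted above. It assumes that every DynamoDB operation consumes exactly one request unit (two if transactional), which ignores item size and read consistency, and it ignores free tiers and storage costs on both platforms. The function names and the workload are hypothetical.

```javascript
// Prices in USD as quoted in this article; check the current pricing pages.
const FIRESTORE = { readPer100k: 0.06, writePer100k: 0.18, deletePer100k: 0.02 };
const DYNAMODB = { readPerMillion: 0.25, writePerMillion: 1.25 };

// Firestore bills per operation, with separate rates for each operation type.
function firestoreCost({ reads = 0, writes = 0, deletes = 0 }) {
  return (
    (reads / 100000) * FIRESTORE.readPer100k +
    (writes / 100000) * FIRESTORE.writePer100k +
    (deletes / 100000) * FIRESTORE.deletePer100k
  );
}

// Assumes each operation consumes one request unit, or two if transactional.
function dynamoOnDemandCost({ reads = 0, writes = 0, transactional = false }) {
  const unitsPerOp = transactional ? 2 : 1;
  return (
    ((reads * unitsPerOp) / 1000000) * DYNAMODB.readPerMillion +
    ((writes * unitsPerOp) / 1000000) * DYNAMODB.writePerMillion
  );
}

// A made-up monthly workload: 10 million reads and 2 million writes.
const workload = { reads: 10000000, writes: 2000000 };
console.log("Firestore:", firestoreCost(workload).toFixed(2));
console.log("DynamoDB on-demand:", dynamoOnDemandCost(workload).toFixed(2));
console.log(
  "DynamoDB transactional:",
  dynamoOnDemandCost({ ...workload, transactional: true }).toFixed(2)
);
```

Under these assumptions the workload comes to about $9.60 on Firestore and $5.00 on DynamoDB on-demand, or $10.00 if every DynamoDB operation is transactional.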

Provisioned capacity mode is better for predictable workloads where the required capacity is known ahead of time and can be accounted for. You can read more about the pricing model for the provisioned capacity mode in the AWS DynamoDB pricing documentation.

Conclusion

In this article we took a look at Cloud Firestore by Google Cloud and DynamoDB by AWS and compared their features and use cases. We found that Cloud Firestore's data model is a little more flexible when it comes to storing new data, whereas DynamoDB requires the creation of tables and primary keys ahead of time. Generally, Cloud Firestore and DynamoDB offer the same kinds of querying, with both databases requiring indexes to perform fast queries. DynamoDB, however, also offers the scan method, which allows for some arbitrary queries at the cost of performance. Lastly, we found that writing data to both databases is analogous. Thanks for reading, I hope you learned something useful.

If you are interested in getting more experience with Cloud Firestore, sign up for our app and try out some of the queries in this article on our sample database.
