Ezfire Logo

Tutorial

DynamoDB: PartiQL vs. DocumentClient

In this article we will introduce the new PartiQL language supported by DynamoDB and compare and contrast this new API with the legacy DocumentClient API used originally for querying DynamoDB.

Author Profile Picture
Mar 4, 2022 · 7 min read

Ezfire


When DynamoDB was originally released, the only way to query and mutate data in the database in Node.js was to use the Document Client API. The query language of this API was an unfamiliar mix of JSON and template strings combined with a few DynamoDB specific method in the aws-sdk.

To make DynamoDB more accessible, AWS announced late in 2020 support for the PartiQL query language in DynamoDB, allowing a user to dispatch familiar SQL-like queries to Dynamo in order to query and mutate data.

In this article we are going to take a look at PartiQL, how it interacts with the DynamoDB data model, and how it compares to the existing Document Client API. Through out this article we will be looking at examples of DocumentClient queries written using the Node.js SDK and their comparable PartiQL queries.

All PartiQL queries for DynamoDB included in this article can be run directly on Ezfire by signing up for our app.

But first, what is PartiQL exactly?

What is PartiQL?

PartiQL (pronounced the same as "particle") is a open source, SQL-like query language developed by AWS and supported by the majority of their database products. PartiQL extends SQL in order to support nested data, the likes of which you would find in a typical JSON document store like DynamoDB.

By supporting PartiQL in the majority of data stores on AWS, is it easy to adopt new, unique or unfamiliar data stores since they all have a familiar interface. This also makes it easy for developers with only a SQL background to be productive with these new, highly scalable database systems.

Now that we know a little bit about PartiQL, how does is differ from the DocumentClient API?

Comparing PartiQL and DocumentClient

For the purpose of this exercise, suppose we have a Countries table in our DynamoDB instance with a document for each country in the world. The definition of this table looks something like:

const aws = require("aws-sdk");
const dynamodb = new aws.DynamoDB();
db.createTable({
  TableName: "Countries",
  KeySchema: [{ AttributeName: "iso2", KeyType: "HASH" }],
  AttributeDefinitions: [
    { AttributeName: "iso2", AttributeType: "S" },
    { AttributeName: "continent", AttributeType: "S" },
    { AttributeName: "areaKm2", AttributeType: "N" },
  ],
  GlobalSecondaryIndexes: [
    {
      IndexName: "ContinentsIndex",
      KeySchema: [
        { AttributeName: "continent", KeyType: "HASH" },
        { AttributeName: "areaKm2", KeyType: "RANGE" },
      ],
      Projection: { ProjectionType: "ALL" },
      ProvisionedThroughput: {
        ReadCapacityUnits: 1,
        WriteCapacityUnits: 1,
      },
    },
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 1,
    WriteCapacityUnits: 1,
  },
});

And structure of these documents looks something like this:

type Country = {
  iso2: string;
  iso3: string;
  isoNumeric: number;
  continent: string;
  capital: string;
  areaKm2: number;
  population: number;
};

Reading Data

Given the above table schema, the simplest query we could do with the DocumentClient API would to select the a country, like the United States, by the hash index key, iso2:

const db = new aws.DynamoDB.DocumentClient();
db.get({
  TableName: "Countries",
  Key: {
    iso2: "US",
  },
}).promise();

The equivalent query in PartiQL for Dynamo looks something like this:

select * from Countries where iso2 = 'US';

In this case there isn't a big difference in query complex between DocumentClient and PartiQL. How about in the case of something more complex?

When querying with along a composite index in Dynamo, such as the ContinentsIndex in our Countries example, we are able to apply filters along the RANGEportion of the index for a fixed HASH key. For example, if we want to find the iso2 code and capital of all the countries in North America with area between 10000km2 and 100000km2, we could dispatch the following query with DocumentClient:

db.query({
  TableName : "Countries",
  IndexName: "ContinentsIndex",
  ProjectionExpression: "iso2, capital",
  KeyConditionExpression: "continent = :continent and areaKm2 between :area1 and :area2",
  ExpressionAttributeValues: {
      ":continent": "North America",
      ":area1": 10000,
      ":area2": 100000
}).promise();

This is where the flaws in DocumentClient start to become apparent. Not only are we using a different method to run this query than the get method we were using previously but this method expects us t specify this KeyConditionExpression, a SQL-like filter expression DSL. As a result the query becomes quite verbose.

How does an equivalent query look using PartiQL? Something like this:

select iso2, capital from Countries.ContinentsIndex
  where continent = 'North America'
  and areaKm2 between 10000 and 100000;

Behind this scenes, this query string will resolve to the same query as what we would send using DocumentClient. However this PartiQL query is much more compact and easier to parse and understand.

What if we weren't querying along an index? For example maybe we want to find all the countries in North America with less than 10 million people. For this type of query, we are forced to use the more expensive scan method to look at and filter every document in the partition given by the specified hash key. Do a query like this with DocumentClient is very similar to using query above, but instead using a FilterExpression:

db.scan({
  TableName: "Countries",
  IndexName: "ContinentsIndex",
  FilterExpression: "continent = :continent and population < :population",
  ExpressionAttributeValues: {
    ":continent": "North America",
    ":population": 10000000,
  },
});

Again, in this case, the PartiQL query is much more compact and easy to digest:

select * from Countries.ContinentIndex
  where continent = 'North America'
  and population < 10000000;

One word of warning however about these PartiQL queries. Based on the type of data being queried, Dynamo will automatically perform a get, query or scan in order to fetch the request data. There is a certain convenience to this behaviour, however it is a double-edge sword. Scans are an expensive query (in terms of performance and cost) and the opaqueness of the actual plan for a given PartiQL query can introduce expected performance issues or costly bills by bloating the number of documents that are read in the average query. For example, we would run into problems naively running a query like this:

select * from Countries where iso3 = "USA";

This is because there is no index on iso3, so a full-table scan is needed to find a matching document. This does not scale well.

Writing data

Compared with reading data from DynamoDB, writing data using the DocumentClient API is relatively the easy. To write a new document to a table in Dynamo, you need only specify the TableName and an Item JSON body that includes at least the primary key:

db.put({
  TableName: "Countries",
  Item: {
    iso2: "FO",
    iso3: "FOO",
    isoNumeric: 123,
  },
}).promise();

It is equally straight forward to delete an item with document with DocumentClient, as you need only specify the primary key:

db.delete({
  TableName: "Countries",
  Key: {
    iso2: "FO",
  },
}).promise();

Updating a document or set of documents is straight forward but verbose, like querying, as you need to specify an update expression:

db.delete({
  TableName: "Countries",
  Key: {
    iso2: "FO",
  },
  UpdateExpression: "set isoNumeric = :val",
  ExpressionAttributeValues: {
    ":val": 1,
  },
}).promise();

These three cases are equally simple to represent with PartiQL, but unfortunately you don't game much in the way of simplicity. Either method of querying is equally complex:

-- Write a new document
insert into
Countries value {'iso2' : 'FO','iso3' : 'FOO', 'isoNumeric': '123'}

-- Delete a document
delete from Countries where iso2 = "FO";

-- Update a document
update Countries set isoNumeric = 1 where iso2 = "FO;

Conclusion

In this article we introduced PartiQL, the new SQL-like query language for DynamoDB, and compared it with the legacy DocumentClient API from the aws-sdk library for Node.js. We found that, particularly for querying data, PartiQL offers a simplified and less verbose way of expressing queries for documents. This comes however at the cost of making specific query strategy opaque to the query writer. This is convenient, but can sometimes lead to unintended table scans that are very expensive. When it comes to writing data to the database, PartiQL does not help as much to simplify things and only offers an advantage for users who are already familiar with SQL. Thanks for reading, I hope you learned something useful.

If you are interested in getting more experience with DynamoDB, sign up for our app and try out some of the queries in this article with you own database.


Ready to get started?

Ezfire will always be free to use for individuals. Try out Ezfire risk-free.