Intro to QLDB
- 9 minutes read - 1802 wordsWhat is QLDB?
Amazon QLDB is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority.
QLDB is called ‘Quantum’ as in indivisible discrete changes. All the transactions are recorded to a transparent journal where each block represents a discrete state change.
Overview
QLDB is considered a ledger database. A ledger consists of one or more QLDB tables, and a journal, as shown below:
QLDB Tables
QLDB data is organised into tables of documents, or more precisely, document revisions. A document revision is a structure that represents a single iteration of a document’s full dataset and includes both user data and system generated metadata. Each revision is uniquely identified by a combination of the document ID and a zero based version number. When a document is deleted from a table, no document with the same document ID can be created again in the same ledger.
QLDB documents are stored in Amazon Ion
format. Ion
is a superset of JSON
, meaning that any valid JSON
document is also a valid Ion
document. It contains additional data types, type annotations and comments. It is based on an abstract data model that lets you store both structured and unstructured data.
QLDB Journal
When an application needs to modify data in a document, it does so in a database transaction. QLDB transactions are ACID compliant and have full serializability - the highest level of isolation.
Within a transaction, data is read from the ledger, updated, and committed to the journal. The journal represents a complete and immutable history of all the changes to your data. QLDB writes one or more chained blocks to the journal in a transaction. Each block contains entry objects that represent the document revisions you insert, update and delete, along with PartiQL statements that committed them.
As a transaction is written to the journal, a cryptographic digest is computed and stored as part of the transaction. Any time the transaction is moved through the system, the digest is checked to ensure that it has not been corrupted.
In QLDB, each ledger has exactly one journal. Currently, a journal only a single partition which is called a strand.
Walk through demo
Create a Ledger
The following is the simple CloudFormation
template required to create a ledger:
create-ledger-cf.json
{
"AWSTemplateFormatVersion" : "2010-09-09",
"Resources" : {
"myQLDBLedger": {
"Type": "AWS::QLDB::Ledger",
"Properties": {
"DeletionProtection": true,
"Name": "<ledger-name>",
"PermissionsMode": "ALLOW_ALL",
"Tags": [
{
"Key": "name",
"Value": "aws-south-wales"
}
]
}
}
}
}
To deploy the template, run the following from a terminal window in the same directory:
aws cloudformation deploy --template-file ./create-ledger-cf.json --stack-name qldb-demo
You could of course just use the AWS CLI to achieve the same thing:
aws qldb create-ledger --name <ledger-name> --permissions-mode ALLOW_ALL
Create tables and indexes
Run the following commands one at a time to create Person and Vehicle tables with an index on each. Note that:
- Indexes can only be created on empty tables
- Indexes can’t be dropped after they are created
- Query performance is only improved using an equality predicate
- Indexes can only be created on a single field. Composite indexes and range indexes are not supported.
CREATE TABLE Person
CREATE INDEX ON Person(GovId)
CREATE TABLE Vehicle
CREATE INDEX ON Vehicle(VehicleId)
Insert sample data
Insert sample data into the tables
INSERT INTO Person
{
'GovId' : 'LEWISM1980',
'FirstName' : 'Matt',
'LastName' : 'Lewis',
'Email' : 'myemail@email.com',
'DOB' : `1980-09-25T`,
'Address' : 'TechHub Swansea, Swansea, Wales'
}
INSERT INTO Vehicle
{
'VRN' : 'ABC123',
'VehicleId' : 'AB123456',
'Make' : 'Tesla',
'Model' : 'Model S',
'KeeperId' : ''
}
Update sample data
While you can define fields that are intended to be unique identifiers (for example, the VehicleId), the true unique identifier of a document is the id metadata field. This means the best practice is to use the document id metadata field as a foreign key. This means we need to retrieve the document ID of the Person record, and use that to update the KeeperId field.
SELECT metadata.id FROM _ql_committed_Person AS p
WHERE p.data.GovId = 'LEWISM1980'
UPDATE Vehicle AS v
SET v.KeeperId = '<value-returned>'
WHERE v.VehicleId = 'AB123456'
We can then check this has worked by running the following:
SELECT KeeperId FROM Vehicle AS v
WHERE v.VehicleId = 'AB123456'
Query data
QLDB provides queryable views of your data based on the transaction information that is committed to the journal. Similar to views in relational databases, a view in QLDB is a projection of the data in a table. Views are maintained in real time, so that they’re always available for applications to query.
The user view shows the latest non-deleted revision of your application-defined data only. This is the default view in QLDB.
SELECT * FROM Vehicle
WHERE VehicleId = 'AB123456'
The committed view shows the latest non-deleted revision of both the user data and the system-generated metadata. This is the full system-defined table that corresponds directly to your user table. You can access the metadata by querying the committed view
. To query this view, add the prefix _ql_committed_
to the table name in your query. The prefix _ql_
is reserved in QLDB for system objects.
SELECT * FROM _ql_committed_Vehicle AS v
WHERE v.data.VehicleId = 'AB123456'
The returned document contains the following:
blockAddress
- The location of the block in the ledger’s journal where the document revision was committed. It contains the following fields:strandId
- The unique ID of the journal strand that contains the blocksequenceNo
- The index number that specifies the location of the block within the strand
hash
- The SHA-256 value that covers thedata
andmetadata
fields and can be used for crytopgraphic verificationdata
- The user datametadata
- The document’s metadata attributes:id
- The system assigned document IDversion
- The zero-based integer that increments with each document revisiontxTime
- The timestamp when the revision was committed to the journaltxId
- The unique ID of the transaction that committed the revision
Query history
The history function in QLDB is a PartiQL extension that returns revisions from the system-defined view of your table. So, it includes both your data and the associated metadata in the same schema as the committed view. To query the history we need to retrieve the document ID
SELECT metadata.id FROM _ql_committed_Vehicle AS v
WHERE v.data.VehicleId = 'AB123456'
Next we pass this into to the following:
SELECT * FROM history( Vehicle ) AS h
WHERE h.metadata.id = '<document-id>'
The output includes metadata attributes that provide details on when each item was modified, and by which transaction. From this data, you can see the following:
- The document is uniquely identified by its system-assigned document id
- An INSERT statement creates the initial revision of a document (version 0).
- Each subsequent update creates a new revision with the same document id and an incremented version number.
- The txId field indicates the transaction that committed each revision, and txTime shows when each was committed.
Verify integrity
Data integrity in QLDB means that your ledger’s journal is in fact immutable. This means that each document revision:
- Exists at the same location in the journal where it was first written
- It hasn’t been altered in any way since it was written
Before you can verify data, you must request a digest from your ledger and save it for later. Any document revision that is committed before the latest block covered by the digest is eligible for verification against that digest.
Then, you request a proof from Amazon QLDB for an eligible document revision that you want to verify. Using this proof, you call a client-side API to recalculate the digest, starting with your revision hash. As long as the previously saved digest is known and trusted outside of QLDB, the integrity of your document is proven if your recalculated digest hash matches the saved digest hash.
What you are specifically proving is that the document revision was not altered between the time that you saved this digest and when you run the verification.
Request a Digest
Amazon QLDB provides an API to request a digest that covers the current tip of the journal in your ledger. The tip of the journal refers to the latest committed block as of the time that QLDB receives your request.
aws qldb get-digest --name <ledger-name>
Verify a Document Revision
In order to verify a document revision, you need the document ID and block address of the document revision you want to verify.
SELECT v.metadata.id, v.blockAddress
FROM _ql_committed_Vehicle AS v
WHERE v.data.VehicleId = 'AB123456'
Amazon QLDB uses a modified binary hash tree, based on the Merkle tree, in which the leaf nodes are the set of all document hashes in your journal.
The root node represents the digest of the entire journal as of a point in time. QLDB uses SHA-256 as its cryptographic hash function, so each hash is a 256-bit value. Using a Merkle audit proof, you can verify that a document revision exists in your journal and is in the correct position relative to the digest, without having to check your ledger’s entire document history. You do this by traversing the tree from a leaf node to its root, meaning that you only need to compute the node hashes within this audit path. This process has a time complexity of log(n) nodes in the tree. A proof in QLDB is simply the list of node hashes required to mathematically transform any given leaf node hash (a document) into the root hash (the digest).
The following diagram illustrates the Amazon QLDB hash tree model. It shows a set of block hashes that rolls up to the top root node, which represents the digest of a journal strand. In a ledger with a single-strand journal, this root node is also the digest of the entire ledger.
When you run verification in the Console it returns two tabs.
The Block tab shows the contents of the block that contains the document you are verifying.
The Proof tab shows the contents of the proof returned by QLDB for your specified document and digest. It includes the following details:
- Revision hash — The SHA-256 value that uniquely represents the document revision that you are verifying.
- Proof hashes — The ordered list of hashes provided by QLDB that are used to recalculate the specified digest.
- Digest calculated — The hash that resulted from the series of Hash calculations done on the Revision hash. If this value matches your previously saved Digest, the verification is successful.
Non Functionals
QLDB is a serverless technology. It is fully managed and automatically scales to meet the needs of an application with no provisioning required. It is designed to handle hundreds of transactions per second. Amazon QLDB stores multiple copies of data across 3 availability zones to provide a high level of durability.