Using CRDTs v5.6
Using CRDTs in tables
Permissions required
PGD CRDTs requires usage access to CRDT types. Therefore, any user needing to access CRDT types must have at least the bdr_application role assigned to them.
To use CRDTs, you need to use a particular data type in CREATE/ALTER TABLE rather than standard built-in data types such as integer
. For example, consider the following table with one regular integer counter and a single row:
Non-CRDT example
Suppose you issue the following SQL on two different nodes at same time:
After both updates are applied, you can see the resulting values using this query:
This code shows that you lost one of the increments due to the update_if_newer
conflict resolver.
CRDT example
To use a CRDT counter data type instead, you would follow these steps:
Create the table but with a CRDT (bdr.crdt_gcounter
) as the counters data type.
Configure the table for column-level conflict resolution:
And then insert a row with a value for this example.
If you now issue, as before, the same SQL on two nodes at same time.
Once the changes are applied, you find that the counter has managed to concurrenct updates.
This example shows that the CRDT correctly allows the accumulator columns to work, even in the face of asynchronous concurrent updates that otherwise conflict.
Configuration options
The bdr.crdt_raw_value
configuration option determines whether queries return the current value or the full internal state of the CRDT type. By default, only the current numeric value is returned. When set to true
, queries return representation of the full state. You can use the special hash operator
(#
) to request only the current numeric value without using the special operator (the default behavior). If the full state is dumped using bdr.crdt_raw_value = on
, then the value can reload only with bdr.crdt_raw_value = on
.
Note
The bdr.crdt_raw_value
applies formatting only of data returned to clients, that is, simple column references in the select list. Any column references in other parts of the query (such as WHERE
clause or even expressions in the select list) might still require use of the #
operator.
Different types of CRDTs
The crdt_gcounter
type is an example of state-based CRDT types that work only with reflexive UPDATE SQL, such as x = x + 1
, as the example shows.
Another class of CRDTs are delta CRDT types. These are a special subclass of operation-based CRDT.
With delta CRDTs, any update to a value is compared to the previous value on the same node. Then a change is applied as a delta on all other nodes.
Suppose you issue the following SQL on two nodes at same time:
After both updates are applied, you can see the resulting values using this query:
With a regular integer
column, the result is 2
. But when you update the row with a delta CRDT counter, you start with the OLD row version, make a NEW row version, and send both to the remote node. There, compare them with the version found there (e.g., the LOCAL version). Standard CRDTs merge the NEW and the LOCAL version, while delta CRDTs compare the OLD and NEW versions and apply the delta
to the LOCAL version.
Query planning and optimization
An important question is how query planning and optimization works with these new data types. CRDT types are handled transparently. Both ANALYZE
and the optimizer work, so estimation and query planning works fine without having to do anything else.