Update 30 September, 2021

As we mentioned a few weeks back, the old Sequence data type for mutable data has been replaced by Register. Registers can do everything Sequence could in terms of storing previous versions, with the added bonus that they can also handle concurrency. This week David Rusu lifts the lid on the mysterious world of Merkle Registers.

General progress

David Rusu has pretty much completed the basic functionality for clients writing to the Spentbook. Once it’s fully in place it will make the Mint’s task read-only: the Mint will only need to check the Spentbook for DBC transaction commitments, verify that all inputs are committed to the same transaction, and validate that the transaction balances.

@Chris.connelly continues to wrestle with qp2p. Many of the jobs it does relating to incoming messages are probably obsolete now that we no longer use the connection pool to keep connections open, and bringing connections and incoming message handling inside the safe_network repo should improve stability. It’s a bit of work though!

Chris also identified a long-standing bug in the qp2p tests, which is now fixed.

@lionel.faber is focused on CI/CD and launch tools, and has updated the automation so that a full-fledged testnet can be built and launched from a cloud VM. Anyone can use this tool, provided they have the required credentials for AWS and Digital Ocean. The tool is configurable and can be triggered as required; internally we are using manual triggers and a special launch-testnet comment in PR reviews.

Work continues on making the CLI and API compatible with the latest safe_network version … it’s getting there, but as @Chriso mentioned on Tuesday there are still some gremlins gnawing at the cables, including some issues around file retrieval with the CLI.

We’ve been setting up some basic testnet log asserts, which should help ensure we’re not looping messages anymore (and indeed they have already caught some minor message amplification). This is still not merged as we continue to hack away at stability corner cases.

This week we’ve rejigged client retries to be saner, which has helped, and we’ve squashed some false errors that were occasionally being produced because we relied on tokio task errors even when the task itself had completed fine. @yogesh has started work on improving client bootstrapping: the client currently connects to any contactable node without attempting to establish the state of the network, which later leads to more AE messages than are necessary.

Merkle Register

This week we’d like to do a deep dive on the Merkle Register, one of the two primitive datatypes in the Safe Network (the other being immutable chunks).

The Merkle Register is a novel CRDT we’ve come up with to model mutable state in the Safe Network; it supports concurrent writes and history traversal. Merkle Registers are used extensively in NRS to maintain name mappings, to track versions of files, and by any app that needs mutable state.

The Register CRDT Family

Registers are used to write and read a value (think CPU registers). Existing Register CRDTs include the Multi-Value Register (MVReg) and the Last-Write-Wins Register (LWWReg). Our Merkle Register (MerkleReg) is yet another CRDT in this family.

All registers follow a similar API.

# You can read the current value(s) stored in the register
read(reg) -> val(s)

# You can write a value to a register using a causal context
write(reg, val, ctx) -> reg

# Merge two registers
merge(reg1, reg2) -> merged-reg
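
As a rough Rust sketch of the same interface (the trait and type names below are invented for illustration and are not the actual safe_network API):

use std::collections::BTreeSet;

// A hypothetical register interface: `Value` is whatever the app stores,
// `Ctx` is the causal context captured when the current value(s) were read.
trait RegisterCrdt {
    type Value;
    type Ctx;

    // Read the current (possibly concurrent) value(s).
    fn read(&self) -> BTreeSet<Self::Value>;

    // Write a value, citing the context of the value(s) it supersedes.
    fn write(&mut self, value: Self::Value, ctx: Self::Ctx);

    // Merge another replica's state into this one (commutative, associative, idempotent).
    fn merge(&mut self, other: Self);
}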

The MerkleReg is built on top of a Merkle DAG. The Merkle DAG is a generalisation of the Merkle Tree that allows nodes to have multiple parents and places data at each node.

With Merkle Trees, we arrange data at the leaves of a binary tree and hash it in pairs to get the next level of nodes; we repeat this, hashing pairs of hashes, until we reach the root. It’s an incredibly flexible data structure, used in many decentralised systems. Merkle DAGs remove the binary-tree restriction: we can now place data at each node, and each node may have multiple parents.

The way we build the MerkleReg is to use the roots of the Merkle DAG, the nodes that no other node points to, as the current register values. For example, consider the following series of operations.

Start off by reading the empty DAG; we get back the empty set.

read(∅) = {}

Write a to the register, using the empty set as the context.

                  ∅
write(∅, a, {}) = |
                  a

Concurrently, another client writes to the same register.

                  ∅
write(∅, b, {}) = |
                  b

Once the two clients push their state up, we can merge the two concurrent registers.

      ∅  ∅     ∅
merge(|, |) = / \
      a  b   a   b

Reading the merged register gives back both concurrent values.

       ∅  
read( / \ ) = {a, b}
     a   b

We can now resolve the concurrent values by using both a and b as the write context.

                            ∅
        ∅                  / \
write( / \ , c, {a, b}) = a   b
      a   b                \ /
                            c

Reading the final register only returns c.

        ∅
       / \
read( a   b ) = {c}
       \ /  
        c
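
To make the walkthrough above concrete, here is a minimal, self-contained Rust sketch of the same idea. It is an illustration only, not the actual safe_network or crdts implementation: the node layout is simplified, and the standard-library hasher stands in for a proper cryptographic hash.

use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, BTreeSet};
use std::hash::{Hash, Hasher};

// Minimal MerkleReg sketch. Nodes are stored by hash; the "roots" (the nodes
// that no other node cites as a child) are the register's current value(s).
#[derive(Default)]
struct MerkleReg {
    nodes: BTreeMap<u64, (String, BTreeSet<u64>)>, // hash -> (value, children)
}

fn node_hash(value: &str, children: &BTreeSet<u64>) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    children.hash(&mut h);
    h.finish()
}

impl MerkleReg {
    // Read the current concurrent values: those not superseded by a later write.
    fn read(&self) -> BTreeSet<&str> {
        let cited: BTreeSet<u64> = self
            .nodes
            .values()
            .flat_map(|(_, children)| children.iter().copied())
            .collect();
        self.nodes
            .iter()
            .filter(|(hash, _)| !cited.contains(*hash))
            .map(|(_, (value, _))| value.as_str())
            .collect()
    }

    // Write a value on top of the given context (the hashes it supersedes).
    fn write(&mut self, value: &str, ctx: BTreeSet<u64>) -> u64 {
        let hash = node_hash(value, &ctx);
        self.nodes.insert(hash, (value.to_string(), ctx));
        hash
    }

    // Nodes are content-addressed, so merging is just a union of the node sets.
    fn merge(&mut self, other: MerkleReg) {
        self.nodes.extend(other.nodes);
    }
}

fn main() {
    let mut replica_a = MerkleReg::default();
    let mut replica_b = MerkleReg::default();

    // Reading the empty DAG gives back the empty set.
    assert!(replica_a.read().is_empty());

    // Two clients write concurrently, each with an empty context.
    let a = replica_a.write("a", BTreeSet::new());
    let b = replica_b.write("b", BTreeSet::new());

    // After merging, both concurrent values are roots, so both are visible.
    replica_a.merge(replica_b);
    let both: BTreeSet<&str> = ["a", "b"].into_iter().collect();
    assert_eq!(replica_a.read(), both);

    // Writing c with {a, b} as its context resolves the fork.
    replica_a.write("c", [a, b].into_iter().collect());
    let just_c: BTreeSet<&str> = ["c"].into_iter().collect();
    assert_eq!(replica_a.read(), just_c);
}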

A nice property of the MerkleReg compared to some other Register CRDTs is that the entire branching history of the register is maintained in the Merkle DAG. If we are tracking a document, we can traverse its history to see how it evolved.
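
Since every node keeps the hashes of the nodes it was written on top of, walking back from a current value recovers that history. Here is a small sketch reusing the hash -> (value, children) node layout from the example above; again, just an illustration rather than the real API.

use std::collections::{BTreeMap, BTreeSet};

// Same simplified node storage as the previous sketch: hash -> (value, children).
type Nodes = BTreeMap<u64, (String, BTreeSet<u64>)>;

// Walk backwards from one of the current values, collecting every version it
// was built on. Forks that were later resolved still show up in this history.
fn history(nodes: &Nodes, start: u64) -> Vec<&str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![start];
    let mut versions = Vec::new();
    while let Some(hash) = stack.pop() {
        if !seen.insert(hash) {
            continue; // already visited via another branch
        }
        if let Some((value, children)) = nodes.get(&hash) {
            versions.push(value.as_str());
            stack.extend(children.iter().copied());
        }
    }
    versions
}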

MultiSets and MultiMaps

We can build MultiSets and MultiMaps on top of Merkle Registers quite simply by taking advantage of how write contexts work.

Entries written to the register with an empty context will all appear when we read the values back; this simulates a Set CRDT.

Start with the empty register.

read(∅) = {}

After writing y with empty context.

      ∅
read( | ) = {y}
      y 

Then writing x with empty context.

        ∅
read(  / \  ) = {x, y}
      x   y

Writing z with empty context.

         ∅
read(  / | \  ) = {x, y, z}
      x  y  z

Now we can simulate deletes by assigning a special “trash” node:

        ∅
read( / | \  ) = {x, y, 🗑} 
     x  y  🗑

Merge y with 🗑 to delete it.

        ∅
read( / | \  ) = {x, 🗑}
     x  y  🗑
         \ /
          🗑

That gives us a Set CRDT with inserts and removes. To get a Map CRDT we make the set members a 2-tuple key:value:

          ∅
read(  /  |  \  ) = {x:1, y:2, 🗑} 
     x:1 y:2 🗑

Updating x’s value is done by using the current x entries as children. Here we update x to store 2 instead of 1:

          ∅
read(  /  |  \ ) = {x:2, y:2, 🗑} 
     x:1 y:2 🗑
      |
     x:2

Since this is a concurrently edited structure, we may end up with multiple values written for a key (hence the multi in MultiMap):

Another client concurrently writes x:3

          ∅
       /  |  \
read( x:1 x:3 🗑 ) = {x:2, x:3, 🗑} 
       |
      x:2

These multiple entries can be resolved by writing back an entry for x with both x:2 and x:3 as children.

          ∅
        /  |  \ 
read( x:1 x:3 🗑 ) = {x:3, 🗑} 
       |   |
      x:2  |
        \  /
        x:3

This gives us a MultiMap CRDT with history and merging of concurrent values!
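
In code, the MultiMap view is just a reinterpretation of whatever the register currently holds: read the current values, drop anything merged into the tombstone, and group the surviving pairs by key. Here is a small sketch of that grouping step; the Entry encoding below is an assumption for illustration, not the NRS format.

use std::collections::{BTreeMap, BTreeSet};

// An entry stored in the register when treating it as a MultiMap:
// either a key/value pair or the special "trash" tombstone.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Entry {
    Pair { key: String, value: String },
    Trash,
}

// Interpret the register's current values (its DAG roots) as a MultiMap by
// grouping the surviving pairs by key. Deleted entries no longer appear among
// the roots, so the only thing to filter out is the tombstone itself.
fn as_multimap(roots: &BTreeSet<Entry>) -> BTreeMap<String, BTreeSet<String>> {
    let mut map: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for entry in roots {
        if let Entry::Pair { key, value } = entry {
            map.entry(key.clone()).or_default().insert(value.clone());
        }
    }
    map
}

fn main() {
    // The roots from the example above: x has two concurrent values.
    let roots: BTreeSet<Entry> = [
        Entry::Pair { key: "x".into(), value: "2".into() },
        Entry::Pair { key: "x".into(), value: "3".into() },
        Entry::Trash,
    ]
    .into_iter()
    .collect();

    let map = as_multimap(&roots);
    let expected: BTreeSet<String> = ["2".to_string(), "3".to_string()].into_iter().collect();
    assert_eq!(map["x"], expected);
}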

Implications of the MerkleReg

The MerkleReg replaces the old Sequence datatype: we found that trying to model linear sequences in a partially ordered network was never going to have clean semantics. Instead we embrace forks, with the result that applications need to deal with concurrent values. For example, a browser built on the Safe Network must deal with the possibility that a website has multiple concurrent index.html pages, and NRS will have to deal with multiple entries for the same name.

This is not as big a problem as you may think; most large-scale systems already work with this model. For example, DNS allows multiple IPs to be associated with a domain, and things (generally) still work.


Useful Links

Feel free to reply below with links to translations of this dev update and moderators will add them here:

:russia: Russian ; :germany: German ; :spain: Spanish ; :france: French; :bulgaria: Bulgarian

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

64 Likes

First finally??

Edit: now to read…

23 Likes

Thanks so much to everyone for all of your hard work and dedication! :racehorse: Remember not to work too hard. :racehorse:

17 Likes

Keep up the great work! :+1:

16 Likes

Things are really rounding out! Love to see this.

I have a feeling the next Testnet is gonna be a lot more reliable than previous.

Keep up the great work everyone @maidsafe. You all are our inspiration.

21 Likes

Fifth :(. H

10 Likes

Thank you all for your hard work. I’m going to have to reread the section on Merkle Registers a few times to get my head round it though…

It really looks like it’s all coming together now. Can’t wait for the next testnet…

15 Likes

I’ve now read about Merkle Registers and fear I will need to buy Southside that pint so he can explain it to me once he has had time to digest it!!

Great work team, keep up the good work and I’ll try to catch up!!

15 Likes

Not first, but still proud!!

Good work @maidsafe :tada:

15 Likes

I must admit, the technical details of these Merkle data structures lose me in the details, but I can see that it is a huge step forward in flexibility and function.

I was always a bit worried that the air-tight data security structure of xor node consensus might make some functionality limited. It would still be invaluable, but would handicap universal functionality. Leave it to MaidSafe to keep at it and not settle for anything short of amazing.

In the end it will be like “Duh! It’s all so obvious!” Cheers to @dirvine for being able to look past the context that has been and bring into existence the context that must be for freedom to advance. Sometimes it just takes a Scotsman!

18 Likes

Oh, the Merkle part is all a Canadian’s doing (David Rusu) :slight_smile: It’s a very cool approach for us and CRDT data though. Linked to the DBC work, AE and data chains, we have something pretty cool.

21 Likes

Thanks for the elaboration of Registers, it’s something I’ve been curious to hear about and of course I have questions!

  • is the purpose of the tree to provide a version history for a Register, or something else, such as consensus or, in other words, the CRDT capability?

  • is this orthogonal to the CRDT Tree developed earlier, or will it also be the basis for a new way of doing a CRDT Tree, or even doing the NFS directly (without a CRDT Tree)?

As always, thanks everyone for all the work we can see going on day after day. You guys really are blazing a trail here and I’m looking forward to kicking the tyres again ‘soon’ :wink:

21 Likes

It allows validly concurrent writes to be applied. So this is a fork, if you like. It’s really to allow CRDT behaviour and data correctness. In the crypto world, a fork is called a double-spend and seen as a horrible thing, but in the data world it’s sometimes required, as with git or document editing etc. So here we have valid forks, and they are “handled” by letting apps say which result they wish, or merge them into a newly agreed tip of the tree.

Nice catch. We can still implement the safe file system in registers like this and intend to get that done as we progress. However, we still have the crdt-tree in our back pockets; it’s just a bit more complex, and the register is a lot simpler. So the answer is yes, we can safe-fs this, but we need to test it properly. @danda was chatting about the filesystem this week in Slack; it’s just that DBCs etc. took over and still will for a few weeks yet.

The guys are doing amazing stuff while I am still in admin land a good bit of the time. Interviews etc. to come and we can sort that, but it keeps me distracted an awful lot right now. I am glad the guys have our backs here and are going at this pace to launch though.

23 Likes

An idea that was floated a while back is to use the Merkle DAG internally within the crdt-tree for storing the tree’s log. So the crdt-tree would still form the basis of the sn_fs filesystem, but its implementation would change a bit. However, David Rusu and I are both full time on DBCs now, so the filesystem work has been put on hold for a while. There is already the FilesContainer mechanism for storing files, so a mountable sn_fs could come later… it’s not seen as necessary for an MVP.

18 Likes

Any news on the scientific paper to be published from a couple of years ago? Or is that research no longer relevant to the current design?

3 Likes

Agreed, great to hear the ideas though, and also that you two are sorting out DBCs which are to me very exciting indeed.

13 Likes

It (PARSEC) is no longer relevant, but that shouldn’t affect the decision to accept for publication or not.

12 Likes

Exactly. PARSEC was an innovation and worked, just not for what SAFE needed. It’ll get published eventually, but my experience has been that academic pipelines have been really slowed down by covid as well.

8 Likes

Fantastic update team Maidsafe! Thanks for your hard work.

mind-blown

I need to obtain a welding helmet, as my sunglasses just aren’t sufficient for tracking the project anymore.

12 Likes