Byzantine fault tolerance of CRDTs

With reference to the fault tolerance mentioned in my other CRDTs thread, here is some very recent research from an author of automerge, which offers a way to provide (at least) Sybil attack tolerance:

https://martin.kleppmann.com/2020/12/02/bloom-filter-hash-graph-sync.html#practical-relevance

16 Likes

This quote in particular is highly relevant to this project:

All of this is directly relevant for local-first peer-to-peer applications in which apps running on different devices need to sync up their state without necessarily trusting each other or relying on any trusted servers.

And this is a direct link to the paper: Byzantine Eventual Consistency and the Fundamental Limits of Peer-to-Peer Databases

I wondered about the risks of the growth

For a Bloom filter, as long as we grow the size of the filter proportionally to the number of items (here we have 10 bits per item), the false positive probability remains pretty much constant at about 0.8%.

but then see the risk can be managed by the frequency of updates

we keep track of the heads from the last time we reconciled our state with a particular node, and then the Bloom filter only needs to include commits that were added since the last reconciliation.

11 Likes

This is very interesting stuff by Kleppman. Thanks for posting it here @gsvarovsky. I’m curious how this may influence @maidsafe’s stellar efforts with bft-crdt. Is any aspect of routing not invariant confluent? This could potentially affect node aging re:sybil.

3 Likes

At the moment we are ensuring this is not the case. We have constraints across algorithms. Sounds weird, but node age takes care of valid voters in a set, relocation takes care of authority to join a section. Apart from that we have simple rules, such as you cannot vote or observe a remove node unless there is an already agreed add node. With all this we reduce the possible network Events that need consensus to ungraceful Leaves ( a node drops) and penalties based on Observations. In terms of relocation/join/graceful leave we have no consensus issues past order of these events. The algo we are finalising now does this in the simplest possible way that we can see.

tl;dr we are not using a God algorithm for all, we are ordering where we must, partial order where we can and consensus only on non provable events, at the same time we are making most events self validating. i.e. we have went a bit deeper. I need to read into their use of bloom filters as we have looked at these a lot in the past bloom/bloomier/cuckoo etc. They can help reduce a set of Events, but with an error rate. Here Martin seems to say the error is circa 0.8% and that works, but it really does depend on how long the filter is in-place with relation to the number of events it is responsible for, as it fill the error rate definitely increases.

11 Likes

Not surprised that you guys had already considered something like this. Even the assumptions reminded me of Safe’s current design, e.g., “We assume that all messages sent over these links are signed with the sender’s private key, and the recipient ignores messages with invalid signatures.”

10 Likes

Greg Maxwell and Peter Wuille are part of the discussion of this paper on hacker news just now:

https://news.ycombinator.com/item?id=25278128

8 Likes

Yes, it’s an interesting convo. They have focused on the bloom filter part that in this case looks very effective (it’s for transmitting as opposed to holding as a filter). The actual blog post itself by Martin is not really doing justice to his paper IMO https://arxiv.org/pdf/2012.00472.pdf which is fascinatingly close to what we are looking at from a slightly different angle, so a very food find @gsvarovsky

5 Likes