Byzantine fault tolerance of CRDTs

gsvarovsky · December 2, 2020, 7:06pm

With reference to the fault tolerance mentioned in my other CRDTs thread, here is some very recent research from an author of automerge, which offers a way to provide (at least) Sybil attack tolerance:

https://martin.kleppmann.com/2020/12/02/bloom-filter-hash-graph-sync.html#practical-relevance

mav · December 2, 2020, 9:37pm

This quote in particular is highly relevant to this project:

All of this is directly relevant for local-first peer-to-peer applications in which apps running on different devices need to sync up their state without necessarily trusting each other or relying on any trusted servers.

And this is a direct link to the paper: Byzantine Eventual Consistency and the Fundamental Limits of Peer-to-Peer Databases

I wondered about the risks of the growth

For a Bloom filter, as long as we grow the size of the filter proportionally to the number of items (here we have 10 bits per item), the false positive probability remains pretty much constant at about 0.8%.

but then see the risk can be managed by the frequency of updates

we keep track of the heads from the last time we reconciled our state with a particular node, and then the Bloom filter only needs to include commits that were added since the last reconciliation.

Bogard · December 3, 2020, 1:05am

This is very interesting stuff by Kleppman. Thanks for posting it here @gsvarovsky. I’m curious how this may influence @maidsafe’s stellar efforts with bft-crdt. Is any aspect of routing not invariant confluent? This could potentially affect node aging re:sybil.

dirvine · December 3, 2020, 1:33am

At the moment we are ensuring this is not the case. We have constraints across algorithms. Sounds weird, but node age takes care of valid voters in a set, relocation takes care of authority to join a section. Apart from that we have simple rules, such as you cannot vote or observe a remove node unless there is an already agreed add node. With all this we reduce the possible network Events that need consensus to ungraceful Leaves ( a node drops) and penalties based on Observations. In terms of relocation/join/graceful leave we have no consensus issues past order of these events. The algo we are finalising now does this in the simplest possible way that we can see.

tl;dr we are not using a God algorithm for all, we are ordering where we must, partial order where we can and consensus only on non provable events, at the same time we are making most events self validating. i.e. we have went a bit deeper. I need to read into their use of bloom filters as we have looked at these a lot in the past bloom/bloomier/cuckoo etc. They can help reduce a set of Events, but with an error rate. Here Martin seems to say the error is circa 0.8% and that works, but it really does depend on how long the filter is in-place with relation to the number of events it is responsible for, as it fill the error rate definitely increases.

Bogard · December 3, 2020, 1:48am

Not surprised that you guys had already considered something like this. Even the assumptions reminded me of Safe’s current design, e.g., “We assume that all messages sent over these links are signed with the sender’s private key, and the recipient ignores messages with invalid signatures.”

mav · December 4, 2020, 1:22am

Greg Maxwell and Peter Wuille are part of the discussion of this paper on hacker news just now:

https://news.ycombinator.com/item?id=25278128

dirvine · December 4, 2020, 10:23am

Yes, it’s an interesting convo. They have focused on the bloom filter part that in this case looks very effective (it’s for transmitting as opposed to holding as a filter). The actual blog post itself by Martin is not really doing justice to his paper IMO https://arxiv.org/pdf/2012.00472.pdf which is fascinatingly close to what we are looking at from a slightly different angle, so a very food find @gsvarovsky

Topic		Replies	Views
Update May 12, 2022 Updates	39	2314	May 27, 2022
Update August 5th, 2021 Updates	317	8476	August 30, 2021
MaidSafe Dev Update - May 24, 2018 - Introducing PARSEC Updates	296	18127	March 27, 2019
Node age and relocation Development	26	960	January 1, 2023
SAFE Network Dev Update - August 6, 2020 Updates	23	2474	August 23, 2020

Byzantine fault tolerance of CRDTs

Related Topics