Here are some of the main things to highlight since the last dev update:
- Internal testnets continue to help us run end to end tests and pinpoint issues.
- StoreCost metrics have been just about nailed down after our initial experimentation.
- Some major Routing refactoring PRs here and here were merged to master this week, making the crate easier to read and understand, and therefore easier to debug, onboard to and contribute to.
- We are re-hiring a CRDT consultant to help us achieve our objective of developing a permissionless network of autonomous agents collaboratively hosting CRDT data.
We’ve continued down the refactor road for our keypair types in the API. We have the basics in place now to use either Ed25519 keys (which would be the default) or BLS keys for clients/wallets, etc. The focus there this week has been on updating the code base and tests. We’re now looking at some finer points around how the various keys allow or disallow cloning, and will be looking to refactor things in general to remove the need for cloning entirely (which should be safer code-wise), though that will likely come in a follow-up PR.
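As a rough illustration of the direction described above, here is a minimal sketch of a keypair type that can hold either kind of key and deliberately does not implement `Clone`. All type and function names are hypothetical, the key material is a placeholder byte array, and no real cryptography is performed; the actual API may look quite different.

```rust
use std::sync::Arc;

/// Hypothetical stand-ins for real key material (assumption: 32 placeholder bytes).
#[allow(dead_code)]
pub struct Ed25519Keypair {
    secret: [u8; 32],
}

#[allow(dead_code)]
pub struct BlsKeypair {
    secret: [u8; 32],
}

/// Deliberately does NOT derive `Clone`, so secret material
/// cannot be duplicated by accident.
pub enum Keypair {
    Ed25519(Ed25519Keypair),
    Bls(BlsKeypair),
}

impl Keypair {
    /// Ed25519 is the default kind in this sketch, mirroring the text.
    pub fn new_default(seed: [u8; 32]) -> Self {
        Keypair::Ed25519(Ed25519Keypair { secret: seed })
    }

    /// BLS is available as the alternative kind.
    pub fn new_bls(seed: [u8; 32]) -> Self {
        Keypair::Bls(BlsKeypair { secret: seed })
    }

    /// Reports which kind of key is held.
    pub fn kind(&self) -> &'static str {
        match self {
            Keypair::Ed25519(_) => "ed25519",
            Keypair::Bls(_) => "bls",
        }
    }
}

/// Callers that need the key in two places share one allocation
/// through `Arc` instead of cloning the key itself.
pub fn share(keypair: Keypair) -> (Arc<Keypair>, Arc<Keypair>) {
    let shared = Arc::new(keypair);
    (Arc::clone(&shared), shared)
}
```

Sharing through `Arc` is only one possible way to avoid cloning; passing references with explicit lifetimes would be another.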
We are moving forward this week with more internal testing. Testnet results are getting more and more consistent, with various bugs being hunted down and fixed. One of the major fixes in progress is within inter-section communication. Elders can send messages to other Sections either as individual nodes or as part of their Section to prove Network Authority. A few messages do not need to be accumulated, for example client messages that need to be forwarded to their data Section, so sn_node had its own layer of messaging using sn_routing to overcome this. But this left us at a disadvantage, as we could no longer validate the forwarding intermediate Authority. The upcoming change therefore brings in tighter and more secure messaging done solely within sn_routing, while also removing an extra layer of messaging.
StoreCost metrics have also been more or less nailed down in the last week with the testnets seeing stable and reasonable fluctuations which should be good for us to start with. Up next, we’ll begin to chaos test these dynamics alongside data operations to optimise the metrics based on observations. This allows us to simulate the effects of running a public testnet where randomness cannot be bounded, helping to prepare us for what to expect when a testnet is released, and to potentially adjust the metrics accordingly.
This week we first merged the major refactoring work that removes SharedState and introduces the Node, Section and Network modules. This simplifies the Routing crate’s structure, making it easier to read and understand, and therefore easier to debug, onboard to and contribute to.
There was then further cleanup work merged which removes the remaining state machine in Routing. This removed the Bootstrapping and Joining states. The bootstrapping process now happens inside Routing::new, so when that function returns, the node is already connected to a section.
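To make the shape of that change concrete, here is a minimal sketch of a constructor that completes bootstrapping before returning, so callers never observe an intermediate state. The types, the synchronous signature and the `bootstrap_and_join` helper are all assumptions made for illustration; this is not the real sn_routing API.

```rust
/// Hypothetical, simplified error type; the real crate's errors differ.
#[derive(Debug, PartialEq)]
pub enum BootstrapError {
    NoContacts,
}

/// Hypothetical description of the section a node ends up in.
#[derive(Debug, PartialEq)]
pub struct SectionInfo {
    pub prefix: String,
}

pub struct Routing {
    section: SectionInfo,
}

impl Routing {
    /// Bootstrapping and joining run to completion inside the constructor,
    /// so no `Bootstrapping` or `Joining` state is ever exposed to the caller.
    pub fn new(contacts: &[&str]) -> Result<Self, BootstrapError> {
        let contact = contacts.first().ok_or(BootstrapError::NoContacts)?;
        let section = Self::bootstrap_and_join(contact);
        Ok(Routing { section })
    }

    /// Placeholder for the network round trips (assumption: always succeeds).
    fn bootstrap_and_join(contact: &str) -> SectionInfo {
        SectionInfo {
            prefix: format!("section-of-{}", contact),
        }
    }

    /// A successfully constructed node always has a section.
    pub fn section(&self) -> &SectionInfo {
        &self.section
    }
}
```

With this shape, every method on `Routing` can assume the node is connected, which removes the per-state branching a state machine would otherwise need.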
Meanwhile, the broken minimal example was fixed this week in this PR. Having this running again has already helped us to reproduce some potential misbehaviours that were being observed by upper layer tests using Routing. We are investigating those closely now.
There was also a mysterious crate dependency CI failure which was fixed by no longer using serde macro derive. We suspect the recent Rust update meant our previous macro derive shortcut was no longer supported, so we will now use a safer import-on-use approach.
More testing and experimentation has been done on the CRDT PoC, this time with different types of messaging mechanisms for when an Elder is being voted for removal. We are trying to use proptest to help us validate that these mechanisms work as intended in edge cases.
This week we also spent some time working on a separate PoC for dynamic membership in a section using distributed secure broadcast (from the AT2 paper) to provide Byzantine fault tolerance. Our bft-crdts implementation already supports reaching consensus over adding a peer, so the task at hand is to add support for peer removal. A peer may be removed voluntarily or forcefully if detected to be faulty. Both cases require a round of voting to reach consensus, but the latter is more complex as each voting peer must also detect that the peer is faulty. We have some encouraging preliminary results, but this remains a work in progress.
Some good news for the project this week - we have agreed to again hire a CRDT consultant (same person we hired recently, at the beginning of our CRDT investigations) to help us achieve our objective of developing a permissionless network of autonomous agents collaboratively hosting CRDT data.
We’ve settled on 3 changes we need to make to rust-crdt in order to achieve this goal:
- All Causal CRDTs need to be modified to reject Ops that have been delivered out of “causal” order AND report back a summary of the missing Ops required to apply the given Op.
- Remove internal buffering of out-of-order Ops in ORSWOT and Map.
- Introduce a Causality Barrier to bring back the buffering behaviour for users who rely on the automatic buffering done in ORSWOT and Map.
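The first change in the list above can be sketched roughly as follows: an `apply` that refuses an out-of-order Op and returns a summary of what is missing. This is a simplified illustration under stated assumptions, not the rust-crdt API: `Op`, `MissingOps` and `StrictSet` are hypothetical names, each Op carries a per-actor counter plus a clock of dependencies, and the missing summary is an inclusive counter range per actor.

```rust
use std::collections::BTreeMap;

pub type Actor = u8;
/// Highest counter seen per actor.
pub type VClock = BTreeMap<Actor, u64>;

/// Hypothetical Op: the `counter`-th op from `actor`, valid only after `deps`.
pub struct Op {
    pub actor: Actor,
    pub counter: u64,
    pub deps: VClock,
    pub value: u32,
}

/// Inclusive range of missing counters, per actor.
#[derive(Debug, PartialEq)]
pub struct MissingOps(pub BTreeMap<Actor, (u64, u64)>);

#[derive(Default)]
pub struct StrictSet {
    clock: VClock,
    values: Vec<u32>,
}

impl StrictSet {
    pub fn new() -> Self {
        Self::default()
    }

    fn seen(&self, actor: Actor) -> u64 {
        self.clock.get(&actor).copied().unwrap_or(0)
    }

    /// Applies `op` only if everything it depends on has been applied;
    /// otherwise rejects it and reports what is missing. Nothing is buffered.
    pub fn apply(&mut self, op: Op) -> Result<(), MissingOps> {
        let mut missing = BTreeMap::new();
        // Ops listed as dependencies that have not been applied yet.
        for (&actor, &needed) in &op.deps {
            let seen = self.seen(actor);
            if needed > seen {
                missing.insert(actor, (seen + 1, needed));
            }
        }
        // A gap in the sender's own sequence.
        let seen = self.seen(op.actor);
        if op.counter > seen + 1 {
            let upto = op.counter - 1;
            let entry = missing.entry(op.actor).or_insert((seen + 1, upto));
            entry.1 = entry.1.max(upto);
        }
        if !missing.is_empty() {
            return Err(MissingOps(missing));
        }
        if op.counter <= seen {
            return Ok(()); // duplicate delivery, already applied
        }
        self.clock.insert(op.actor, op.counter);
        self.values.push(op.value);
        Ok(())
    }

    pub fn values(&self) -> &[u32] {
        &self.values
    }
}
```

A caller that receives `MissingOps` can request exactly those Ops from a peer and retry, rather than relying on the CRDT to hold the Op internally.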
The CRDT consultant is someone who is very familiar to us, and us to them, and who we trust to help us deliver. We expect that they will be onboard with us for a few weeks, in which time we hope to achieve the following:
- All Causal CRDTs present in rust-crdt will be hardened to reject out-of-order Ops.
- All Causal CRDTs present in rust-crdt will be modified to respond with a summary of dependent ops that are missing before an Op may be applied.
- ORSWOT and Map will be modified to remove internal Buffering.
- CausalityBarrier will be implemented to optionally add back buffering.
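To illustrate how buffering could be layered back on top of a strict CRDT, here is a minimal sketch of a barrier that holds rejected Ops and releases them once their predecessors arrive. It is an assumption-heavy simplification: `StrictLog` is a hypothetical stand-in for a hardened CRDT, ordering is tracked per actor only, and the `CausalityBarrier` shown here is not the planned rust-crdt implementation.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type Actor = u8;

/// Hypothetical strict CRDT stand-in: accepts only the next op from each actor.
#[derive(Default)]
pub struct StrictLog {
    next: BTreeMap<Actor, u64>,
    applied: Vec<(Actor, u64)>,
}

impl StrictLog {
    /// Counter of the next op this log will accept from `actor` (starts at 1).
    fn expected(&self, actor: Actor) -> u64 {
        self.next.get(&actor).copied().unwrap_or(1)
    }

    /// Rejects an op that arrives early; `Err` carries the counter needed first.
    pub fn apply(&mut self, actor: Actor, counter: u64) -> Result<(), u64> {
        let expected = self.expected(actor);
        if counter > expected {
            return Err(expected);
        }
        if counter == expected {
            self.applied.push((actor, counter));
            self.next.insert(actor, counter + 1);
        }
        Ok(()) // counter < expected is a duplicate and is ignored
    }
}

/// Optional wrapper that restores automatic buffering of early ops.
#[derive(Default)]
pub struct CausalityBarrier {
    inner: StrictLog,
    buffer: BTreeSet<(Actor, u64)>,
}

impl CausalityBarrier {
    /// Forwards the op to the strict log, buffering it if it is early and
    /// releasing any buffered ops that become applicable afterwards.
    pub fn deliver(&mut self, actor: Actor, counter: u64) {
        if self.inner.apply(actor, counter).is_err() {
            self.buffer.insert((actor, counter));
            return;
        }
        loop {
            let next = self.inner.expected(actor);
            if !self.buffer.remove(&(actor, next)) {
                break;
            }
            let _ = self.inner.apply(actor, next);
        }
    }

    pub fn applied(&self) -> &[(Actor, u64)] {
        &self.inner.applied
    }

    pub fn buffered(&self) -> usize {
        self.buffer.len()
    }
}
```

Keeping the buffer outside the CRDT means users who handle redelivery themselves pay nothing for it, while users who want the old behaviour can opt in by wrapping.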
Note - please respect everyone’s right to confidentiality and do not speculate on names.
Feel free to reply below with links to translations of this dev update and moderators will add them here:
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!