Here are some of the main things to highlight since the last dev update:
- sn_node has had its first new releases since July!
- sn_routing has had its first new releases since August 2018!
- We have sn_client tests working and seemingly running stably.
- We have resolved the main issue with section startup in CI.
- We are getting down to the last few remaining failing sn_api tests.
- Working prototype code for the Generation Clock approach to dynamic membership is expected by the end of the week.
- Two important sn_routing PRs have been merged this week: a measure to defend against key selling attacks, and resource proofing of potential new nodes during bootstrap.
Safe Client, Nodes and qp2p
This week was a bit of a milestone for sn_node, with the first release and publish of that crate under its new name, and the first since July. There have been a lot of changes in there in the months since then, as the changelog clearly highlights and as we’ve tried to keep you informed of in these updates. These releases don’t mean that development is finished in sn_node, just that we considered it stable enough to now merge the previously created Continuous Delivery pipeline PR. Development continues at pace, and now each merged PR will result in a further automatically generated new release.
We merged work to improve error handling in sn_node, which clears the path for progression. After removing the caching of Sequences last week, this week we exposed the client to some more normal network conditions, which resulted in several failing tests. With some tweaks to the client tests to account for this, they are now stable again. On top of this, we’ve been looking at section startup, which has been running fine locally but has been having a bit of a hard time in CI. We’ve established that some of this is due to the CI machines being slow, and increasing the timeouts between nodes starting there has been getting us more reliable results. There are still some bugs to be ironed out here.
We have made some more progress with some of the failing tests during end to end testing of the network, specifically regarding the storage of data using the Blob data type. We identified some issues that sneaked in during some of the refactors, and with those fixed we are now seeing the tests passing consistently. We are now moving on to re-enabling the chunk replication system when nodes leave the network. Once that’s done, all Blob flows will be re-enabled in the refactored implementation of sn_node.
Another feature that we have been working on this week is the regulation of storage in a section, and ultimately in the network, via PR#1153. As of that PR, the data-storing nodes monitor their storage on every write they perform, warning their section elders if they reach their max capacity. The elders maintain a register of these filled nodes, and when the number of filled nodes in the same section reaches a threshold, they vote to accept new nodes into their section, closing the gates once supply meets demand, hence maintaining a balance in the network. We will be iterating on this with various tweaks to metrics in the coming days.
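As a rough illustration of the flow described above, here is a minimal sketch in Rust. It is not the code from the PR: the type names (`StorageNode`, `SectionRegister`), the integer node IDs, and the percentage-based threshold are all assumptions made for the example.

```rust
use std::collections::BTreeSet;

/// Hypothetical node identifier (the real network uses key-based names).
type NodeId = u64;

/// A data-storing node that checks its storage on every write.
struct StorageNode {
    id: NodeId,
    used: u64,
    max_capacity: u64,
}

impl StorageNode {
    /// Records a write and returns `true` if the node has now reached its
    /// max capacity and should warn its section elders.
    fn write(&mut self, bytes: u64) -> bool {
        self.used = self.used.saturating_add(bytes);
        self.used >= self.max_capacity
    }
}

/// Elder-side register of the nodes in a section that reported being full.
struct SectionRegister {
    section_size: usize,
    full_nodes: BTreeSet<NodeId>,
    /// Assumed metric: percentage of full nodes at which joins are allowed.
    threshold_percent: usize,
}

impl SectionRegister {
    fn new(section_size: usize, threshold_percent: usize) -> Self {
        Self {
            section_size,
            full_nodes: BTreeSet::new(),
            threshold_percent,
        }
    }

    /// Called when a node warns the elders that it is full.
    fn report_full(&mut self, id: NodeId) {
        self.full_nodes.insert(id);
    }

    /// Called when a new (empty) node has joined the section.
    fn node_joined(&mut self) {
        self.section_size += 1;
    }

    /// The gates are open while the share of full nodes is at or above
    /// the threshold, and closed again once enough new nodes have joined.
    fn accepting_new_nodes(&self) -> bool {
        self.section_size > 0
            && self.full_nodes.len() * 100 >= self.threshold_percent * self.section_size
    }
}
```

With a section of four nodes and a 50% threshold, the gates open once two nodes have reported being full, and close again after one new node joins (two full nodes out of five is below the threshold).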
A PR for Rust-CRDT has been raised to bring LSeq in line with other CRDT types in needing apply to be called before modifying the data type itself. We’re using this in our Sequence data type to ensure that all operations are signed.
Changes to sn_api and CLI continued this past week, adapting to new sn_client APIs, and as mentioned last week, also trying to migrate to new UX terminologies. We have most of the sn_api tests passing now using a local section, and we are now trying to finalise these changes as well as resolve the failing tests, which should hopefully mean the CLI commands and E2E tests will be up and running again.
And finally, to allow increased test coverage of the transfers, payments, and rewards modules, some restructuring of that code and changes to access granularity have been initiated this week.
BRB - Byzantine Reliable Broadcast
Work continues on the Generation Clock approach to dynamic membership. Working prototype code is expected by the end of the week.
On a parallel track, the bft-crdts crate is being split into separate crates in a modularization effort. The idea is to define 3 traits: one for BRB implementation, one for data types to be transmitted and secured, and one for the network layer.
In this manner, implementations of all three traits can be mixed and matched. For example, we can use an in-memory network for test cases and a qp2p impl for routing in Safe Network, while a 3rd party might use a TCP/IP sockets implementation for something else. On the data side, we already have an AT2 bank implementation and a CRDT orswot implementation.
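To make the mix-and-match idea concrete, here is a minimal sketch of what a three-trait split could look like. The trait names, method signatures, and the toy `Counter`, `InMemoryNetwork`, and `NaiveBroadcast` types are all hypothetical and are not taken from the bft-crdts crate. In particular, `NaiveBroadcast` simply forwards operations and provides no Byzantine fault tolerance; it only stands in for the slot a BRB implementation would fill.

```rust
use std::collections::VecDeque;

/// Data types to be transmitted and secured: state changes only via ops.
trait DataType {
    type Op;
    fn apply(&mut self, op: Self::Op);
}

/// Network layer: carries operations between actors.
trait Network<Op> {
    fn send(&mut self, op: Op);
    fn receive(&mut self) -> Option<Op>;
}

/// Broadcast layer: the place where a BRB implementation would plug in.
trait Broadcast {
    fn broadcast<Op, N: Network<Op>>(&mut self, net: &mut N, op: Op);
    /// Applies every pending op to `data`; returns how many were delivered.
    fn deliver_all<D: DataType, N: Network<D::Op>>(
        &mut self,
        net: &mut N,
        data: &mut D,
    ) -> usize;
}

/// In-memory network, the kind of implementation useful for test cases.
struct InMemoryNetwork<Op> {
    queue: VecDeque<Op>,
}

impl<Op> InMemoryNetwork<Op> {
    fn new() -> Self {
        Self { queue: VecDeque::new() }
    }
}

impl<Op> Network<Op> for InMemoryNetwork<Op> {
    fn send(&mut self, op: Op) {
        self.queue.push_back(op);
    }
    fn receive(&mut self) -> Option<Op> {
        self.queue.pop_front()
    }
}

/// Toy data type: a counter whose ops are amounts to add.
struct Counter {
    total: u64,
}

impl DataType for Counter {
    type Op = u64;
    fn apply(&mut self, op: u64) {
        self.total += op;
    }
}

/// Placeholder broadcast with NO Byzantine fault tolerance (illustration only).
struct NaiveBroadcast;

impl Broadcast for NaiveBroadcast {
    fn broadcast<Op, N: Network<Op>>(&mut self, net: &mut N, op: Op) {
        net.send(op);
    }
    fn deliver_all<D: DataType, N: Network<D::Op>>(
        &mut self,
        net: &mut N,
        data: &mut D,
    ) -> usize {
        let mut delivered = 0;
        while let Some(op) = net.receive() {
            data.apply(op);
            delivered += 1;
        }
        delivered
    }
}
```

Because each layer only depends on the traits of the others, swapping `InMemoryNetwork` for a socket-based network, or `Counter` for another data type, would not require touching the broadcast code.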
As mentioned above, sn_node was published and released for the first time in its current form this week. Not to be outdone, the routing team have now also released and published sn_routing, the first release in almost 2.5 years! See the changelog for a flavour of the changes made. As with sn_node, these releases don’t mean that development is finished in sn_routing, just that we considered it stable enough to now merge the previously created Continuous Delivery pipeline PR. Development continues at pace, and now each merged PR will result in a further automatically generated new release.
This week we implemented a measure to defend against key selling attacks. A key selling attack is when a node that has built up its reputation sells its secret key to a potentially malicious entity who can then assume its role and do damage. This type of attack is very hard to completely prevent, but we at least made it difficult for people to access the secret key by making sure the key is not exposed or stored on disk.
We also worked to introduce resource proofing during bootstrap, which has been merged today. This challenges any potential new joining nodes to ensure they are sufficiently qualified to share the network workload. With the resource proofing check in place, all newly joined nodes will be considered adults immediately, so the handling of infant nodes is no longer required.
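This update does not describe how the challenge itself works, so the following is only a generic illustration of a challenge-response resource proof, not the sn_routing implementation. It assumes a proof-of-work style puzzle: the challenger sends a nonce and a difficulty, and the joining node must find a value whose hash has enough trailing zero bits. The function names are hypothetical, and the standard library's `DefaultHasher` is used purely to keep the sketch dependency-free; it is not a cryptographic hash and would not be suitable for a real challenge.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes the challenger's nonce together with a candidate solution.
/// NOTE: DefaultHasher is not cryptographically secure (illustration only).
fn challenge_hash(nonce: u64, candidate: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    nonce.hash(&mut hasher);
    candidate.hash(&mut hasher);
    hasher.finish()
}

/// Challenger side: cheap check that a solution meets the difficulty.
fn verify(nonce: u64, difficulty_bits: u32, solution: u64) -> bool {
    challenge_hash(nonce, solution).trailing_zeros() >= difficulty_bits
}

/// Joining node side: brute-force search, costing about 2^difficulty_bits
/// hash evaluations on average. Gives up after `max_attempts` candidates.
fn solve(nonce: u64, difficulty_bits: u32, max_attempts: u64) -> Option<u64> {
    (0..max_attempts).find(|&candidate| verify(nonce, difficulty_bits, candidate))
}
```

The asymmetry is the point of such a scheme: solving requires many hash evaluations, while verifying requires one, so the challenger can cheaply check that a candidate spent the expected amount of work before admitting it.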
We have also been working on refactoring section update, which is intended to simplify the DKG handling codebase. The two main changes proposed to achieve this are:
- DKG failures will no longer be reported to the current section elders; instead, DKG restarts itself.
- The separate DKG result accumulator will be removed, instead using the regular vote accumulator.
However, discussions during the review process uncovered more issues, which we have decided to resolve as part of this work. Hence the PR has been changed to Draft status and is expected to be ready for review again, with the additions, soon.
Feel free to reply below with links to translations of this dev update and moderators will add them here:
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!