Safe Network Dev Update - January 7, 2021

Summary

Here are some of the main things to highlight since the last dev update:

  • Happy New Year! :fireworks:
  • We are excited to announce that the Dynamic Membership work has been completed and we have today published several Byzantine Reliable Broadcast (BRB) crates on GitHub :tada:. All are linked from the central brb crate.
  • We’ve been working hard, even over the holidays, to improve our errors and comms layer, including creating a new sn_messaging crate.
  • We are in the process of preparing some minor fixes for a new patch release of the CLI.
  • We are also close to having the CLI and authd built with musl, meaning compatibility with a larger range of platforms.
  • We have a new stress testing tool for routing with the PR currently under review. This new tool is designed to discover the limits of routing in terms of how it handles membership changes (churn) and has already brought some issues to our attention.

Testnet Update

Thanks to everyone who has taken time to try out the testnet code released before Christmas. With all of the MaidSafe team now back at their desks, we are continuing to work through some issues we identified before release, and others you highlighted for us. Once we are satisfied that these issues have been resolved we will announce another iteration, with the intention of hosting a public testnet for everyone who is able to connect to it.

Safe Client, Nodes and qp2p

Safe Network Transfers Project Plan
Safe Client Project Plan
Safe Network Node Project Plan

We’ve been working to improve our error story throughout our libs, and have transitioned to using thiserror throughout node/data_types/client/transfers to provide a better error chain and some greater encapsulation of functionality. Previously, we were using a lot of mixed errors, pulling a lot from sn_data_types into other libs. Now each lib has its own specific errors, and errors from lower libs are only propagated wrapped as one of the current lib’s errors.
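As a rough illustration of the per-lib error pattern described above, here is a minimal sketch using only the standard library. The names `DataTypesError`, `NodeError`, `check_owner` and `load_chunk` are hypothetical and are not taken from the actual crates. With thiserror, the `Display`, `source` and `From` boilerplate would typically be derived from attributes rather than written by hand.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type owned by a lower-level lib.
#[derive(Debug)]
pub enum DataTypesError {
    InvalidOwner,
}

impl fmt::Display for DataTypesError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DataTypesError::InvalidOwner => write!(f, "invalid owner"),
        }
    }
}

impl Error for DataTypesError {}

/// Hypothetical error type owned by a higher-level lib.
/// Lower-level errors only surface wrapped in one of its own variants.
#[derive(Debug)]
pub enum NodeError {
    ChunkNotFound(String),
    DataTypes(DataTypesError),
}

impl fmt::Display for NodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            NodeError::ChunkNotFound(name) => write!(f, "chunk not found: {}", name),
            NodeError::DataTypes(inner) => write!(f, "data types error: {}", inner),
        }
    }
}

impl Error for NodeError {
    // Expose the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            NodeError::DataTypes(inner) => Some(inner),
            NodeError::ChunkNotFound(_) => None,
        }
    }
}

// Lets `?` convert a lower-level error into this lib's own error type.
impl From<DataTypesError> for NodeError {
    fn from(e: DataTypesError) -> Self {
        NodeError::DataTypes(e)
    }
}

/// Stand-in for a call into the lower-level lib.
fn check_owner(is_owner: bool) -> Result<(), DataTypesError> {
    if is_owner {
        Ok(())
    } else {
        Err(DataTypesError::InvalidOwner)
    }
}

/// Stand-in for a higher-level operation that returns only `NodeError`.
pub fn load_chunk(name: &str, is_owner: bool) -> Result<String, NodeError> {
    check_owner(is_owner)?; // DataTypesError becomes NodeError::DataTypes here
    if name.is_empty() {
        return Err(NodeError::ChunkNotFound(name.to_string()));
    }
    Ok(format!("chunk:{}", name))
}
```

A caller of `load_chunk` only ever needs to match on `NodeError`, while the original cause remains reachable through `source()`.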

On top of this, we’ve extracted sn_messaging from sn_data_types into its own crate in order to separate out our comms layer, as well as errors that we’ll be sending to/from other nodes and clients. This is a small step towards more clearly defining a network ‘API’ of messages and errors and it provides cleaner separation of errors from internal libraries to the client.

As part of this effort, we are exploring different serialisation types, with the end goal of having one which is programming-language agnostic. At the moment we are focusing on a simple JSON serialisation (as opposed to the currently used bincode), but are also playing around with MessagePack.

The knock-on effect of all this has been some cleaner code, and much clearer error flows throughout all the involved libs, which is great.

In tandem, we’ve been removing client “challenges” from the node/client bootstrap flow. These were previously used to verify a client was holding keys, in order to prevent message replay attacks. But with idempotency coming from AT2 and CRDT data types, this will be handled there. This brings yet more simplification for both the client and the node, and further clarifies network operations as signed messages only.
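To show why idempotent operations make a replayed message harmless, here is a minimal sketch. It is a deliberate simplification and not how AT2 or the CRDT data types are implemented in the Safe codebase: `Op`, `Replica` and the per-actor counter scheme are hypothetical, and signature verification is omitted entirely.

```rust
use std::collections::HashMap;

/// Hypothetical operation: an actor id plus a per-actor sequence number.
#[derive(Debug, Clone, PartialEq)]
pub struct Op {
    pub actor: u64,
    pub counter: u64,
    pub amount: u64,
}

/// Hypothetical replica state that tracks what has already been applied.
#[derive(Default)]
pub struct Replica {
    /// Highest counter applied so far for each actor.
    applied: HashMap<u64, u64>,
    pub total: u64,
}

impl Replica {
    /// Applies `op` only if it is the next operation expected from its actor.
    /// Returns `true` if state changed, `false` for a replayed or out-of-order op.
    pub fn apply(&mut self, op: &Op) -> bool {
        let last = self.applied.get(&op.actor).copied().unwrap_or(0);
        if op.counter != last + 1 {
            return false;
        }
        self.applied.insert(op.actor, op.counter);
        self.total += op.amount;
        true
    }
}
```

In this sketch, sending the same `Op` twice changes the state only once, so no separate challenge step is needed to defend against a replay.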

Previously, to prevent key-selling attacks on the network, we removed all SecretKey-exposing APIs from sn_routing and contained the keys within that crate. However, we found that this removal caused multiple complications down the dependency tree, and so agreed to bring back those APIs to allow us to move ahead quickly with the testnet during the holidays. We have decided to tackle this problem head-on right away, and have started refactoring the sn_transfers and sn_node crates, where we hold and use those SecretKeys outside of sn_routing.

The signature aggregation that sn_node performs when exchanging messages between KeySection and DataSection possibly duplicates routing’s consensus accumulation work, as both are carried out by elder nodes. We are investigating this and refactoring to remove that part from the sn_node crate and instead trust the consensus messages from sn_routing.

And a final wee bit of work is underway removing stream management from nodes. This was put in place to maintain comms with clients, but with recent qp2p changes we can rely on connection pooling there to handle this for us, and so remove a lot of complexity from the node’s client handling. We are also in the process of refactoring the qp2p examples into separate parts to demonstrate the echo_service and messaging systems clearly and distinctly. We are doing trial runs of these examples with manual port forwarding, to potentially support routers that are not compatible with IGD in future testnets.

API and CLI

Our focus has been on changes and improvements on the network side. However, we have also been taking care of some minor bugs reported by the community while using the testnet, and so are in the process of preparing some minor fixes for a new patch release of the CLI.

Also, we are trying to get our next release of CLI and authd to be built with musl, which as we know will allow us to run these applications on many more platforms using the same released artifacts. We were able to build them manually already (thanks a lot to @mav and @tfa for their input and contributions to this), so we are now looking to get this into our CI in the coming days.

BRB - Byzantine Reliable Broadcast

We are excited to announce that the Dynamic Membership work has been completed and we have today published several Byzantine Reliable Broadcast (BRB) crates on GitHub :tada:. All are linked from the central brb crate.

The BRB system consists of:
1. The core BRB broadcast protocol for members of a quorum to replicate data in BFT fashion.
2. The dynamic membership protocol for nodes to dynamically join and leave an active quorum.
3. Data type wrappers that encapsulate compatible data types (e.g. CRDTs) for transfer via BRB.
4. Comprehensive tests to verify correctness.
5. brb_node_qp2p: an example CLI app/node for manually invoking BRB functionality.

For those interested in digging into the details, slides are available and provide further insight into the system and protocols.

Routing

Project Plan

As we all know, relocation is good for the network, facilitating node ageing amongst other things. However, we observed that in some situations we were over-relocating. For example, we were relocating even when we did not have enough elders due to churn, and also relocating nodes when they had only just joined. To resolve this, we set up some criteria to avoid over-relocation, aiming to keep the network stable in these scenarios.
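The two criteria mentioned above could be expressed along the following lines. This is only a sketch: `MIN_ELDERS`, `SectionState`, `Candidate` and `may_relocate` are hypothetical names, and the threshold value is an assumption, since the update does not state the actual numbers or how routing represents this state.

```rust
/// Hypothetical minimum elder count; the real threshold is not given in the update.
pub const MIN_ELDERS: usize = 5;

/// Hypothetical view of the section at the time of a churn event.
pub struct SectionState {
    pub elder_count: usize,
}

/// Hypothetical view of a node being considered for relocation.
pub struct Candidate {
    pub just_joined: bool,
}

/// Allows relocation only when the section has enough elders
/// and the candidate is not a newly joined node.
pub fn may_relocate(section: &SectionState, node: &Candidate) -> bool {
    section.elder_count >= MIN_ELDERS && !node.just_joined
}
```

Keeping both checks in one predicate makes it easy to see that either condition alone is enough to block a relocation.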

An API change was also made to return section info for a specified target name. This is mainly for use by sn_node in upcoming refactoring work.

We put together a stress test for routing (PR under review). It’s a little tool designed to discover the limits of routing in terms of how it handles membership changes (churn). It generates random churn according to a configurable schedule. It then periodically outputs various useful information about the network, and measures the network health. This tool will be very useful for the upcoming work of integrating the new dynamic membership solution. Running it on the current version of routing, it already discovered some issues we have around relocations and splits, which we will look into soon. This is actually good, because the first step of fixing a problem is knowing about the problem :smile: Here is a little screenshot of the tool’s output:

Useful Links


Feel free to reply below with links to translations of this dev update and moderators will add them here:

:russia: Russian ; :germany: German ; :spain: Spanish ; :france: French

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

74 Likes

Zeroth! Starting as I mean to go on!

Looks like some were busy over the holidays. I wasn’t expecting so much activity, so this is nice. Thanks all at Maidsafe. Hope you had a nice break as well.

Meanwhile in Safe Git Portal land, things have not been resting either…

43 Likes

First

Happy New Year :slight_smile:

17 Likes

Thanks so much to the entire Maidsafe team for all of your hard work! Especially over the holidays! And Happy New Year everyone! :racehorse:

30 Likes

A bumper update with lots of meat!

Thank you for working so hard, even through the holidays.

…and a Happy New Year!!!

22 Likes

Merry Xmas, happy new year and let’s sink our teeth into 2021 - the year of Fleming :slight_smile:

Great first update!!! I’m happy that SafeNet-Thursdays are back

20 Likes

Best update yet! I don’t have much to say besides Bravo! And thank you @maidsafe team for your constant dedication and effort.

18 Likes

Interesting numbering system you are using there, @happybeing. I have always admired your innovation and alternative viewpoints. :slight_smile:
Like you I am heartened and not a little surprised at just how much meat there is in this update. Truly this will be the Year of Fleming.

And a special Happy New Year to @MaxSan - cos it might just be the Year of Maxwell and all :slight_smile:
Excellent work folks, we are not worthy.

22 Likes

Comes of studying Physics. You aren’t the first to think they’ve got the first (law of thermodynamics) only to realise you missed one. Amazing I remember any of that. I couldn’t tell you what the zeroth law is though.

12 Likes

I’m just amazed at your ability to magically look as if it was you who got the zeroth response :slight_smile:
I don’t know how you did it but well done.
Dunno about thermodynamics but I know who really got the initial comment in and I will not have this sacred victory torn from my grasp by the forces of fake-news and err err err.

9 Likes

Would be 11th if others would not keep posting twice before I have a chance ! :open_mouth:

Best update of 2021 :+1:

I was wondering that the testnet over-stresses 1 PC as 12 nodes… a bit slow perhaps to realize that later we might host one node each and so a lot more performance will follow.

10 Likes

The zeroth law of thermodynamics states that if two thermodynamic systems are each in thermal equilibrium with a third one, then they are in thermal equilibrium with each other. Accordingly, thermal equilibrium between systems is a transitive relation.

Which just sounds like narcissism and attention-grabbing by revisionist mathematicians

2 Likes

Looking forward to this! Let’s get the core tech proven and then we can shout it from the roof tops!

17 Likes

Preach it, brother.

We should all be making an effort to run the testnet and probe the limits. Thanks to @folaht on the Dev forum, I think I am ready to get a LAN example working between 2-3 different machines if I can get the configurations correct on each box. Strange, although this is a decentralised system, IIUC to get this running I need one machine as an authd “server”

Anyhow off to download and build the latest versions and play about once I have fed the family.

8 Likes

There have been a lot of bugs crushed recently and some important PRs are in flight. So no rush, a few days will give you a much better experience.

27 Likes

Woohoo! This is great news! Thank you for all the hard work. I hope the team managed to enjoy well-deserved holidays. I wish everyone a happy new Safe year.

13 Likes

is that Maidsafe corporate-speak for " We can make it work in-house but there are many rough edges - please don’t inundate us with daft questions about stuff that we know we will sort soon"?

8 Likes

Great update, all very positive and lots of simplifications which is good to see. On the PR

To avoid over relocation, we only allow relocation when there is
enough elder nodes. AND, the newly joined node will not be relocated
immediately.

What is meant by ‘immediately’? Isn’t rapid relocation of new nodes an important security measure?

13 Likes