Safe Network Dev Update - November 26, 2020

Summary

Here are some of the main things to highlight since the last dev update:

Safe Client, Nodes and qp2p

Safe Network Transfers Project Plan
Safe Client Project Plan
Safe Network Node Project Plan

After adding Sequence CRDT operation signing last week, this week we’ve been integrating it into the stack, with some changes in sn_node and sn_client to get all operations signed before they are passed on to the network. This highlighted some other issues with local caching of Sequences (when to use local vs network, etc.). We’ve opted to remove this cache for now, in a separate PR, for simplicity. This has helped shore up the test suite, allowing us to see some CRDT type flows in action more clearly. For example, changes pushed to the network are not immediately visible when pulled back, so the tests have to wait for the expected changes to appear.

With the above changes in, we’ve also been debugging some new section startup issues that were raised off the back of these, with some small client tweaks added to prevent failure if an elder isn’t available (but enough others still are). With this in, we can now see a subset of sn_client tests passing against an eleven-member section on CI.

The rest of the failures that we are seeing are mostly due to a few hidden bugs in choosing destinations and in chunk retrieval during the blob chunk replication that we enabled last week. Rest assured, we are well underway with fixes for them and are ticking off the last of the failing tests from the client e2e test suite. Searching for these errors also highlighted that some elements of messaging were still missing and are now required to bring the control flow / error handling one step closer to what we call “lazy messaging”. Work is underway to address this.

Lazy Messaging is where we get a message we cannot handle for whatever reason (out of sync, future sequence number, etc.) and we error that message back to the sender along with our last known history. The sender then knows they need to provide us with the missing link. We can also do the inverse, without an error, and update the sender if it is they who are behind. This saves us from holding messages until they can be ordered, which could be exploited to attack the network and would require more complex code. Lazy messaging is much closer to a message-passing actor model, and we have extended that to handle partially ordered events.
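To make the idea a little more concrete, here is a minimal sketch of the three outcomes described above. It is not the sn_node or routing implementation: `Reply`, `Receiver` and the use of a plain `u64` sequence number as "history" are all hypothetical names and simplifications chosen for illustration.

```rust
/// Outcome of handling one incoming message (hypothetical type).
#[derive(Debug, PartialEq)]
enum Reply {
    /// The message was the next one expected and has been applied.
    Applied,
    /// We are behind: the message is bounced back with the last sequence
    /// number we know, so the sender can supply what we are missing.
    ErrorBehind { last_known: u64 },
    /// The sender is behind: no error, we tell them what they are missing.
    UpdateSender { missing: Vec<u64> },
}

/// A receiver that never buffers out-of-order messages (hypothetical type).
struct Receiver {
    /// Sequence numbers applied so far, in order. Stands in for real history.
    history: Vec<u64>,
}

impl Receiver {
    fn new() -> Self {
        Receiver { history: Vec::new() }
    }

    /// Last sequence number applied, or 0 if nothing has been applied yet.
    fn last_known(&self) -> u64 {
        self.history.last().copied().unwrap_or(0)
    }

    fn handle(&mut self, seq: u64) -> Reply {
        let expected = self.last_known() + 1;
        if seq == expected {
            self.history.push(seq);
            Reply::Applied
        } else if seq > expected {
            // A message from "the future": do not hold it, send it back.
            Reply::ErrorBehind { last_known: self.last_known() }
        } else {
            // The sender is working from an older state than ours.
            let missing = self.history.iter().copied().filter(|s| *s > seq).collect();
            Reply::UpdateSender { missing }
        }
    }
}
```

Note that the receiver holds no queue of pending messages at all, which is the property that removes the buffering attack surface mentioned above.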

With a change in the new node joining dynamics in Routing (see PR#2234), we’ve also begun updating sn_node so that the nodes take responsibility for allowing new nodes to join the network. Effectively, elders of the network will now keep track of the supply and demand of resources in their section and accordingly ask routing to let queuing new nodes join their section.

We’ve also started getting the authd and CLI adapted to the new UI/UX terminology, for example, moving away from “Accounts” to “Safes”, as well as making the necessary changes to have authd store the applications’ keypairs on the network using a Map as the storage data type for the “Safe”. We’ll be continuing with this task to get all CLI commands and auth features aligned with these new terminologies as well as with the new sn_client API.

Once done with the above updates, fixes, and any bugs they throw up, we’ll be all set to fire up our internal testnets once again at full throttle, tidy up the various modules, double-check their stability, tie up loose ends and hopefully deliver an early Christmas present to you folks! :wink:

BRB - Byzantine Reliable Broadcast

This week our consultant has advanced the Generation Clock idea mentioned last week and presented a pseudo-code algorithm to the team for comment. This hybrid approach imposes a total order over infrequent join and leave operations, but only a partial order over much more frequent data operations. In plain English, this means that join/leave must be bounded (i.e. we cannot allow non-responsive nodes to exist) and use a form of total order, but we can handle many leaves at once, etc, whereas regular data operations can occur with high levels of concurrency, so long as each is from a different source (Actor in CRDT parlance). So far the proposal seems solid and the next step is to implement it in code and write some tests. More on that next week.
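The pseudo-code itself has not been published here, so the following is only a rough sketch of the general shape of such a hybrid, not the Generation Clock algorithm. It assumes, purely for illustration, that membership changes carry a generation number that must increase by exactly one (a total order), while data operations carry a per-actor counter (a partial order). `Section`, `Rejected`, `join` and `data_op` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

type Actor = u8;

#[derive(Debug, PartialEq)]
enum Rejected {
    WrongGeneration,
    NotAMember,
    OutOfOrder,
}

struct Section {
    /// Bumped by every membership change: membership is totally ordered.
    generation: u64,
    members: BTreeSet<Actor>,
    /// Next expected counter per actor: data ops are only ordered per source.
    next_counter: BTreeMap<Actor, u64>,
}

impl Section {
    fn new() -> Self {
        Section {
            generation: 0,
            members: BTreeSet::new(),
            next_counter: BTreeMap::new(),
        }
    }

    /// A join is accepted only if it carries exactly the next generation.
    fn join(&mut self, generation: u64, actor: Actor) -> Result<(), Rejected> {
        if generation != self.generation + 1 {
            return Err(Rejected::WrongGeneration);
        }
        self.generation = generation;
        self.members.insert(actor);
        Ok(())
    }

    /// Data ops from different actors may arrive in any interleaving;
    /// ops from the same actor must arrive in counter order.
    fn data_op(&mut self, actor: Actor, counter: u64) -> Result<(), Rejected> {
        if !self.members.contains(&actor) {
            return Err(Rejected::NotAMember);
        }
        let next = self.next_counter.entry(actor).or_insert(1);
        if counter != *next {
            return Err(Rejected::OutOfOrder);
        }
        *next += 1;
        Ok(())
    }
}
```

In this simplified picture, two members can each submit their first data operation in either order, whereas a join carrying generation 3 is refused until generation 2 has been seen.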

Routing

Project Plan

As discussed in previous weeks, the work to improve lost peer detection was this week approved and merged. This takes advantage of the connection pooling feature in qp2p. This change means that the routing code base has been simplified and now allows more complex integration tests to be added to verify the features of the production code.

Some API feature work (an indication of section start-up, an Age getter, and a notification when the key changes during relocation) was also completed this week. With these in place, nodes will now be better informed of the routing status during start-up and of the updated keypair being used.

While testing, we observed an issue where during bootstrap, when the NodeApproval message was followed immediately by another message, say Relocate, bootstrap was completed after the NodeApproval was processed. This left any following message, such as Relocate, in the intermediary channel buffer, never to be taken out and processed, i.e. we were losing that message. We’ve merged a PR, “Fix losing messages during bootstrap”, to resolve this issue. It removes the intermediary channel, replacing it with a simple wrapper around the ConnectionEvent receiver. Thus the “push/pull” model is changed into a simple “pull” model. This way, a message is never retrieved from the channel unless we are ready to process it.
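As a rough illustration of why a pure “pull” model avoids the loss, here is a small sketch using the standard library’s `mpsc` channel in place of the real ConnectionEvent receiver. `Message`, `Incoming` and `bootstrap` are hypothetical stand-ins, not the routing types.

```rust
use std::sync::mpsc::Receiver;

#[derive(Debug, PartialEq)]
enum Message {
    NodeApproval,
    Relocate,
}

/// Thin wrapper around the event receiver. There is no intermediary buffer:
/// a message leaves the channel only when someone asks for it.
struct Incoming {
    events: Receiver<Message>,
}

impl Incoming {
    /// Pull exactly one message. Returns None once all senders are gone
    /// and the channel is empty.
    fn next(&mut self) -> Option<Message> {
        self.events.recv().ok()
    }
}

/// Bootstrap pulls messages until it is approved, then hands the same
/// wrapper on to the next stage. Anything sent after NodeApproval is still
/// sitting in the channel, untouched.
fn bootstrap(mut incoming: Incoming) -> Incoming {
    while let Some(msg) = incoming.next() {
        if msg == Message::NodeApproval {
            break;
        }
    }
    incoming
}
```

If NodeApproval and Relocate are sent back to back, the stage that runs after `bootstrap` pulls Relocate as its first message, because bootstrap never took it out of the channel.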

The work to allow nodes to tell routing whether or not to accept new nodes, mentioned in last week’s update, was also completed and merged this week. Routing assumes the elder-nodes will track all the adult-nodes in the section and, when they detect that the average storage capacity (or some other resource) has become too low, flip the flag so the section starts accepting new nodes. All the elder-nodes should detect this more or less at the same time, so that consensus can be reached. In addition to flipping the flag, if the section already has infants, one of them will be promoted with its age increased by one, effectively making it relocate and immediately join back as an adult.
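The elder-side check could look something like the sketch below. The actual metric and threshold used by sn_node are not given in this update, so `Adult`, `should_accept_new_nodes` and the "average free space" measure are assumptions made for illustration only.

```rust
/// Storage figures an elder tracks for one adult (hypothetical type).
struct Adult {
    used: u64,
    capacity: u64,
}

/// Returns true when the average free space per adult has dropped below
/// `min_avg_free`, i.e. when the section should start accepting new nodes.
/// The function is deterministic, so elders holding the same view of the
/// adults arrive at the same answer.
fn should_accept_new_nodes(adults: &[Adult], min_avg_free: u64) -> bool {
    if adults.is_empty() {
        // No storage at all: the section clearly needs nodes.
        return true;
    }
    let total_free: u64 = adults
        .iter()
        .map(|a| a.capacity.saturating_sub(a.used))
        .sum();
    total_free / (adults.len() as u64) < min_avg_free
}
```

An elder would evaluate this whenever its view of the adults changes and pass the result on to routing as the join flag.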

Safe Network App & UX

Feature Tracker / Screens & Flows / Onboarding Prototype

Thanks for all your feedback on the proposed changes to the Safe lexicon. We’ve begun to filter these changes through the UX flows and Safe Network App prototypes, and you should see them popping up in the various Figma files as we work through it all.

While not directly related to the language changes, one interesting little side project that popped out of the work was a revisit of some of the onboarding flows, for example, when a user is ready to create their own Safe.

If you recall, the existing version set out all the options a user had for creating a Safe (or account as it was at the time), and let them select the appropriate route, with step-by-step instructions.

It looked like this:

But could we make it smoother? Could we perhaps make it less daunting, help a user quickly get their Safe up and running without any outside assistance, and then have them follow on to earning Safe Network Tokens to boot?

Here’s a small clickable prototype of the new approach—happy path only—just to give you a flavour.

This won’t be the only route through to getting a Safe. There will be alternative flows with a little less hand holding, but for the first-time user, it will be interesting to see how this compares to the existing approach.

It’s also a pattern that could be applied to other areas of the app—such as earning tokens the first time, creating a SafeID, or choosing strong credentials.

It’s a bit more work for us from a design and flow-logic point of view, but if it is smoother and less intimidating for the user, and gets more safes and nodes up on the Network, it’ll pay off.

Useful Links


Feel free to reply below with links to translations of this dev update and moderators will add them here:

:russia: Russian; :germany: German; :spain: Spanish; :france: French; :bulgaria: Bulgarian

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

78 Likes

:stuck_out_tongue_winking_eye:

24 Likes

Second …!!

Woot!! Move over Santa; MaidSafe is fired up and ready to go. Looks like it’s gonna be an amazing Christmas and even more fantastic New Year!! :smile::tada:

34 Likes

Thirdst!!!

19 Likes

:tada:

Breathing heavily…

23 Likes

Brilliant.

Great idea to generate an invite.

18 Likes

Yes please! I’ve been a good boy, and this is all I’ve ever wanted for Christmas :christmas_tree: Great work team!

21 Likes

Do these UIs only exist as mockups? They are very tidy.

5 Likes

As expected - solid updates to fill in the gaps, and an extra bonus of some more clear thinking and understanding of the users’ needs from Jim Collinson.

Thank you to everyone involved.

17 Likes

Looking great Maidsafe team! Thanks for your amazing efforts. I’m continually amazed at the progress. When I think about it, this is really WORLD-CLASS … I mean the technology being built here is TRULY akin to the Manhattan project or the early US space program of putting man on the moon. The complexity of it all boggles the mind, yet with patience and careful engineering you are steering your way toward real solutions to problems that have never before been solved.

I don’t know if there is a Nobel prize for world changing software, but if there is, this project and the team should definitely get it.

Have a great week all.

Cheers

21 Likes

Could do with an abc at some point on the way sections evolve… expect it might be quite interesting to understand the options and logic, then the way that the network evolves over time. It’ll be fun perhaps later seeing graphics of the mechanics in motion.

Expecting that an existing node might have reason on occasion to move… is the age of node acknowledged by other sections… so, do other sections adopt existing nodes and prefer those over new nodes… and I guess the data churns on a change of section entirely but still a good node is more useful than random new one?

So many questions… but don’t let that distract you from issues and bugs… the talk of early xmas presents is great. :smiley:

Edit: btw is the figma working … not seeing anything in Firefox embedded or direct link.

15 Likes

Thanks so much to the entire Maidsafe team for all of your hard work! :racehorse:

12 Likes

Great update!

@JimCollinson My guess is that Snapp isn’t part of the Christmas present?

8 Likes

Really looking forward to an early Christmas gift.

8 Likes

I’m waiting for Christmas!!

10 Likes

That’s correct, it’s a separate strand of work that you’ll see coming together post-Fleming.

17 Likes

Let’s go Maidsafe team! Please turn this Christmas into the only good thing of this horrible 2020 :santa:

14 Likes

The package is getting tighter and tighter! Very nice!

I’m curious about this: what does non-responsive mean? How is it defined, and how is a node recognized as non-responsive?

8 Likes

Regarding the repetitious pattern of:

let mut op = data.create_unsigned_private_policy_op(owner, permissions)?;
let bytes = bincode::serialize(&op).map_err(|err| err.as_ref().to_string())?;
let signature = self.keypair.sign(&bytes);

What’s the rationale of that API rather than something that hides more of the boilerplate code such as:

let mut op = data.create_unsigned_private_policy_op(owner, permissions)?;
let signature = self.keypair.sign_op(&op);
8 Likes

What is the goal to be delivered in the Christmas present?

1 Like