Here are some of the main things to highlight since the last dev update:
- Last Friday’s open house community Safe Chat was a great success! Many thanks to all involved, particularly @sotros25 and @m3data for their excellent presentations.
- @jimcollinson is running an AMA over on Reddit - please do get involved!
- Deep diving into and debugging multi-section networks has been the main theme this week, with several new error handling catches introduced as we weed out issues that can cause networks to fail.
- New versions of sn_authd (v0.2.0) and the CLI (v0.20.0) have just been released, making them compatible with the latest
- We’ve now completed development on the MerkleReg mentioned last week, and work has started to integrate it into sn_data_types.
- This week saw the first files transmitted from one SNFS node to another via BRB.
- In routing we finally merged the fork resolution PR.
- Also in routing, we merged three separate PRs (#2349, #2351 and #2336) implementing fixes for various issues that came up during stress-testing. Routing is now considered to be ready for a stable testnet.
Safe Client, Nodes and qp2p
Closer and closer…
We’ve been continuing to test and debug multi-section networks this last week. We’re tantalisingly closer than ever, but still not there yet (it’s frustrating for us too!). It’s an interesting place to be debugging: there are a lot of logs and messages, and they arrive in all sorts of orders, so we’re really testing the edges here.
Some minor rough edges have been filed down. We’re now handling error situations that arise when a transfer (from a section split, or indeed any transfer) comes in and throws an error because we’ve already seen and applied it, making it no longer valid. We’re now also handling some sn_transfers errors that can occur when there’s no history - as with new Elders which don’t yet have any. Previously this caused an error and broke the flow.
Tackling the root issue
But the larger issue in progressing section splits at the sn_node level is getting the new Elders of the respective sibling sections up to speed with the parent section’s Elders, and having them handle requests as if they were already Elders. Without that, the transition would essentially deadlock. For now, as long as they don’t yet have everything needed to be an Elder, some of those requests will return an error - but that’s basically fine.
We’re progressing two tasks here in parallel: one making some quick adjustments that allow the above to get us going for the testnet, and another making adjustments at the sn_routing level so the transitions are better coordinated. In doing so we’re also progressing our ambition to have all engineers involved end-to-end in some way, which we believe is necessary for a well-functioning system of this sort.
Also this week, on the Safe Browser, we spent a wee bit of time dusting off the repo (it’s been a while!) and updating dependencies, so we won’t have a pile of security issues when we’re ready to go there. This isn’t headline-grabbing news, but it’s one less thing to tackle once we’re satisfied with testnet stability (plus @bzee seems to be progressing well with his napi conversion of sn_nodejs, which is great to see!).
API and CLI
A new version of sn_authd (v0.2.0) and the CLI (v0.20.0) have just been released with some enhancements to the way Files and NRS containers are stored on the network, as well as new bin-version subcommands for auth & node to query their binary versions. As usual, you can use the CLI to upgrade ($ safe update and $ safe auth update), or install the latest CLI as detailed in the User Guide. These new versions are compatible with the latest
As mentioned last week, we’ve been migrating our Sequence data type to contain an immutable Policy, which removes all the complexities that mutable Policies bring to our CRDT approach. With that now in place, this past week we were able to adapt all our APIs and messaging types across the board to remove the ability to mutate Policies.
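To illustrate the idea, here’s a minimal, hypothetical sketch (not the actual sn_data_types API, and the names are made up): the Policy is supplied once at construction, and the type simply exposes no way to change it afterwards.

```rust
// Hypothetical sketch: the Policy is fixed at construction time, so
// replicas can never disagree about permissions mid-history.
#[derive(Debug, PartialEq)]
struct Policy {
    owner: String,
}

struct Sequence {
    policy: Policy,        // set once, immutable thereafter
    entries: Vec<Vec<u8>>, // the CRDT-managed data
}

impl Sequence {
    fn new(policy: Policy) -> Self {
        Self { policy, entries: Vec::new() }
    }

    fn append(&mut self, entry: Vec<u8>) {
        self.entries.push(entry);
    }

    // Read-only access: there is deliberately no `set_policy`.
    fn policy(&self) -> &Policy {
        &self.policy
    }
}
```

Because no mutator exists, there is no Policy-change operation for the CRDT machinery to order or reconcile, which is exactly the complexity being removed.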
Due to refactoring and other changes over the previous months to the way messages are sent by the client to the network, we had to temporarily disable our sn_api E2E tests in CI, since they need to be adapted to the new, more asynchronous nature of messaging. We have now started to slowly adapt them to get them back into our CI system.
CRDT - Conflict-free Replicated Data Types
We’ve now completed development on the MerkleReg mentioned last week (rust-crdt#111), and work has started to integrate it into sn_data_types. As a reminder, the MerkleReg will be the new backing CRDT for the Sequence type. It supports a branching history that can be traversed (similar to a git branch history).
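As an illustration of the concept (a simplified sketch, not the actual rust-crdt implementation), a Merkle register can be modelled as a DAG of writes: each write hashes its value together with the hashes of the parent writes it supersedes, concurrent writes leave multiple heads, and a later write citing both heads resolves the fork.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};

// A write stores its value plus the hashes of the writes it supersedes.
struct WriteNode {
    value: String,
    parents: Vec<u64>,
}

#[derive(Default)]
struct MerkleReg {
    nodes: HashMap<u64, WriteNode>,
    heads: HashSet<u64>, // writes not yet superseded by a child
}

impl MerkleReg {
    // Hash the value together with its parents to get a content address.
    fn write(&mut self, value: &str, parents: &[u64]) -> u64 {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        parents.hash(&mut hasher);
        let id = hasher.finish();
        for parent in parents {
            self.heads.remove(parent); // parent is now superseded
        }
        self.nodes.insert(id, WriteNode { value: value.into(), parents: parents.to_vec() });
        self.heads.insert(id);
        id
    }

    // Current value(s): one per unresolved head of the history DAG.
    fn read(&self) -> Vec<&str> {
        self.heads.iter().map(|h| self.nodes[h].value.as_str()).collect()
    }

    // History can be traversed parent-by-parent, much like `git log`.
    fn parents_of(&self, id: u64) -> &[u64] {
        &self.nodes[&id].parents
    }
}
```

Two concurrent writes over the same parent produce two heads (a fork); a merge write naming both heads brings the register back to a single value.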
Also this week, the GList/List CRDTs were found to have a problem when inserting between indices that used the same identifier; this has been swiftly resolved in rust-crdt#112.
BRB - Byzantine Reliable Broadcast
The Safe Network File System (SNFS) integration with BRB has been proving very fruitful in discovering some missing features in BRB:
- The client was not being notified when an Op was applied. Clients need this information to resend potentially dropped packets. This has been resolved with
- When an operation was executed by BRB on a client, it would rely on the network stack to send packets to itself. This caused race conditions for highly concurrent clients when Ops are executed in rapid succession. We now short-circuit these packets to ourselves and handle them before returning to the client: brb#29.
- To deal with dropped packets, we’ve added an API to re-send packets for which we had not yet received a response: brb#29.
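The short-circuiting described above can be sketched roughly as follows (a hypothetical illustration with made-up types, not the actual brb code): if a packet is addressed to ourselves, we handle it immediately instead of handing it to the network stack.

```rust
// Hypothetical sketch of short-circuiting self-addressed packets.
#[derive(Debug, PartialEq)]
struct Packet {
    dest: u32,
    payload: String,
}

struct Node {
    id: u32,
    handled: Vec<Packet>,  // packets applied locally
    outbound: Vec<Packet>, // packets queued for the network stack
}

impl Node {
    fn send(&mut self, pkt: Packet) {
        if pkt.dest == self.id {
            // Short-circuit: handle before returning to the caller, so a
            // rapid succession of Ops can't race against packets that are
            // still in flight back to us through the network.
            self.handled.push(pkt);
        } else {
            self.outbound.push(pkt);
        }
    }
}
```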
SNFS - Safe Network File System
Good news! This week saw the first files transmitted from one SNFS node to another via BRB.
We initially performed the BRB transmission synchronously with the FUSE operation, but this proved too slow. An improvement was made to queue operations and return immediately to FUSE; a separate thread then lazily sends operations to peers in source order and ensures each is applied. This is a necessary step towards a filesystem that can be used offline and then synchronized with peers when connectivity is restored. With this change in place, the filesystem feels, speed-wise, like a regular operating-system filesystem such as ext4 on Linux with an SSD. Perhaps not quite as fast yet, but the same order of magnitude.
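The queue-and-return pattern described above can be sketched like this (a simplified illustration with made-up types and names, not the actual SNFS code): the "FUSE handler" side enqueues the operation and returns at once, while a background thread drains the queue in source order.

```rust
use std::sync::mpsc;
use std::thread;

// Made-up filesystem operations for the sketch.
#[derive(Debug)]
enum FsOp {
    Mkdir { path: String },
    Write { path: String, bytes: Vec<u8> },
}

// Enqueue ops and return immediately; a background thread drains the
// queue and "sends" each op in source order (here we just record them,
// where the real system would do a BRB broadcast to peers).
fn queue_and_send(ops: Vec<FsOp>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<FsOp>();

    // Background sender thread: the channel preserves source order.
    let sender = thread::spawn(move || {
        let mut sent = Vec::new();
        for op in rx {
            sent.push(format!("{:?}", op));
        }
        sent
    });

    // The "FUSE handler" side: enqueue and return without waiting.
    for op in ops {
        tx.send(op).expect("sender thread alive");
    }
    drop(tx); // close the channel so the sender thread can finish

    sender.join().expect("sender thread panicked")
}
```

Because the queue decouples the local filesystem operation from network delivery, the same structure also supports the offline case: operations accumulate while disconnected and drain once connectivity returns.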
In testing yesterday we discovered that two threads keep using more CPU than they should after heavy communications have completed and the process is otherwise idle. Both threads are busy executing Quinn (network) code for unknown reasons. A new version of Quinn was released this week which may fix this, so we will test it as soon as possible.
Routing
This week we finally merged the fork resolution PR. The crux of the PR is a new implementation of the SectionChain data structure, which is now a proper CRDT. This means it guarantees (eventual) consistency regardless of the order in which the operations are applied, how they are grouped, or even whether they are duplicated. Even if multiple distinct keys are inserted into it, everybody will eventually agree on which one is the most recent, and thus who the current Elders are (because each section key is uniquely associated with a single group of Elders). And all of this is achieved without involving any complicated consensus mechanism. This is precisely what we wanted when we removed Parsec, and now it’s finally here.
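To illustrate why this style of CRDT is order-independent, here’s a toy sketch (with u32 "keys" standing in for real section keys; not the actual SectionChain code): the chain is modelled as a grow-only set of parent→child links, merging is plain set union, and the current key is chosen deterministically from whatever state has been merged so far.

```rust
use std::collections::{BTreeMap, BTreeSet};

// A grow-only set of (parent_key, child_key) links.
#[derive(Clone, Default, Debug, PartialEq)]
struct Chain {
    links: BTreeSet<(u32, u32)>,
}

impl Chain {
    fn insert(&mut self, parent: u32, child: u32) {
        self.links.insert((parent, child));
    }

    // Merge is plain set union: commutative, associative, and idempotent,
    // so replicas converge no matter how updates are ordered, grouped, or
    // duplicated.
    fn merge(&mut self, other: &Chain) {
        self.links.extend(other.links.iter().cloned());
    }

    // Pick the "current" key deterministically from the merged state: the
    // deepest key reachable from genesis, ties broken by the key itself.
    // Assumes the chain is acyclic, as a hash chain is.
    fn current_key(&self, genesis: u32) -> u32 {
        let mut depth: BTreeMap<u32, u32> = BTreeMap::new();
        depth.insert(genesis, 0);
        loop {
            let mut changed = false;
            for &(parent, child) in &self.links {
                if let Some(&d) = depth.get(&parent) {
                    if depth.get(&child).map_or(true, |&dc| dc < d + 1) {
                        depth.insert(child, d + 1);
                        changed = true;
                    }
                }
            }
            if !changed {
                break;
            }
        }
        depth
            .into_iter()
            .map(|(key, d)| (d, key))
            .max()
            .map(|(_, key)| key)
            .expect("genesis is always present")
    }
}
```

Because the winner is a pure function of the merged set, two replicas that have seen the same links always name the same current key, without any round of consensus.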
We also merged a number of smaller fixes for issues that came up during stress-testing:
- Avoid infinite request-response loop during bootstrapping by ignoring duplicate redirect responses - caused massive memory leaks.
- Fix accidentally rejecting valid bootstrap responses on relocation - caused a node relocated to a different section to get stuck.
- Fix accidentally invalidating the signature of resent bounced Sync messages - caused nodes to sometimes not be properly updated about the state of their section.
- Notify relocated Elders that they have been voted out of their original section - not doing so caused them to never progress their relocation.
- Make sure the sibling section info is trusted after a split - not doing so caused nodes to forget who the members of their sibling section were right after the split, leaving them unable to contact them. This in turn caused some nodes to get stuck during bootstrapping.
- Allow multiple pending DKG outcomes - not doing so sometimes caused nodes to forget their secret key shares and thus become unable to sign section messages.
- Use a unique key for DKG sessions that have the same participants - not doing so sometimes caused the DKG outcome to become corrupted and so the nodes were unable to sign section messages.
After these fixes the internal networks we are running look much more stable. With this, we now consider routing to be ready for a testnet!
Community & Marketing
The first of our open house community Safe Chats was on Friday, and what a treat! While the agenda was centred around the topic of community marketing of the Network, with some excellent presentations from @sotros25 and @m3data, the discussions were broader than that as you can imagine.
Not only was it great to see some faces (although cameras were optional!), but the format also seems to be shaping up to be an important environment for collaboration and hatching big plans!
If you haven’t already, you can catch up with the conversation here:
We’ll undoubtedly be running more of these, so stay tuned and we’ll let you know when to keep your diaries clear.
@jimcollinson is running an AMA over on Reddit:
I’m a UX Designer helping to build the new Internet: Ask Me Anything!
He’ll be answering as many questions as he can, and saving some of the most asked (and juiciest) ones for a YouTube Q&A, to make them accessible, easy to share, and hopefully longer-lived.
Please do join in!
Safe Network App UX
We’ve been improving the onboarding process for people creating a Safe on the Network for the first time (which we gave you a sneak peek of a few months back), and we’re now applying this more conversational approach for first-timers to other areas of the app. Not extensively, but where it might help nudge people in the right direction from ‘no content’ states.
One such place is the Invite creation utility, where existing users can help get a friend on the Safe Network.
Working to simplify this flow has also led us to look at the invite process overall, and shape it up somewhat.
If you recall, we’d previously planned to allow clients to automatically top up or debit invites to account for inflation/deflation of the tokens. This had a few drawbacks: it would initially only work while the client was open, and it added complexity to flows where auto top-up was enabled but the wallet had run out of funds to support it.
In short, what was intended to make life simpler could work well on the happy path, but be a bit cumbersome off it.
At the same time, we’d also deferred another useful invite feature until after MVE: the ability to add additional funds to an invite when gifting it to a friend.
But no longer! We can make the inflation compensation more understandable and robust, but a little more manual, if we just go right ahead and build manual top-ups.
So it’s a bit of a two-for-one. You’ll get more understandable invite management and you’ll be able to add as many tokens as you like to one.
Help a friend get on the Network, and pay them back for lunch, all in one go.
Feel free to reply below with links to translations of this dev update and moderators will add them here:
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!