Here are some of the main things to highlight since the last dev update:
- After working through some minor refactors and fixes, we are now in a position to update all our Rust crates to Tokio v1.
- We’ve revisited `sn_routing` this week to PR changes to the way agreement is reached, which now requires a supermajority (more than 2/3) instead of a simple majority (more than 1/2). We believe this is necessary to make the network resistant to certain types of attacks.
- We’ve decided to implement lazy messaging in `sn_node`, with work already underway. This was originally planned for post-testnet, but we’ve deemed it worthwhile to bring forward.
- @jimcollinson kicked off an AMA on Reddit, and right here on the forum, last week - there’s still time to get your questions in!
- @jimcollinson has created the first in what we believe will be a series of YouTube video responses to the bigger questions received in the AMA - you can watch it here.
- @dimitar has been working behind the scenes to help increase Safe Network awareness in India with a Facebook and Twitter ad campaign.
- Keep a regular eye on the Like This Tweet thread on the forum for some excellent guidance on how to help promote the Safe Network, and surrounding components, with a simple button click!
Last week, a long-awaited release of Quinn finally arrived with an important upgrade: Tokio v1. Until now, Quinn’s use of an older Tokio version prevented us from updating all our crates to Tokio v1 due to incompatible runtime versions, so we are now in the process of upgrading them all. With some minor refactors and fixes, we have all tests passing with the new Tokio version. The update also helped us identify a previously undiscovered issue that left streams open, ultimately stalling network communications once the upper limit was reached. The Quinn team promptly assisted us, the issue is now fixed, and `sn_routing`’s communications are working flawlessly again! We expect all our crates to be updated in the next few days.
This week we also added more examples to the `qp2p` crate, both to better demonstrate the use of the API and for stress-testing `qp2p` locally and on Digital Ocean.
In `sn_routing` this week, we decided to change the way agreement is reached: it now requires a supermajority (more than 2/3) instead of a simple majority (more than 1/2). This is necessary to make the network resistant to certain types of attacks. We also increased the number of elders in a section from 5 to 7, which means a section can lose up to two elders and remain functional. These changes are currently undergoing review and testing, and we expect them to be merged soon.
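To illustrate the arithmetic behind this change (a sketch only; the actual function names in `sn_routing` differ), here is how the two thresholds compare, and why 7 elders tolerate the loss of two:

```rust
/// Votes needed for a simple majority: strictly more than 1/2 of `elders`.
fn simple_majority(elders: usize) -> usize {
    elders / 2 + 1
}

/// Votes needed for a supermajority: strictly more than 2/3 of `elders`.
fn supermajority(elders: usize) -> usize {
    2 * elders / 3 + 1
}

fn main() {
    // Old parameters: 5 elders with a simple majority need 3 votes.
    assert_eq!(simple_majority(5), 3);
    // New parameters: 7 elders with a supermajority need 5 votes,
    // so a section that loses 2 elders still has exactly enough
    // (5 remaining) to reach agreement.
    assert_eq!(supermajority(7), 5);
    println!("7 elders: supermajority threshold = {}", supermajority(7));
}
```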
With the need to enable lazy messaging (see the subsection below), we’ve been looking at how best to achieve this in `sn_node`. We may be able to shim some small parts of it in, but we’re also considering a larger refactor to simplify things. It’s looking like with some of the `sn_node` code gone (essentially removing `Duties`) and messages parsed directly, we’ll end up with something we can error out of quite readily, while also probably dropping a lot of `sn_node`’s complexity. Initial efforts in this area have been pretty positive. We’re hopeful this won’t be that big a task, as the underlying logic should stay the same, but regardless we’re approaching it in parallel with more lightweight solutions so we’re hopefully not blocking anything.
Lazy messaging in `sn_node` would work by slightly increasing the size of messages sent between nodes so they include some extra information on the Network’s current state as seen from different observers in space and time. The alternative to this approach would be to poll continuously for changes. We firmly believe that the cost of the extra data per message is more than offset by the reduction in overall traffic compared to constant polling: even when the Network is going through a quiet period, polling would need to continue furiously in the background. Other approaches that halt/pause parts of the Network to reach agreement are fraught with many side effects and complex code, so those are off the table.
As a very brief overview: with lazy messaging, if a node receives a message and realises that the network state details in the incoming message differ from what it believed the state to be, it communicates further with the sending node to bring itself, or the sender, up to date with the correct network state; then the original message can be processed accordingly.
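A minimal sketch of that check, assuming a hypothetical `SectionState` summary carried in every message (all names here are illustrative, not the real `sn_node`/`sn_routing` types):

```rust
// Each message carries the sender's view of the section state; a
// mismatch triggers a catch-up exchange before normal processing.
#[derive(PartialEq)]
struct SectionState {
    // Hypothetical stand-in for the section-chain knowledge a message carries.
    chain_tip: u64,
}

enum Action {
    // Views agree: handle the message normally.
    Process,
    // We are behind the sender: ask it for the missing updates first.
    RequestUpdate,
    // The sender is behind: reply with our newer state so it can catch up.
    SendUpdate,
}

fn on_message(ours: &SectionState, theirs: &SectionState) -> Action {
    if ours == theirs {
        Action::Process
    } else if ours.chain_tip < theirs.chain_tip {
        Action::RequestUpdate
    } else {
        Action::SendUpdate
    }
}

fn main() {
    let ours = SectionState { chain_tip: 7 };
    assert!(matches!(on_message(&ours, &SectionState { chain_tip: 7 }), Action::Process));
    assert!(matches!(on_message(&ours, &SectionState { chain_tip: 9 }), Action::RequestUpdate));
    assert!(matches!(on_message(&ours, &SectionState { chain_tip: 3 }), Action::SendUpdate));
    println!("lazy-messaging checks passed");
}
```

No background polling loop is needed here: state only flows when a message reveals a divergence.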
Lazy messaging has been implemented in `sn_routing` for some time now, proving to be effective and reliable. In `sn_node` we’ve faced some challenges over the last couple of weeks trying to bring nodes up to date with the latest network state changes (after churn, promotion, demotion, etc.). For example, when a section splits, we’ve struggled to bring the new Elders of the respective sibling sections up to speed with the parent section’s Elders, while still coping with the typical expected traffic and network events.
We’ve had lazy messaging marked as the end goal for a while, but until now considered it a post-testnet optimisation. However, the last couple of weeks have felt like an uphill struggle, with us applying a seemingly never-ending amount of temporary plaster over cracks that we know lazy messaging comprehensively resolves. So we’ve decided to waste no more time here and move to using this pattern in `sn_node` to update the network state on each node over time, as and when required.
As mentioned, this is extra work that we didn’t think would be required pre-testnet, but we don’t consider it so large that it delays a testnet too significantly. And it’s another item checked off our to-do list on the road to beta & launch.
Work has begun on a bounded counter type that will allow us to cap operations on a data type. This is a valuable component that allows mutable data to grow continually while bounding it into compartments. This means we can pay to upload the data type, then freely add to it up to a bounded point, at which time the user pays again and another set of operation space is freed up. Splitting up growable data in this way spreads the load across the network. This is an optimisation, but an important one: required for launch, though not for Fleming. It’s nice to see these nips and tucks in the pipeline now.
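The pay-then-append cycle can be sketched like this (a toy model only; type and method names are made up for illustration, and the real counter will be a CRDT):

```rust
// A paid upload grants a quota of operations; once the quota is spent,
// further appends are refused until the user pays for another batch.
struct BoundedOps {
    used: u64,
    cap: u64,
}

impl BoundedOps {
    fn new(initial_cap: u64) -> Self {
        Self { used: 0, cap: initial_cap }
    }

    /// Consume one operation, or report that the quota is exhausted.
    fn try_op(&mut self) -> Result<(), &'static str> {
        if self.used < self.cap {
            self.used += 1;
            Ok(())
        } else {
            Err("quota exhausted: a further payment frees up more operations")
        }
    }

    /// A further payment raises the cap, freeing another batch of operations.
    fn pay(&mut self, extra_ops: u64) {
        self.cap += extra_ops;
    }
}

fn main() {
    let mut ops = BoundedOps::new(2);
    assert!(ops.try_op().is_ok());
    assert!(ops.try_op().is_ok());
    assert!(ops.try_op().is_err()); // bounded point reached
    ops.pay(1); // paying again frees more operation space
    assert!(ops.try_op().is_ok());
    println!("bounded counter behaves as described");
}
```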
We’ve also been making changes to the Sequence data type as we migrate to the new `MerkleReg` CRDT, which will allow us to keep the full history of appends, along with allowing forks of the data if the end user or application generates them, whether on purpose or not. This affects how our API is presented to the end user: depending on the use case, there could be forks/branches of appends that the user may want not only to traverse but also to resolve. We are therefore also adjusting our client-side API to present all these capabilities to the user without making it more complicated, and without removing the flexibility and power this new CRDT data type brings for applications and developers.
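To make the fork behaviour concrete, here is a hand-rolled toy register (this is NOT the real `MerkleReg` API, just an illustration of the idea): each append names its parent entries, two concurrent appends sharing a parent create a fork, and a later append naming both branch tips resolves it.

```rust
struct Entry {
    value: &'static str,
    parents: Vec<usize>, // indices of parent entries (stand-in for Merkle hashes)
}

struct Register {
    entries: Vec<Entry>,
}

impl Register {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Append `value` on top of the given parent entries; returns its index.
    fn append(&mut self, value: &'static str, parents: Vec<usize>) -> usize {
        self.entries.push(Entry { value, parents });
        self.entries.len() - 1
    }

    /// Current branch tips: entries that no other entry lists as a parent.
    fn tips(&self) -> Vec<usize> {
        (0..self.entries.len())
            .filter(|i| !self.entries.iter().any(|e| e.parents.contains(i)))
            .collect()
    }
}

fn main() {
    let mut reg = Register::new();
    let root = reg.append("v0", vec![]);
    // Two clients append concurrently on top of v0: a fork appears,
    // and an app may want to traverse or resolve both branches.
    let a = reg.append("v1a", vec![root]);
    let b = reg.append("v1b", vec![root]);
    assert_eq!(reg.tips(), vec![a, b]);
    // Resolving the fork: a new append that names both tips as parents.
    reg.append("v2", vec![a, b]);
    assert_eq!(reg.tips().len(), 1);
    for &t in &reg.tips() {
        println!("current value: {}", reg.entries[t].value);
    }
}
```

The full entry history is retained throughout, which is what lets clients traverse past branches rather than only seeing the latest value.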
The original aim was to round up a brace of juicy questions and answer them in a Q&A session on YouTube. There’s been a splendid response, with the answers to many questions turning out to be far more detailed than a single Q&A session could possibly handle. So here is the first video response in what will likely become a series:
Please do like, share, disperse, distribute, disseminate, discuss and enjoy in whichever way you see fit. And feel free to keep the questions coming.
Feel free to reply below with links to translations of this dev update and moderators will add them here:
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!