Update 05 January, 2023

Happy New Year one and all :tada: It’s great to be back - we’re determined to make this one really count :mechanical_arm: The team have all managed to squeeze in a nice bit of downtime and are raring to go. Over the break we’ve also been fixing a few things and pondering some possible improvements, including the optimum size of nodes and sections, and changes to node age and relocation. For the first of these, we’ve been running internal testnets with smaller nodes and larger sections. These have been going pretty well, and have revealed one or two other issues to do with comms and handover which we’ve worked through. The second is a design optimisation which will treat younger nodes differently from older ones. More detail on that one in the next couple of weeks.

General progress

A break in routine can be a great time to think about what can be done better and tie up loose ends. Here’s a summary of what the team has been up to since the last update.

Splitting the catch-all NodeMsg::Propose message into four distinct variants for clarity:

RequestHandover: when nodes finish DKG and request a handover to the current elders (node->elder)
SectionHandoverPromotion: when elders tell those nodes that they are promoted to elder (elder->node)
SectionSplitPromotion: when elders tell those nodes that they are promoted to elder on either side of a split (elder->node)
ProposeSectionState: when elders decide to kick nodes or accept new nodes within a section (elder->elder)

This distinction makes explicit who is signing and who is receiving/aggregating the signatures.
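
For illustration, the split might look something like the following sketch. The variant names come from the list above, but the payload types are made-up stand-ins rather than the real sn_node definitions:

```rust
// Illustrative stand-in payload types; the real ones in sn_node differ.
struct SectionAuthorityProvider;
struct NodeState;
struct Signature;
struct SignatureShare;

/// Sketch of the four variants replacing the old catch-all Propose.
enum NodeMsg {
    /// node -> elder: sent after finishing DKG to request a handover.
    RequestHandover(SectionAuthorityProvider),
    /// elder -> node: the recipients are promoted to elders.
    SectionHandoverPromotion(Signature),
    /// elder -> node: promoted to elders on either side of a split.
    SectionSplitPromotion(Signature, Signature),
    /// elder -> elder: vote to accept or remove a node in the section.
    ProposeSectionState(NodeState, SignatureShare),
}
```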

Chopping unnecessary messages
We fixed an expensive issue where AE messages were repeatedly verifying the SectionTree for every message, even when it had already been verified.
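
Conceptually the fix is just memoisation: verify a SectionTree once, remember that fact, and skip the work for subsequent messages carrying the same tree. A minimal sketch, with a hypothetical hash key standing in for however the tree is actually identified:

```rust
use std::collections::HashSet;

/// Remembers which SectionTrees have already passed verification.
struct SectionTreeCache {
    verified: HashSet<u64>, // hypothetical content-hash of a tree
}

impl SectionTreeCache {
    /// Runs the expensive `verify` only the first time a tree hash is seen.
    fn check(&mut self, tree_hash: u64, verify: impl FnOnce() -> bool) -> bool {
        if self.verified.contains(&tree_hash) {
            return true; // already verified once: skip the expensive walk
        }
        if verify() {
            self.verified.insert(tree_hash);
            return true;
        }
        false
    }
}
```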

Optimising AE
We’ve experimented with slowing down AE probing around splits to reduce the number of messages flying about, and also refactored global network section knowledge to target one random elder in three random sections every five minutes. Previously the default was all elders in three sections every 30 seconds. This resulted in greatly reduced CPU and memory use around splits, and the longer time should be sufficient for our needs: splits do not occur anything like every 30 seconds after all.
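
As a rough sketch of the new probe schedule (the section/elder representation here is invented; only the numbers come from the paragraph above):

```rust
use rand::seq::SliceRandom; // rand 0.8
use std::time::Duration;

/// Probe every five minutes (previously: every 30 seconds, to all elders).
const PROBE_INTERVAL: Duration = Duration::from_secs(5 * 60);

/// Pick one random elder from each of three random sections.
fn pick_probe_targets(sections: &[Vec<String>]) -> Vec<String> {
    let mut rng = rand::thread_rng();
    sections
        .choose_multiple(&mut rng, 3) // three random sections
        .filter_map(|elders| elders.choose(&mut rng).cloned()) // one elder each
        .collect()
}
```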

High memory use on waiting to join
We’ve had a go at a bug that’s been causing high memory use in nodes as they wait to join, as seen in recent testnets. We’ll be ready to put this through its paces on a community testnet shortly. We’ve also prevented caching of connection sessions for unjoined nodes.

Comms refactor
An ugly lock in the code for send-streams in sn_node has been refactored away. In addition, we’re testing “happy path” comms whereby clients can send a message to only one elder rather than all of them.

We also removed node tracking code and related locks, which were potential points of failure. We were seeing a flood of messages, triggered by failed sends and storage level changes, that was blocking nodes.

Changes to storage parameters
A margin has been added to storage capacity, whereby we expect a minimum amount of storage but nodes can store more. This should help alleviate “could not store” errors before we split. We’ve also set up elders to store data (previously they didn’t) and to use their local minimum storage capacity as an indicator of when to split, as discussed on the forum.

With this we now have the following flow:

  1. Nodes receive data.
  2. Every time we pass a certain level of storage used (for the first time), we allow new nodes to join.
    2.1. When a node asks to join, we see that joins are allowed, and elders start a vote to add the node.
  3. When a new node joins, joins are disabled and we check whether we’ve reached min_capacity.
    3.1. If we’ve not reached min_capacity, continue as normal.
    3.2. If we’ve reached min_capacity, clean up excess storage.
    3.3. If we are still at or above min_capacity, trigger the fail-safe allow_joins_until_split.
    3.4. When a node asks to join, we see that joins are allowed, and elders start a vote to add the node.
  4. The joining node relieves storage load.
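
A minimal sketch of that decision logic; min_capacity and allow_joins_until_split are from the flow above, everything else is an illustrative stand-in:

```rust
/// Illustrative node state for the join/storage flow sketched above.
struct NodeState {
    used_bytes: u64,
    min_capacity: u64,             // guaranteed minimum; nodes may store more
    joins_allowed: bool,
    allow_joins_until_split: bool, // fail-safe from step 3.3
}

impl NodeState {
    /// Step 3: runs when a new node has joined the section.
    fn on_node_joined(&mut self) {
        self.joins_allowed = false;
        if self.used_bytes < self.min_capacity {
            return; // 3.1: not at min_capacity, continue as normal
        }
        self.clean_up_excess_storage(); // 3.2
        if self.used_bytes >= self.min_capacity {
            self.allow_joins_until_split = true; // 3.3: fail-safe
        }
    }

    fn clean_up_excess_storage(&mut self) {
        // illustrative: drop chunks this node is no longer responsible for
    }
}
```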

Useful Links

Feel free to reply below with links to translations of this dev update and moderators will add them here:

:russia: Russian ; :germany: German ; :spain: Spanish ; :france: French; :bulgaria: Bulgarian

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

56 Likes

A new year brings new beginnings and fresh opportunities! Go go go MaidSafe Team!


Privacy. Security. Freedom

24 Likes

Silver…! And also…

16 Likes

Bronze!!! Happy New Year to All!!

15 Likes

Very nice update! Surprised there even was an update this week - figured the team had earned another week … but even more surprised to see that so much has been done this past week or so – wow!

Looks like the next testnet will be here soon and we can discover how these optimizations are going to play out.

VERY EXCITED!!!

Thank you Maidsafe team :smiley:

Cheers

17 Likes

It was a pleasant surprise to get an update, and I can’t wait for the upcoming testnets :slight_smile:

I can’t wait for the year ahead to see how everything comes together :clinking_glasses:

15 Likes

On Xmas eve I discovered Flutter, which is an interesting option for GUIs… https://flutter.dev
Trying to wrap that around sn_api surfaced a bug that would be interesting to understand the cause of: the Rust compiles via cargo without error, but cmake does see a Rust error, which seems odd. sn_api error -> wallet.rs:290:17 c.f. sn_dbc::PublicKey · Issue #1935 · maidsafe/safe_network · GitHub

Then tomorrow I lose my internet connection because the ISP is doing ‘essential’ maintenance again, something that seems to happen every few months.
Would elders lose their status at every loss of connection? Can they retain some kudos, or is there a risk of few nodes ever getting old enough?

13 Likes

A double surprise for me as I too wasn’t expecting an update after a supposed break, I’d also convinced myself yesterday that Thursday had passed without one.

I think the team can take more breaks and get there faster based on this. Maybe it’s the way to get away from family and other Christmas terrors: “I’m just going to check my email, back in a bit…”

Here’s to a great 2023. Best of luck to everyone! :pray:t2:

20 Likes

Nice surprise and lots accomplished over the holidays :grinning:.

Nobody else is asking so I guess that only I am not fully grasping the node join flow.

Say we have 11 nodes.
min_capacity is 1 GB.
The network knows that min_capacity is reached when elders store 1 GB and, for argument’s sake, data is evenly distributed.

Joins are now allowed.
A node joins.
Joins are disabled.

Where does this new node get its chunks from? All 11 previous nodes?

Perhaps I am reading the flow too literally.
Does one or multiple nodes join before joins are disabled?

I’ve probably got it all muddled up which is why I am not understanding.

14 Likes

It gets them from the nodes he is close to in xorspace. However, we added an ask-all-section-members message to make this even more concrete.

i.e. he finds the chunk names, works out which are close to him, and then asks for them from many machines in parallel.
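
As a toy illustration of the XOR-closeness test (real addresses are 256-bit XorNames and the fetches happen over the network; everything here is simplified to small integers):

```rust
/// Which chunk names is `node` among the `replicas` closest holders for?
/// Toy version: names and addresses are u64s instead of 256-bit XorNames.
fn chunks_for_node(node: u64, peers: &[u64], chunks: &[u64], replicas: usize) -> Vec<u64> {
    chunks
        .iter()
        .copied()
        .filter(|&chunk| {
            // count peers strictly closer to the chunk than we are
            let closer = peers.iter().filter(|&&p| (p ^ chunk) < (node ^ chunk)).count();
            closer < replicas // we sit inside the closest-`replicas` group
        })
        .collect()
}

fn main() {
    let peers = [0b0001, 0b0110, 0b1011, 0b1110];
    // the new node would then fetch each of these chunks from its current holders
    println!("{:?}", chunks_for_node(0b0010, &peers, &[0b0011, 0b1000], 2));
}
```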

15 Likes

I guess what I am struggling with is how it is decided that this join has got the network below min_capacity and that no more joins are required.

I’ll try to explain where I am stuck.

The elders have reached min_capacity, so joins are allowed. Do they need to drop below the example 1 GB after cleanup to prevent node joins from being allowed again?
(What if they were not the nodes that shed data? I think this is where I have it wrong.)

8 Likes

They should keep adding nodes until their storage drops. New nodes should already hold all the data they are supposed to, and on a join the other nodes can check what data they are no longer primary for. It may take several nodes being added to bring average storage down, if that makes sense.

i.e.

  1. Add a node.
  2. Check your storage and, if it is still too high, propose adding another node. If enough elders are in agreement then we add another node and go back to 1.
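
In entirely illustrative Rust (the types and the load-shedding factor are made up; only the add-check-agree loop comes from the description above):

```rust
/// Stand-in for the elders' view of their section.
struct Section {
    avg_storage_ratio: f64, // fraction of min_capacity currently in use
}

impl Section {
    fn add_node(&mut self) {
        self.avg_storage_ratio *= 0.9; // made-up factor: a join sheds some load
    }
}

/// Keep admitting nodes one at a time until average storage drops enough.
fn balance_storage(section: &mut Section, elders_agree: impl Fn() -> bool) {
    loop {
        section.add_node(); // 1. add a node
        if section.avg_storage_ratio < 1.0 {
            break; // 2. storage has dropped: stop admitting
        }
        if !elders_agree() {
            break; // no elder agreement on another join: stop
        }
        // otherwise add another node and go back to 1
    }
}
```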
17 Likes

Yes.
This was the issue.
The flow read to me as though a single node joins and the problem is solved.
Got it, thanks!

14 Likes

Thanks so much to the entire Maidsafe team for all of your hard work! 2023 is getting off to a wonderful start.

12 Likes

Amazing work maidsafe!

10 Likes

Thx 4 the update Maidsafe devs

Happy NY 2 all

Looks like a new testnet will be coming soon

Keep hacking super ants

9 Likes

Thanks for the update. I am so glad and likewise astonished that you keep pushing. Keep up the good work!

13 Likes

Thank you for the heavy work, team MaidSafe! I’ve added the translations to the first post :dragon:


Privacy. Security. Freedom

13 Likes

Does this affect the minimum number of nodes required to start a “working” network?
As in, how many nodes need to join before we can put data?

Edit: it would appear so! Tested on an 8-node network.
Edit 2: make that a 4-node network - it now works!

This is pretty awesome, granted if you lose a node you are done for… but I love it!

13 Likes