Update 27 October, 2022

The last few weeks have very much been about peeling back the layers of the network and ensuring that what lies beneath is stable. This week we’ll talk about how we’re hoping to integrate this work and what it should mean for us going forward.

General progress

Mostafa is deep into consensus mechanisms and how we might apply them to certain workings of the Safe Network, including the membership issues we touch on below.

@davidrusu and @dirvine are heads down in this stuff too, working through the edge cases and making sure they resolve with our planned mechanisms.

@bochaco is still working through messaging implementation and connectivity bugs, together with @joshuef and @qi_ma. The working branch we’ve been maintaining ahead of main is looking good.

Then there’s the issue of what precisely happens when a GET request is made. What needs to be signed, and what minimum info needs to be sent with the chunk so the adult can be identified, monitored and punished if it fails? @ansleme and @oetyng are mostly looking at this now.
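To make that question concrete, here’s a minimal, self-contained sketch of the kind of minimum info a signed chunk response might carry so that a bad or missing chunk can be pinned on a specific adult. All the names here (SignedChunkResponse, NodeId, ChunkAddress) are hypothetical rather than the actual sn_node types, and the keyed hash is only a dependency-free stand-in for a real BLS/Ed25519 signature, which would of course be verified with the adult’s public key.

```rust
// A minimal, self-contained sketch (hypothetical types, not the actual
// sn_node ones) of the minimum info a GET response might carry so the
// serving adult can be identified and penalised if the chunk is bad.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type NodeId = u64;       // stand-in for an adult's network address / key
type ChunkAddress = u64; // stand-in for an XorName-style chunk address

/// What comes back with the chunk: the data, the address it claims to serve,
/// the adult that served it, and that adult's signature over (address, data).
struct SignedChunkResponse {
    address: ChunkAddress,
    data: Vec<u8>,
    adult: NodeId,
    signature: u64, // placeholder for a real BLS/Ed25519 signature
}

// Placeholder "signature": a keyed hash. A real scheme signs with the adult's
// private key and is verified with its public key.
fn sign(secret: u64, address: ChunkAddress, data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    (secret, address, data).hash(&mut h);
    h.finish()
}

fn verify(resp: &SignedChunkResponse, secret: u64) -> bool {
    sign(secret, resp.address, &resp.data) == resp.signature
}

fn main() {
    let adult_secret = 42; // in reality the adult's signing key
    let data = b"chunk bytes".to_vec();
    let resp = SignedChunkResponse {
        address: 7,
        adult: 1,
        signature: sign(adult_secret, 7, &data),
        data,
    };
    // The requester can now attribute a bad or missing chunk to `resp.adult`
    // and monitor or punish that adult accordingly.
    assert!(verify(&resp, adult_secret));
    println!("chunk from adult {} verified", resp.adult);
}
```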

And on the testing front @chriso is rejigging the self-hosted runner for GitHub Actions CI/CD to simplify the fork we’ve been using.

Next Steps on Stability

Recently we’ve been pushing hard to stabilise the lower layers of the network (as we saw last week). We’ve been simplifying connection management and removing several layers of “retry” code from the sn_client and sn_node layers, code which was obscuring infrequent bugs (bugs that became more frequent over the course of a longer-running testnet and many test runs). We’ve been working on this in the connection management and node storage layers, as well as in membership.
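For illustration only (the names below are hypothetical, not the actual sn_client code), the shape of the change is roughly this: rather than every layer wrapping the layer below in its own retry loop, a single bounded retry lives in one place, so underlying failures surface where they can be seen and fixed instead of being masked.

```rust
use std::time::Duration;

// Stand-in for one network round trip; here it fails on the first attempt.
fn send_once(attempt: u32) -> Result<String, String> {
    if attempt == 0 {
        Err("connection dropped".into())
    } else {
        Ok("chunk bytes".into())
    }
}

/// One retry policy, in one place, with a small bounded attempt count.
fn send_with_retry(max_attempts: u32) -> Result<String, String> {
    let mut last_err = String::new();
    for attempt in 0..max_attempts {
        match send_once(attempt) {
            Ok(ok) => return Ok(ok),
            Err(e) => {
                // Log and back off briefly rather than silently swallowing it.
                eprintln!("attempt {attempt} failed: {e}");
                last_err = e;
                std::thread::sleep(Duration::from_millis(50));
            }
        }
    }
    Err(last_err)
}

fn main() {
    println!("{:?}", send_with_retry(3));
}
```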

As we’ve focused in on this instability, we’ve moved to a simpler CI setup. This runs tests in sequence, but they pass much more consistently, aside from a few issues which crop up on main. And that’s with a much simpler connection management setup and a lot of the ‘retry’ layer removed.

We’ve not yet put this through the testnet wringer, as we’re still trying to clear the remaining issues: membership stalls (a fix for this is being tested); occasionally slow API tests, which appear to be related to membership getting out of sync; and very occasional DBC test failures.

We’re hoping to merge our working branch into main very soon. Once we have that we’ll be looking to lock in these stability gains with some further CI improvements.

More tests

We’ll be adding tests to check all adult and elder paths (re-enabling split tests, data balancing on churn, etc), and to check that each and every adult that should store data does store it. This should eventually allow us to move to a simpler happy-path setup for client comms (ie. communicating with only one elder… awaiting an ACK before attempting verification), which will further reduce network load.
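As a rough, self-contained sketch of that kind of check (hypothetical names, not the real sn_node test harness or XorName types), for each stored chunk we can work out which adults should hold it by XOR distance and assert that every one of them actually does:

```rust
use std::collections::{BTreeMap, BTreeSet};

type NodeId = u64;
type ChunkAddress = u64;

const REPLICATION_COUNT: usize = 4; // stand-in for the real data copy count

/// The adults expected to hold a chunk: the closest ones by XOR distance.
fn expected_holders(adults: &[NodeId], addr: ChunkAddress) -> Vec<NodeId> {
    let mut sorted = adults.to_vec();
    sorted.sort_by_key(|&a| a ^ addr);
    sorted.truncate(REPLICATION_COUNT);
    sorted
}

/// Assert that every adult that should store the chunk actually stores it.
fn check_all_holders(
    adults: &[NodeId],
    storage: &BTreeMap<NodeId, BTreeSet<ChunkAddress>>,
    addr: ChunkAddress,
) {
    for adult in expected_holders(adults, addr) {
        assert!(
            storage.get(&adult).map_or(false, |held| held.contains(&addr)),
            "adult {adult:#x} should store chunk {addr:#x} but does not"
        );
    }
}

fn main() {
    let adults: Vec<NodeId> = vec![0x1111, 0x2222, 0x3333, 0x4444, 0x5555];
    let chunk: ChunkAddress = 0x2200;

    // Simulated storage state: here every expected holder has the chunk.
    let mut storage: BTreeMap<NodeId, BTreeSet<ChunkAddress>> = BTreeMap::new();
    for adult in expected_holders(&adults, chunk) {
        storage.entry(adult).or_default().insert(chunk);
    }

    check_all_holders(&adults, &storage, chunk);
    println!("all expected adults hold chunk {chunk:#x}");
}
```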

Once we have that in place, we’ll be looking to force all PRs through an adjusted CI workflow. We’ll be streamlining tests to run only on Linux machines initially (as those are the fastest), with the full suite running only once the PR has been reviewed (via BORS, our CI robot).

This should hopefully speed up the feedback loop on our PRs, with BORS providing the CI meat, and testing across all the platforms once the PR has been reviewed.

And while that alone is not a million miles from what we have now, CI should be more stable and more useful (BORS manages pending PRs and rebases automagically, as we may have mentioned here before). We’ll then be adding a full droplet testnet to the BORS flow, which will mean that main will only contain code that has passed every test we have, on every platform, in all networking conditions!

And with main reliable, tested and releasable, we’ll be much better placed to work on improvements, benchmarks and new features without slipping on the stability front.


Useful Links

Feel free to reply below with links to translations of this dev update and moderators will add them here:

:russia: Russian; :germany: German; :spain: Spanish; :france: French; :bulgaria: Bulgarian

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

56 Likes

Tada! Two weeks running and on a hat trick…

24 Likes

I almost had you this week Mark!!!

Nice update as always. It will be interesting to see community tests again once the big changes are merged into main.

19 Likes

Third bronze is best :slight_smile:

15 Likes

I’m surprised to read this.
Let’s see how it will work.

16 Likes

What a positive update! Excited to read about the progress. Good luck with the bugs guys, go get 'em!

16 Likes

Nice update Ants! Thank you. I believe getting all the parts working together smoothly is one of the more difficult aspects of this project, so thank you for working so diligently to track down the bugs and find more elegant solutions to efficiently speed things along!

New stable diffusion image: “Multi Array Internet Disk – The Safe Network”

Cheers :wink:

16 Likes

Now you’re talking :slight_smile: I love reading that a lot of code layers have been removed and that the network is more stable.

Cheers @maidsafe!

14 Likes

Glad to hear this! Great update all in all! :beers:

13 Likes

@dirvine Could you explain how this was achieved and if it could be done again?

7 Likes

Thanks so much to the entire Maidsafe team for all of your hard work! :racehorse:

9 Likes

I’m guessing it is because the price of MAID went a lot higher after the fund raise. Hopefully they converted some to other crypto or cash at that time – as it’s lower again now.

7 Likes

This is what I was thinking as well. I’d just like to understand how it was possible from someone who knows. It is an amazing use of limited resources. Being able to double the runway is great news.

11 Likes

I do remember that several years ago, when he addressed the limited funding issue(s), David mentioned that if they did run out of funds, he would still press on with whatever resources he had available to get the network to the finish line. I don’t remember him disclosing any details beyond that. But he stated emphatically that he was irrevocably committed to the release and deployment of a working Safe Network. Nothing seems to have changed.

13 Likes

I could not tell you really. Just some care and attention where required. We do it all the time and will keep doing that.

17 Likes

Great work. Keep moving forward. I will wait patiently for the end result.

9 Likes

Hm… I have been complaining so much recently, that I feel bad doing it again, but there is one thing that bugs me, and that’s qp2p and how it is approached.

I mean all the creative and inventive “finding your way” stuff that happens in node, membership, consensus, DBC etc. is very understandable and the only way it can be done. But when it comes to qp2p stuff, I think it is not enough to get it to work somehow; it should be understood in detail. If not now, then before the official release anyway. And it seems to me a bit of a waste to direct your own talents to it, when the knowledge (or maybe some guidance) could be bought from outside. Or could it?

Maybe I don’t understand this well enough, but the way I see it with my limited understanding is that qp2p is nothing new, nothing inventive compared to other stuff, and somewhere out there are people who know it inside out. And still it seems to be a big source of problems for every other aspect of the project. Couldn’t you just lay a pile of money on the table and have someone sort it out in two hours? (I am drawing a caricature here, but you know… in principle?)

4 Likes

Probably. However, QUIC is new and qp2p is new. Like many parts of the system, we use 3rd party code. We cannot write everything, or indeed employ experts in every aspect. However, as we are OSS, it’s easy for folk to help out if they can. If they can’t, then they should feel great that we are.

12 Likes

Who has a great succinct safe elevator pitch?

Twittering with the guy who wrote Tendermint consensus, and he is asking for the elevator pitch.

8 Likes