The last few weeks have been very much about peeling back the layers of the network and ensuring that what lies beneath is stable. This week we’ll talk about how we’re hoping to integrate this work and what it should mean for us going forward.
Mostafa is deep into consensus mechanisms and how we might apply them to some of the workings of the Safe Network, including the membership issues we touch on below.
@davidrusu and @dirvine are heads down in this stuff too, working through the edge cases and making sure they resolve with our planned mechanisms.
@bochaco is still working through messaging implementation and connectivity bugs, together with @joshuef and @qi_ma. The working branch we’ve been maintaining ahead of main is looking good.
Then there’s the issue of what precisely happens when a GET request is made. What needs to be signed, and what minimum info needs to be sent with the chunk so the adult can be identified, monitored and punished if it fails? @ansleme and @oetyng are mostly looking at this now.
And on the testing front @chriso is rejigging the self-hosted runner for GitHub Actions CI/CD to simplify the fork we’ve been using.
Next Steps on Stability
Recently we’ve been pushing hard to stabilise the lower layers of the network (as we saw last week). We’ve been simplifying connection management and removing several layers of “retry” code from the sn_node layers. These were obscuring infrequent bugs, which became more frequent over the course of longer-running testnets and many tests. We’ve been working on this in the connection management and node storage layers, as well as in membership.
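To illustrate why stacked retries can hide infrequent bugs, here is a minimal sketch. It is not taken from sn_node; the function names `with_retry` and `with_counted_retry` are hypothetical. The first helper reports only the final outcome, so an operation that fails twice and then succeeds looks perfectly healthy. The second keeps a failure count that a test could assert on.

```rust
/// Hypothetical helper, not sn_node code: calls `op` at least once and at most
/// `max_attempts` times, returning only the final outcome.
/// Earlier failures are silently discarded, which is how infrequent bugs get hidden.
fn with_retry<T, E>(max_attempts: usize, mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut last = op();
    for _ in 1..max_attempts {
        if last.is_ok() {
            break;
        }
        last = op();
    }
    last
}

/// Same loop, but also returns how many attempts failed,
/// so the caller can surface intermittent failures instead of masking them.
fn with_counted_retry<T, E>(
    max_attempts: usize,
    mut op: impl FnMut() -> Result<T, E>,
) -> (Result<T, E>, usize) {
    let mut failures = 0;
    let mut last = op();
    for _ in 1..max_attempts {
        if last.is_ok() {
            break;
        }
        failures += 1;
        last = op();
    }
    // Count the final attempt too if it also failed.
    if last.is_err() {
        failures += 1;
    }
    (last, failures)
}
```

With the first helper a flaky operation is indistinguishable from a reliable one. Removing the retry, or at least counting the failures, makes the underlying problem visible to tests.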
As we’ve focused in on this instability, we’ve moved to a simpler CI setup. This runs tests in sequence, but much more consistently, apart from a few issues which crop up on main. And this is with a much simpler connection management setup and a lot of the ‘retry’ layer removed.
We’ve not yet put this through the testnet wringer, as we’re still trying to clip the remaining issues: membership stalls (a fix is being tested for this); occasionally slow API tests, which appear to be related to membership getting out of sync; and very occasional DBC test failures.
We’re hoping to merge our working branch into main very soon. Once we have that we’ll be looking to lock in these stability gains with some further CI improvements.
We’ll be adding tests to check all adult and elder paths (re-enabling split tests, data balancing on churn, etc.), and checking that each and every adult who should store data does store data. This should eventually allow us to move to a simpler happy path setup for client comms (i.e. communicating with only one elder and awaiting an ACK before attempting verification), which will further reduce network load.
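As a rough idea of what a "does every adult who should store data actually store it" check might look like, here is a minimal sketch. Everything in it is an assumption for illustration: `AdultId`, `ChunkId` and `missing_holders` are hypothetical names, and plain integers stand in for the network's real name and address types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical identifiers; the real network uses its own name and address types.
type AdultId = u32;
type ChunkId = u32;

/// For each chunk, returns the adults that were expected to hold it but do not.
/// Chunks with no missing holders are omitted from the result.
fn missing_holders(
    expected: &BTreeMap<ChunkId, BTreeSet<AdultId>>,
    stored: &BTreeMap<AdultId, BTreeSet<ChunkId>>,
) -> BTreeMap<ChunkId, BTreeSet<AdultId>> {
    let mut missing = BTreeMap::new();
    for (chunk, adults) in expected {
        // An adult counts as absent if it is unknown or does not list this chunk.
        let absent: BTreeSet<AdultId> = adults
            .iter()
            .copied()
            .filter(|adult| !stored.get(adult).map_or(false, |chunks| chunks.contains(chunk)))
            .collect();
        if !absent.is_empty() {
            missing.insert(*chunk, absent);
        }
    }
    missing
}
```

A test could build `expected` from whatever rule assigns chunks to adults, gather `stored` from each adult after an upload or a churn event, and fail if the returned map is not empty.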
Once we have that in place, we’ll be looking to force all PRs through an adjusted CI workflow. We’ll be streamlining tests to run only on Linux machines initially (as those are fastest), with the full suite running only when the PR has been reviewed (via BORS, our CI robot).
This should hopefully speed up the feedback loop on our PRs, with BORS providing the CI meat, and testing across all the platforms once the PR has been reviewed.
And while that alone is not a million miles from what we have now, CI should be more stable and more useful (BORS manages pending PRs and rebases automagically, as we may have mentioned here before). We’ll then be adding a full droplet testnet to the BORS flow, which will mean that main will only contain code that has passed every test we have, on every platform, in all networking conditions!
With main reliable, tested and releasable, we’ll be much better placed for working on improvements, benchmarks and new features without slipping on the stability front.
Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian; German; Spanish; French; Bulgarian
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!