It’s better than that: there are 7 Elders but circa 20 nodes in a section. So we could lose 2 from 20 in that case.
Is it also about catastrophic recovery? A blip can cause 90% loss, but if those 90% reconnect quickly, that is different from nodes that are lost forever.
Yes, reboot is a serious one: all nodes might end up moved, but the rule is that nodes must never delete data without republishing it first.
So a few different things at play, but recovery from lost consensus should come first I reckon.
Another thing to consider is that a higher replication factor means less distinct data per node, so nodes fill very fast with extra duplicates. If we take replication to its logical end, then all nodes hold all data. So the right number is less than that, but what is safe?
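To put rough numbers on that trade-off, here is a minimal sketch. It assumes copies are spread evenly across the nodes of a section, which real placement may not do, and the function name and figures are purely illustrative, not taken from the node code.

```python
def stored_per_node(unique_data_gb: float, replication_factor: int, node_count: int) -> float:
    """Average GB held by each node when every chunk is kept `replication_factor` times.

    Assumption: copies are spread evenly over `node_count` nodes.
    """
    if not 1 <= replication_factor <= node_count:
        raise ValueError("replication_factor must be between 1 and node_count")
    return unique_data_gb * replication_factor / node_count


# Hypothetical section: 20 nodes sharing 100 GB of unique data.
for copies in (1, 3, 7, 20):
    print(copies, "copies ->", stored_per_node(100, copies, 20), "GB per node")
```

With 20 copies in a 20-node section every node holds the full 100 GB, which is the "all nodes hold all data" end of the scale.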
So we can lose 2 Elders and be OK, so we can say we can lose 2 copies of data and also be Safe, so keep 3 copies?
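The "lose 2, keep 3" reasoning can be written down as a quick check. This is only a sketch under two assumptions that are not stated in this thread: that Elder agreement needs strictly more than two thirds of the Elders, and that a chunk is safe as long as at least one copy survives. Both function names are hypothetical.

```python
def elders_we_can_lose(elder_count: int) -> int:
    """Elders that can drop out while agreement is still reachable.

    Assumed rule: agreement needs strictly more than 2/3 of the Elders.
    """
    needed_for_agreement = elder_count * 2 // 3 + 1
    return elder_count - needed_for_agreement


def copies_needed(losses_to_survive: int) -> int:
    """Copies required so that one copy remains after `losses_to_survive` losses."""
    return losses_to_survive + 1


tolerated = elders_we_can_lose(7)
print(tolerated)                 # 2
print(copies_needed(tolerated))  # 3
```

This only matches data safety to Elder safety in the simplest sense. It says nothing about how quickly lost copies are re-replicated, which is one of the other factors in play.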
There’s a lot of tweaking gonna happen here for sure.
In terms of how much protection we get, it’s down to many factors, more copies, more nodes, more admin.
So our goal should be what is the replication factor that makes Data as Safe as the network? Again though a massive amount of things to consider. I will defo take time and describe this with all its side effects as soon as I can. It’s really interesting.
The first one I get. That’s when there’s not enough data on the network, a safety guard against sybil attacks, and you need to retry later.
~ # safe node join
...
~ # cat /root/.safe/node/local-node/sn_node.log
[sn_node] INFO 2021-04-13T10:55:18.014731066+00:00 [src/bin/sn_node.rs:104]
Running sn_node v0.35.5
=======================
[sn_node] ERROR 2021-04-13T10:55:18.338546196+00:00 [src/bin/sn_node.rs:110] Cannot start node due to error: Routing(TryJoinLater)
This next one I don’t understand.
$ safe files ls safe://hyryyry6mw7uiufjbwxapfht8fdy3p89jxyrq7iiem9fx8h8xmwr1s1bm9wnra
Error: Failed to connect: ConnectionError: Failed to connect to the SAFE Network: QuicP2p(UnresolvedPublicIp)
Next up. InsufficientBalance.
I thought we were not at this stage yet.
Can anyone hand me a safenet token?
$ safe files put ~/sur/Bòȥxr/ --recursive
Error: NetDataError: Failed to PUT Public Blob: Transfer(InsufficientBalance)
It may have but the connectivity bug will kill it anyway.
Full AE will also put paid to some, but we are on that.
Then a bunch of smaller stuff, UX fixes, more cmd lines, maybe a browser. These will all continue I think.
Then we will be testing section recovery and consensus recovery.
When this is unfailable it’s Fleming. It’s gonna be a wild ride this one, we need to strap in!!!
Seriously I think section recovery and consensus recovery may see a bit more of a delay than just a day or so, but we will see. (BTW I think this goes way beyond any network project out there when we do this)
I was trying v2(?) with >safe files put test.jpg
and got: thread 'main' has overflowed its stack
If it is my mistake, then never mind; if not, it is better to fix it.