Launch of a community safe network

Yes, I have 9 vaults, all rented at Hetzner.

I hope the community isn’t spending too much. Perhaps there’s a way you guys can post what you’ve been hosting and be reimbursed, at least partially, for the “Resource you’ve Proven” :stuck_out_tongue:

That would be a way to build your reputation as a trusted node hoster, and costs and methods could be experimented with and compared openly that way too.

Now, I have proofs for some node resurrections:

root@TFA--00:~# egrep "(Added|Dropped) 851f19" /var/log/safe_vault.log
I 19-01-27 16:35:54.866071 Node(dba170..()) Added 851f19.. to routing table.
I 19-02-01 02:33:17.518202 Node(dba170..(1)) Dropped 851f19.. from the routing table.
I 19-02-01 02:33:17.604263 Node(dba170..(1)) Added 851f19.. to routing table.

root@TFA--00:~# egrep "(Added|Dropped) f797fd" /var/log/safe_vault.log
I 19-01-27 16:35:55.769884 Node(dba170..(1)) Added f797fd.. to routing table.
I 19-01-29 08:56:49.318479 Node(dba170..()) Dropped f797fd.. from the routing table.
I 19-01-29 08:58:01.955662 Node(dba170..()) Added f797fd.. to routing table.
I 19-01-31 04:05:13.822275 Node(dba170..(1)) Dropped f797fd.. from the routing table.

root@TFA--00:~# egrep "(Added|Dropped) 9ee544" /var/log/safe_vault.log
I 19-01-27 16:35:54.882023 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:19:44.665987 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:20:04.890520 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:21:57.436329 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:21:57.641061 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:23:57.455836 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:23:58.091701 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:25:57.441377 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:25:57.634087 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:27:57.443735 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:27:57.670821 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:29:57.445960 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:29:57.663853 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:31:57.447273 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:31:57.646797 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:33:57.450513 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:33:57.889641 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:35:57.452795 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:35:57.669788 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:37:57.455345 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:37:57.667860 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:39:57.457802 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:39:57.658768 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:41:57.458559 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:41:57.652685 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:43:57.461075 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:43:57.680297 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:45:57.463317 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:45:57.670163 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:47:57.465882 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:47:57.689275 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:49:57.469312 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:49:57.686771 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:51:57.539279 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:51:58.501068 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:53:57.473946 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:53:57.683820 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:55:57.477416 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:55:57.695752 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:57:57.478013 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:57:57.642485 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 07:59:57.483055 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 07:59:57.686005 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:01:57.481699 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:01:57.709544 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:03:57.482926 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:03:57.743845 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:05:57.485483 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:05:57.730510 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:07:57.487051 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:07:57.668898 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:09:57.488542 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:09:57.674848 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:11:57.491280 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:11:57.686692 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:13:57.584062 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:13:58.215598 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:15:57.494983 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:15:57.678682 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:17:57.497199 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:17:57.691621 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:19:57.509096 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:19:57.840435 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:21:57.503661 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:21:57.721531 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:23:57.506248 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:23:57.690320 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:25:57.508494 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:25:57.732225 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:27:52.438767 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:27:52.681340 Node(dba170..()) Added 9ee544.. to routing table.
I 19-01-29 08:28:57.516443 Node(dba170..()) Dropped 9ee544.. from the routing table.
I 19-01-29 08:28:57.706231 Node(dba170..()) Added 9ee544.. to routing table.
I 19-02-04 01:25:04.248451 Node(dba170..(1)) Dropped 9ee544.. from the routing table.

The last one means that node 9ee544 has been Added and Dropped 37 times!
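
For anyone who wants to count these events without scrolling, here is a minimal Python sketch. It assumes the log line format shown above; `count_routing_events` and the sample lines are only illustrative and are not part of safe_vault.

```python
import re
from collections import Counter

# Matches the event and the 6-hex-digit peer name in lines such as:
# I 19-01-29 08:56:49.318479 Node(dba170..()) Dropped f797fd.. from the routing table.
EVENT_RE = re.compile(r"\b(Added|Dropped) ([0-9a-f]{6})\.\.")


def count_routing_events(lines):
    """Return a Counter keyed by (peer, event) for every Added/Dropped line."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            event, peer = match.groups()
            counts[(peer, event)] += 1
    return counts


# The four f797fd lines quoted above, used as sample input.
sample = [
    "I 19-01-27 16:35:55.769884 Node(dba170..(1)) Added f797fd.. to routing table.",
    "I 19-01-29 08:56:49.318479 Node(dba170..()) Dropped f797fd.. from the routing table.",
    "I 19-01-29 08:58:01.955662 Node(dba170..()) Added f797fd.. to routing table.",
    "I 19-01-31 04:05:13.822275 Node(dba170..(1)) Dropped f797fd.. from the routing table.",
]
counts = count_routing_events(sample)
print(counts[("f797fd", "Added")], counts[("f797fd", "Dropped")])
```

To run it on a real log, pass the open file object instead of `sample`, since it is iterated line by line in the same way.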

I don’t think accepting reconnecting nodes is good for the health of the network. There are few of them, but they can be a burden to manage (because of the subsequent data relocations).

I agree.

It won’t when node age is used. That will reduce a node’s age by 50% on reconnect and force it to relocate.
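
As a toy illustration of that halving, here is a small Python sketch. The integer age, the floor of 1 and the name `age_after_reconnect` are assumptions made for the example, not the actual routing implementation.

```python
def age_after_reconnect(age: int, min_age: int = 1) -> int:
    """Halve a node's age on reconnect (assumed integer age, assumed floor of min_age)."""
    return max(min_age, age // 2)


# A node that keeps reconnecting loses its accumulated age very quickly.
age = 16
history = [age]
while age > 1:
    age = age_after_reconnect(age)
    history.append(age)
print(history)
```

Under these assumptions, a node that flaps as often as 9ee544 did would stay at the minimum age and would keep being relocated.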

Hi, I’m really interested in running a node. I have a 60 Mbps connection, so I can download about 7 megabytes a second. How easy is it to set up a node? I’m not super technical with computers but know the basics. I have win7 and a pi3.

There was a glitch in Hetzner cloud about 3 hours ago that restarted all my vaults, and @happybeing’s vault as well, which means that 10 vaults out of 19 were restarted at once.

The network seems to have absorbed the impact: I have tested some of the safe sites listed here and they still work.

Sorry, they didn’t restart. A docker command at the service level made me believe that. But when I inspect each vault individually, no such event happened.

I don’t know if someone has already tested win7 or pi3. Try to follow instructions in the README file here and ask questions if you come across any error. Good luck!

I was just writing about this. The lowest number of Vaults I could find is 14 a day ago.

Not in this test, but yes in previous ones. There shouldn’t be any problem running a Vault on Windows 7 (64-bit).

This kinda merges reconnect (temporary connectivity loss to particular peers) and restart (node losing more than quorum connections, process restart, …) together. With Fleming I’d just not be expecting reconnects in the standard way to maintain static connections, given dynamic connection req and … and yep, in terms of restart, node ageing takes care of it with the age drop and also the relocation of the restarting node and all that.

You mean Raspberry Pi 3 I assume. Yes that is working for me, but the link where I got the ARM vault executables from doesn’t work anymore. You could try to compile it yourself: Launch of a community safe network.
Edit: you can now download it from this community network: safe://vault.arm/safe_vault-linux-arm-musl.zip.

Hey Will, I was actually thinking of automating a cloud node setup here, and let community members pay for them on a site (and get their name there and so on). That would let people contribute without having any technical skill whatsoever.

The thought hit me when I realised there are probably people out there who would want to contribute to this community network but, for one reason or another, cannot set it up themselves (at home or in the cloud).

I don’t know, would anyone be interested in that?
The cheapest nodes are 6 euros each now, so that’s about what one contribution would cost. I can pay for the site, db etc.; that will be around 50 EUR a month.
The small problem is that I don’t know how to do it without having the subscriptions on me, and then I need to take the payments. Technically it’s trivial with Stripe for example, but it would be cooler if there was some commonly owned Stripe and Hetzner account instead. It’s always a bit wonky to send funds to an individual in a decentralised movement :slight_smile:

I would need some help with the cloud init script though for configuring the node (since I’m an absolute Linux novice).

Anyway, maybe this community network is not that big of a deal to justify the hassle, but I thought it would be a fun thing to do.

Lots of logging today about “Not enough signatures in SignedMessage”.

Small part:

I 19-02-05 14:47:10.758514 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 4
I 19-02-05 14:47:10.808001 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 3
I 19-02-05 14:47:18.355509 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 4
I 19-02-05 14:47:18.391503 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 3
I 19-02-05 14:47:18.470713 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 4
I 19-02-05 14:47:18.494299 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 3
I 19-02-05 14:47:18.509981 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 4
I 19-02-05 14:47:18.510210 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 5
I 19-02-05 14:47:18.522210 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 4
I 19-02-05 14:47:18.526450 Node(41ed5a..(0)) - Indirect connections: 0, tunnelling for: 3
W 19-02-05 14:47:19.096829 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: b29afd..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, ce88bc.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.099552 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: d95463..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, a100d0.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.102182 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: a07762..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, 42eca6.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.104916 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: 82b80f..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, e6b607.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.107737 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: a5c77e..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, 32450e.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.110350 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: 94dea3..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, 2d8c18.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.113016 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: a5ddcf..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, f6d955.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.115894 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: b3cbc8..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, 5f9fd2.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.118551 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: c7f4dc..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, 0e08d9.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.121169 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: da7fdc..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, f2159e.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.127312 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: 918728..), dst: ManagedNode(name: c48a99..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, 239800.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.188441 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: b29afd..), dst: ManagedNode(name: 958753..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, f41ef0.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.191095 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: d95463..), dst: ManagedNode(name: 958753..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, d6182b.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.
W 19-02-05 14:47:19.197995 Node(41ed5a..(0)) Not enough signatures in SignedMessage { content: RoutingMessage { src: ClientManager(name: 94dea3..), dst: ManagedNode(name: 958753..), content: UserMessagePart { 1/1, priority: 1, cacheable: false, b6b60e.. } }, sending nodes: [SectionList { prefix: Prefix(1), pub_ids: {PublicId(name: 851f19..)} }], signatures: [PublicId(name: 851f19..)] }.

That is consensus being gathered: a quorum of nodes needs to agree on the event, so this is expected.

I am thinking about the following experiment: suddenly stop half of the nodes in the network. The aim is to reproduce what I thought had happened this morning, to check whether the network withstands such a shock.

Of course, I cannot stop all my TFA--nn nodes because they are the hard-coded contacts in the vault’s configuration file and so no new nodes could be recreated.

Here is what I intend to do:

  • Temporarily create 5 nodes to raise the total from the current 17 nodes to 22 nodes
  • Suddenly stop half of them (11 nodes): the 5 new nodes + @happybeing’s node + 5 TFA--nn nodes
  • This will leave 4 hard-coded contacts, plus the 7 non-docker vaults from the community
  • Check some safe sites
  • Suddenly recreate the stopped vaults
  • Check the safe sites again
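
The head count behind this plan can be double-checked with a few lines of Python. The numbers are the ones quoted in the list; the figure of 9 TFA--nn nodes is inferred from 17 current nodes minus @happybeing’s node and the 7 non-docker community vaults, and the variable names are only illustrative.

```python
# Numbers quoted in the plan above.
current_nodes = 17
temporary_nodes = 5
community_non_docker = 7
happybeing_nodes = 1

# Inferred, not stated in the list: the remaining current nodes are TFA--nn nodes.
tfa_nodes = current_nodes - happybeing_nodes - community_non_docker

total_nodes = current_nodes + temporary_nodes
stopped = {"temporary": temporary_nodes, "happybeing": happybeing_nodes, "TFA": 5}
stopped_total = sum(stopped.values())

remaining_contacts = tfa_nodes - stopped["TFA"]
remaining_total = total_nodes - stopped_total

print(total_nodes, stopped_total, remaining_contacts, remaining_total)
```

The remaining 11 nodes are exactly the 4 hard-coded contacts plus the 7 community vaults, so the figures in the plan are consistent with each other.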

As the network is very small, before stopping the nodes I can check that the remaining nodes are spread across the four quadrants and that each quadrant has fewer than 8 nodes. Under these conditions there shouldn’t be any lost chunks (no probabilistic computation about the birthday paradox is needed).

One thing I am unsure about is keeping only 4 hard-coded contacts alive during the experiment. @Viv, could this be a problem?

Ideally I shouldn’t involve the hard-coded contacts, but the problem is that I can synchronize simultaneous drops only on docker vaults, and the community created few of them (only @happybeing and @riddim did that).

I am not in a hurry to do it. Maybe next evening or next weekend or even later.

It should not survive that actually. Not without node restart and node age etc.

Can you elaborate on why it shouldn’t survive that?

Each chunk will be stored in at least 3 or 4 remaining vaults, so none should be lost.

There are many reasons, but the main one is that you would lose consensus on the network and all sections would stall. It will withstand small amounts of loss, but not 50%. This is where a lot of work in routing has gone or is going, and it should be improved by PARSEC and the chain moving to data chain.

Not sure I followed this part. Are you referring to “four quadrants” as four prefixes? Also not sure why you mention each having fewer than 8 nodes, just in terms of what significance that adds.

But yeah, basically what David mentioned: given a GRP_SIZE of 8, with any concurrent node loss greater than G-Q (8-5 = 3 nodes) at a time, consensus would collapse in the group responsible for a given chunk. Thus newly responsible nodes wouldn’t be able to reach the consensus that lets them know they’re responsible for that chunk and that the content is genuine and … Of course, similar applies to clients as well, to trust what they’re given. Interestingly, for ID (ImmutableData) a single copy is self-validatable, but for MD (MutableData) that scenario wouldn’t apply due to the lack of the Hash(Payload)==Name/Location expectation. We don’t specialise handling in that way for ID/MD in A2.

As to sections stalling, do note it’s all from the perspective of data and thereby vaults. Strictly section-wise, as in routing-wise, given a loss within G-Q of section size per section, sections can recover, but groups within them would have the problems of the previous point.
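
The G-Q rule can be written down as a short Python sketch. The group size of 8 and quorum of 5 come from the post above; the function names are hypothetical and this only models the counting argument, not the routing code itself.

```python
GROUP_SIZE = 8  # GRP_SIZE quoted above
QUORUM = 5      # Q, from the "8-5" in the post


def max_concurrent_loss(group_size: int = GROUP_SIZE, quorum: int = QUORUM) -> int:
    """Largest number of group members that can vanish at once while a quorum remains."""
    return group_size - quorum


def consensus_survives(lost: int, group_size: int = GROUP_SIZE, quorum: int = QUORUM) -> bool:
    """True if the members left after losing `lost` nodes can still form a quorum."""
    return group_size - lost >= quorum


for lost in range(0, 6):
    print(lost, consensus_survives(lost))
```

With these numbers, losing up to 3 members at once keeps a quorum, while losing 4 (half of the group) does not, which is why stopping 50% of the network is expected to stall groups.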

Sorry, I missed answering your question :slight_smile: It shouldn’t be a problem. A single hard-coded contact should be enough for bootstrapping outside the startup phase. It all depends on whether churn brings the system down to its startup phase (fewer than grp_size nodes). If it stays above that, then a single current node from a network that’s holding invariant should be able to get new peers accepted OK.

Ok, there need to be strictly more than 4 nodes holding each chunk. Is this the problem you are referring to?

Then how about a less ambitious goal: stopping 1/3 of the nodes? 1/3 is not a random number; it is the upper limit on the ratio of misbehaving nodes under which PARSEC keeps consensus. This could prove that the other existing routing parts are ready to manage this limit.

Before launching the experiment, I can check that:

  • each quadrant has fewer than 8 nodes (to ensure that a group is larger than or equal to a quadrant)
  • each quadrant will have strictly more than 4 surviving nodes (to ensure that each chunk is held at least 5 times)

Would these conditions ensure success of the experiment?

Notes:

  • There are not enough nodes to do it right now: the second condition means at least 20 surviving nodes, and the 1/3 ratio means 30 initial nodes. There are currently only 17 nodes and I am not willing to provide the 13 missing ones.
  • I think it can work with fewer nodes than that, but it’s hard to prove that each chunk has 5 surviving nodes storing it when these nodes don’t belong to the same quadrant.
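
The arithmetic in the first note can be reproduced as follows. This is only a sketch of the counting: 4 quadrants, 5 survivors per quadrant and a stopped fraction of 1/3 are taken from the text, and the variable names are illustrative.

```python
import math
from fractions import Fraction

QUADRANTS = 4
MIN_SURVIVORS_PER_QUADRANT = 5   # "strictly more than 4 surviving nodes"
STOPPED_FRACTION = Fraction(1, 3)
CURRENT_NODES = 17

# Second condition: every quadrant keeps at least 5 nodes.
min_surviving = QUADRANTS * MIN_SURVIVORS_PER_QUADRANT

# If 1/3 of the nodes are stopped, the survivors are 2/3 of the initial count.
min_initial = math.ceil(min_surviving / (1 - STOPPED_FRACTION))

missing = min_initial - CURRENT_NODES
print(min_surviving, min_initial, missing)
```

Fractions are used instead of floats so that dividing by 2/3 gives exactly 30 before rounding up.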

Yes, quadrants are:

  • prefix 00 (SW area in galaxy)
  • prefix 01 (NW area in galaxy)
  • prefix 11 (NE area in galaxy)
  • prefix 10 (SE area in galaxy)
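
A rough way to check the quadrant conditions, assuming a node’s quadrant is simply the first two bits of its name. That seems consistent with the logs in this thread, where dba170 reports prefix 1 and 41ed5a reports prefix 0. The names below are sample names taken from those logs, not the full node list, and `quadrant` and `quadrant_counts` are hypothetical helpers.

```python
from collections import Counter


def quadrant(name_hex: str) -> str:
    """Return the two leading bits of a hex-encoded node name, e.g. 'dba170' -> '11'."""
    first_nibble = int(name_hex[0], 16)
    return format(first_nibble >> 2, "02b")


def quadrant_counts(names):
    """Count nodes per two-bit prefix, keeping empty quadrants visible as 0."""
    counts = Counter({"00": 0, "01": 0, "10": 0, "11": 0})
    counts.update(quadrant(name) for name in names)
    return counts


# Sample names seen in the logs above (illustrative only).
sample_names = ["dba170", "851f19", "f797fd", "9ee544", "41ed5a", "c48a99", "958753"]
counts = quadrant_counts(sample_names)

# First condition from the list of checks: every quadrant has fewer than 8 nodes.
every_quadrant_below_group_size = all(count < 8 for count in counts.values())
print(dict(counts), every_quadrant_below_group_size)
```

The same counts, computed on the surviving nodes only, would give the second condition (strictly more than 4 survivors per quadrant).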