Consensus 28/32 or 8/8?


#121

So you are only removing good nodes and then only adding bad nodes? I doubt that is real life stuff.

What was the purpose of this? Just to see when sections become taken over?

Actually, starting with an initial network of 10K or 20K nodes would be a reasonable start, and then doing the adding at 78/12K or 68/12K would be closer.

Do you do section splitting, or just estimate the number of sections according to the number of nodes?

Honestly I suspect we will end up needing a network of at least 100K good nodes before going live. Maybe even more; once the network is large enough and working right, then go live.

EDIT: I am definitely missing something. How do 313 bad actor elders in a 2K section network end up with a section having 1/3 bad actor elders?


#122

The problem is there isn’t really a way to verify what is a good node until the ‘turn evil now’ switch is pressed.


#123

If MaidSafe started them all, and then bit by bit turned them off as network grew, we would have it.

Mm, real life stuff is also going to be various things during some time periods. Will be quite an undertaking to “cover it all”.


#124

Yes, the last numbers seem too strange.

Yes, and that is why I insist on the idea of creating a support club, with sufficient funds, able to control a considerable percentage of the network in the early stages.


#125

There will be ways. For example, MaidSafe themselves could run that many; unlikely, but there are enough trusted people to help out. There will be other ways for a controlled start-up. We just need to devise it.

Yea, but good nodes decreasing and bad nodes added at 10 times that rate. If that was happening then SAFE deserves to die. It means no good people wanting to add nodes and actually leaving. And only baddies adding nodes. Well obviously it becomes their network doesn’t it.


#126

The young and small network would need to play by different rules than the established one.

I think it is a misconception that the two must follow the same rules. If we don’t try to force the young network to play by the same rules, instead incubate it, we would avoid some, maybe even a lot of this complexity.


#127

No, everything is normal.

I suppose that @mav kept the current value of 8 for the section size, which makes a disruption threshold of 3 elders. The disruption occurs with around 2000 sections, and the birthday paradox says the attacker needs 264 elders to have a 50% probability of disrupting one section:

qbirthday(prob = 0.5, classes = 2000, coincident = 3)
[1] 264

This value is in the range of values observed by @mav and so his experiment confirms the birthday paradox theory.

Section size needs to be increased to improve the situation and Maidsafe already announced this. They didn’t specify what will be the new value and I took 31 and then 64 in my simulations above.

2000 sections with 8 elders in each make a total of 16000 elders. And 264 elders make a 1.65% fraction of that.
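For readers without R at hand, here is a rough Python cross-check of the figure above. It is a sketch, not a port of `qbirthday`: it uses a Poisson approximation and assumes every attacker elder lands in a uniformly random section. The function names are made up for this example. Under those assumptions the result should land close to the 264 quoted above.

```python
import math

def p_any_section_hit(n_attackers: int, sections: int, threshold: int) -> float:
    """Approximate chance that at least one section holds `threshold` or more
    attacker elders. Assumption of this sketch: attacker elders are spread
    uniformly at random, so the count per section is roughly Poisson."""
    lam = n_attackers / sections
    # Probability that one given section holds fewer than `threshold` attackers.
    p_below = math.exp(-lam) * sum(lam**k / math.factorial(k) for k in range(threshold))
    p_one = 1.0 - p_below
    return 1.0 - (1.0 - p_one) ** sections

def attackers_needed(prob: float, sections: int, threshold: int) -> int:
    """Smallest number of attacker elders that reaches probability `prob`."""
    n = threshold
    while p_any_section_hit(n, sections, threshold) < prob:
        n += 1
    return n

n = attackers_needed(0.5, sections=2000, threshold=3)
total_elders = 2000 * 8
print(n, f"= {100 * n / total_elders:.2f}% of {total_elders} elders")
```

The last line repeats the arithmetic above: the attacker's elders are counted against all 16000 elders in the network, not against the 8 in the section that gets disrupted.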


#128

I disagree Oo I seriously wouldn’t trust me and I wouldn’t trust you with this :wink: … (and I wouldn’t want to be trusted anyway …)

sure you need to consider the effects of having a large global network and there comes benefits with it (and the statistical fluctuations that come with size) … but when do you think this will be reality? how long do you plan on ‘supporting the network’ ? And when the network is global and a thing … Don’t you think the network might have new enemies that have some additional financial resources? Oo america pays more than 50 billion each year for their secret services … europe pays 80 billion each month for their financial system not to collapse …?

…I think it is wishful thinking that there is a large difference between what the network needs to be able to cope with later vs. in the early days … once it’s large and a big thing (and safecoin the world currency) it would be the largest treasure box that ever existed on this planet …

ps: changed the numbers in my estimations above from 11/32 and 22/32 for disrupt and control to 3/8 and 5/8 for disrupt and control … 2000 sections * 8 elders, 16000 nodes total as @tfa suggested … well …

being lucky enough to just cause trouble or even having a section of your own seems pretty manageable under those conditions :thinking:

pps: and monte carlo with 1000 iterations agrees as well

[not super surprising in the retrospective since it all represents the same mechanisms … but nonetheless beautiful to see all the numbers match so closely :open_mouth: ]
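A Monte Carlo check along these lines can be sketched in a few lines of Python. This is not the code used for the post above; it assumes attacker elders are dropped into sections uniformly at random and counts the runs where some section ends up with 3 or more of them. The function name and the seed are made up for this example.

```python
import random
from collections import Counter

def disrupted_fraction(attacker_elders: int, sections: int, threshold: int,
                       trials: int, seed: int = 1) -> float:
    """Fraction of trials in which at least one section receives
    `threshold` or more attacker elders (uniform random placement assumed)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        counts = Counter(rng.randrange(sections) for _ in range(attacker_elders))
        if counts and max(counts.values()) >= threshold:
            hits += 1
    return hits / trials

# 264 attacker elders, 2000 sections, disruption at 3 of 8 elders
print(disrupted_fraction(264, 2000, 3, trials=1000))
```

With 1000 iterations the estimate still wobbles by a few percentage points between seeds, so it can only be expected to agree with the birthday figure roughly.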


#129

Given that large sections improve security against flooding the network with bad nodes, what are the down sides of larger sections?

I can imagine what some may be, but it would be good to have definitive reasons why. After all, if large sections solve these sort of problems, there surely needs to be compelling reasons to keep them small?


#130

Just to add some more thoughts here :wink:

An important issue for very large sections is connectivity. A node cannot maintain much more than 100 active connections. If a node can disconnect for a period and reconnect, there is trouble: a web page could be created to “sell” your private key, and the attacker could control sections that way. So reconnects should be viewed with great suspicion. This is why we have constant connections and recursive routes as opposed to Kademlia’s iterative routes.

Another consideration is NAT traversal: connecting to many nodes takes up router ports, and more so when you are doing NAT traversal. So you can flood a router with “holes” etc.

There are more bits and pieces, but I hope this helps.


#131

holy moly … okay … this now is surprising to me :face_with_raised_eyebrow: … and someone should probably check it … seems with the increased number of sections the chance to have the majority of one specific other section is not super-high even in a smaller network :roll_eyes: :wink:

what I did here (for the sake of less connections) … i “chained 2 groups” (so it’s not only enough to have the majority in one group - but you need 5/8 in another group as well [that you can’t decide on and the network selects for you] where you’d need a majority too to do bad stuff …)
and then i chained 3 + 4 groups …

exactly same assumptions as before: 5/8 horrible things happen - 2 000 sections with 8 elders each:

exactly same assumptions as before: 5/8 horrible things happen - but this time 200 000 sections with 8 elders each:

ps:
the details of the calculations are on github

so there still would be a possible problem with disruption … but the control-part looks pretty fast pretty good with chaining …

pps: there is a rather huge error in the chained calculations! - the real results will look significantly less beautiful i think!


#132

Quick question, did you chain groups by requiring the 2nd group to be specific to the attacked group? (I mean the 2nd group was deterministic, i.e. the closest to the attacked group or similar) ?


#133

not specific “to the group” where the attacker happened to have a local majority - just one (randomly selected) other group

mathematically - i just multiplied the probability of having the [majority anywhere in the network] with the probability to have a majority in [any one specific of] the sections. As soon as you decide which one (!) section to look at it shouldn’t make a difference probability-wise from all i know … you can select a random group, the closest one or the most distant one - as soon as you point out one specific section and there is not a group of possibilities the attacker can choose from => the probability is fixed and very low (unless you managed to get a pretty high network share)

oh shit… i just realized … if an attacker can know in advance if he will succeed … then it needs to be considered that he might be in control of more than 1 group and so it doesn’t depend on only 1 other group but as many other groups as he controls … so the results will look differently in reality … (but not sure how much different … let me think about this a bit … that complicates the formula a bit … )
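The multiplication described above can be written down as a short Python sketch. It is not the calculation from the github repo. It assumes each elder seat is held by the attacker independently with probability equal to the attacker's network share (a binomial model), and all function names are made up here. It also keeps the simplification that the post itself flags as wrong: an attacker who controls several groups gets several chances.

```python
from math import comb

def p_majority(share: float, group: int = 8, need: int = 5) -> float:
    """P(attacker holds at least `need` of `group` elders in ONE specific section),
    assuming each seat is the attacker's independently with probability `share`."""
    return sum(comb(group, k) * share**k * (1 - share) ** (group - k)
               for k in range(need, group + 1))

def p_anywhere(share: float, sections: int, group: int = 8, need: int = 5) -> float:
    """P(attacker holds a majority in at least one of `sections` sections)."""
    return 1 - (1 - p_majority(share, group, need)) ** sections

def p_chained(share: float, sections: int, chain: int = 2,
              group: int = 8, need: int = 5) -> float:
    """Naive chaining: a majority anywhere, times a majority in each of the
    (chain - 1) further sections the network picks for the attacker."""
    return p_anywhere(share, sections, group, need) * p_majority(share, group, need) ** (chain - 1)

for chain in (1, 2, 3, 4):
    print(chain, p_chained(0.10, sections=2000, chain=chain))
```

Because it ignores attackers who already hold more than one section, this sketch gives lower numbers than a corrected formula would.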


#134

I thought that was the case. Chained sections would require deterministic groups to collude. So the attack is get any group (with only bday paradox this is easy in small sections) but then you need to get a deterministic group to fulfill the attack, if that makes sense.

Also checking the quorum of 29 in a group of 32 would be another interesting graph (we did tons of these a few years back, I think I still have the code somewhere, it’s C++).

Thanks again for the work here, it’s always good to remind ourselves of these issues and why consensus alone is not sybil resistant in networks with huge numbers of sections.


#135

:ok_hand: yapp - it’s a bit late now - but i will think about the expected value of sections and the conditional probabilities involved to get your chained section then :slight_smile: + report back when i have a bright moment

29/32 looks perfect without chaining (2000 sections)

and still good with 200k sections

edit/ps: (so yes maybe another option would be to order the events through parsec and then vote on one package of events democratically 29/32 on a direct message basis … might be worth considering as well … that would make everything pretty straight forward to analyze risk-wise … no algorithm to select another section as potential weakness … no conditional probabilities-stuff that needs to be considered …)

pps: 29/32 with 200 000 000 000 sections: (200 billion)


#136

I’ve done some explanation of this (initial results are repeated in the table below for convenience)

GROUP_SIZE = 8
Test            1      2      3      4      5  |  Avg
T Nodes     20297  67517  49797  67960  68381  | 54790
T Nodes (%)  16.9   40.3   33.2   40.5   40.6  |  34.3
Sections     1961   2233   2115   2206   2211  |  2145
T Elders       13    296    172    333    313  |   225
T Elders (%) 0.08   1.66   1.02   1.89   1.77  |  1.28

Test 1

This test had the attacker owning 17% of all nodes and 0.08% of all elders to disrupt the network. This is a surprisingly low number of attacking vaults to cause disruption.

The attacked section age distribution is shown below (* means attacker):

8 6 5* 5* 5 5 5* 5 | 5 5 5 5 4* 4 4 4 4 4* 4 4 3 3 3 3 3 3* 3* 3* 3 3* 3 3 3 3 3 3 1*

It’s quite a small section, only 37 vaults, compared to an average of 61 vaults.

The attacker got very lucky in resolving the tiebreaker for age.
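To put a rough number on "very lucky", the age line above can be parsed and the tiebreak treated as a uniform random draw. That is an assumption of this sketch only; the post does not say how the simulation actually resolves ties. The variable and function names are invented for the example.

```python
from math import comb

# Age line of the attacked section from Test 1 ("*" = attacker, elders left of "|").
SECTION = ("8 6 5* 5* 5 5 5* 5 | 5 5 5 5 4* 4 4 4 4 4* 4 4 "
           "3 3 3 3 3 3* 3* 3* 3 3* 3 3 3 3 3 3 1*")

def parse(section: str):
    """Split the line into (elders, others) as lists of (age, is_attacker)."""
    def vaults(part: str):
        return [(int(tok.rstrip("*")), tok.endswith("*")) for tok in part.split()]
    elders_part, others_part = section.split("|")
    return vaults(elders_part), vaults(others_part)

elders, others = parse(SECTION)
attacker_elders = sum(1 for _, bad in elders if bad)

# The youngest elder age is where the tiebreak happens.
cutoff = min(age for age, _ in elders)
tied = [v for v in elders + others if v[0] == cutoff]      # everyone at that age
slots = sum(1 for age, _ in elders if age == cutoff)       # elder seats they compete for
bad_tied = sum(1 for _, bad in tied if bad)

# Assumed uniform random tiebreak: chance that ALL tied attackers get a seat.
p_all_in = comb(len(tied) - bad_tied, slots - bad_tied) / comb(len(tied), slots)

print(len(elders) + len(others), "vaults,", attacker_elders, "attacker elders")
print(f"{len(tied)} vaults tied at age {cutoff} for {slots} seats, p = {p_all_in:.3f}")
```

Under that assumption the attacker needed all three of their age-5 vaults to win a seat, since none of their other vaults were old enough to count.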

For comparison, here are some non-attacked sections:

Youngest Section (least total age)
7 7 6 6 6 6 6 5 | 5 5 5 4 4 4 4 3 5

Average Section
7 6 6 5* 5 5 5 5 | 5 5 5 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3* 3 3* 3* 3 3* 3 3* 3 3... (31 more vaults)

Oldest Section (most total age)
11 11 10 9 9 7 7 7 | 7 7 7 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 4 4 4 4... (207 more vaults)

Test 4

Test 4 is interesting because it had the highest number of attacker elders before a disruption happened. Despite that, fewer than 2% of all elders were needed by the attacker to cause disruption, which is also fairly low.

Attacked section age distribution:

7 6 6 6* 6 6* 6 5* | 5 5 5 5 5 5 5 5 5 4 4* 4 4 4* 4 4 4 4 4 4 4 4 4 4 4 4 ... (49 more vaults)

It’s a slightly above-average-size section, 83 vaults, compared to an average size of 76 vaults.

The attacker was really lucky again with the age tiebreaker.

Youngest Section
8 8 6 6 6 6 6 6 | 5 4 4 3 3 3 3 1* 1 1* 1* 1* 

Average Section
6 6 6 6 6 6 6 5 | 5 5 5 5 5 5 5 5 4* 4 4 4 4 4 4 4* 4 4 4 4 4 4* 4 4 4 3* 3* ... (47 more vaults)

Oldest Section
7 7 7 7 6 6 6 6 | 6* 6 5 5 5 5 5 5 5 5 5 5 5 5 5 5* 5 5 4 4 4 4 4 4 4 4 4 4 4 ... (163 more vaults)

Changing to 31 elders instead of 8

GROUP_SIZE = 31
Test             1       2       3       4       5  |   Avg
T Nodes     134077  136771  155184  167490  121992  | 143103
T Nodes (%)   57.3    57.8    60.8    62.6    55.0  |   58.7
Sections      1027    1022    1041    1051    1023  |   1033
T Elders      3094    3128    3755    4203    2729  |   3382
T Elders (%)  9.72    9.87   11.64   12.90    8.61  |  10.55

Much closer to the expected calculated 12% for disruption from the birthday problem.
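The "T Elders (%)" rows in both tables can be recomputed from the other rows, which also shows what the percentage measures. The short Python check below takes the figures straight from the tables; the function name is made up for the example.

```python
def elder_share(attacker_elders: int, sections: int, group_size: int) -> float:
    """Attacker elders as a percentage of ALL elders in the network."""
    return 100 * attacker_elders / (sections * group_size)

# (T Elders, Sections) per test, copied from the two tables above.
runs_8 = [(13, 1961), (296, 2233), (172, 2115), (333, 2206), (313, 2211)]
runs_31 = [(3094, 1027), (3128, 1022), (3755, 1041), (4203, 1051), (2729, 1023)]

shares_8 = [elder_share(e, s, 8) for e, s in runs_8]
shares_31 = [elder_share(e, s, 31) for e, s in runs_31]

print([f"{x:.2f}" for x in shares_8], f"avg {sum(shares_8) / 5:.2f}")
print([f"{x:.2f}" for x in shares_31], f"avg {sum(shares_31) / 5:.2f}")
```

The share is network-wide. With 8 elders per section, a section is disrupted as soon as 3 of its own 8 elders belong to the attacker, even though the attacker holds under 2% of all elders.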


I think it’s possible. It’s imaginable that a single operator might suddenly try adding huge numbers of vaults for a whole day / week (considering the disallow rule means entry is competitive so the attacker ends up taking almost all new spots because they dominate all the queues). For a young network this would appear quite similar to pure attacking vaults being added to the network.

But on the other hand, I can agree with you it’s not real life stuff; hopefully the network participation is vigorous so there’s no monopoly on attacks, even if 99% of joining activity is considered to be ‘an attacker’ (from some vast number of attackers).

To me the unimaginable part is not “only baddies adding nodes” it’s only one baddie adding all bad nodes.

Say we end up with only 10 of the ‘worst’ companies ‘totally owning’ the network; google, microsoft, apple, facebook, amazon, twitter, ebay, wechat, baidu, reddit - their window to disrupt the network becomes very small due to competition for section membership (assuming they even want to cause disruption in the first place).

Perhaps another way to look at it is ‘how many non-malicious nodes do we need to prevent disruption’. And achieving that is probably easier than detecting and evicting malicious nodes. Given two approaches to the same problem 1) recruit +90% neutral nodes or 2) prevent +10% malicious nodes I’d say 1) is much easier to achieve than 2) even though it solves the same problem.

Yes, and merging, and age tiebreaking, and relocations, and best neighbour targeting, but not relocation tiebreaking (todo).

If we have 1 PB of ‘benevolent’ chunks (total) uploaded in the period before going ‘live’ spread over 100K vaults, that’s 10 GB per vault. I think there will be many more vaults than 100K very quickly once the network is live, mainly because 10 GB per vault is too high for manageable churn (at least that’s how I imagine it, open to debate on what the early network may look like but a bit out of scope for this topic perhaps).


#137

At this time the only relocation by the SAFE network is the initial relocation when the node joins, isn’t it? The node is given a “random” address by the section contacted initially. Or do multiple relocations occur?

I notice you look for the neighbour with the least nodes. I didn’t think this happened.

Now by doing that and adding basically only bad nodes, you give the attacker an unfair advantage, which might really explain why the attacker can sometimes disrupt a section with so few of the total elders. Basically you are funnelling the bad actor nodes into the most vulnerable section, which is why the percentage is much smaller than the birthday paradox suggests.

Yes I’ve always contended that if the attacker groups are not coordinated then their competing interests will be at odds with each other and end up helping the network because it will be rare for one group to ever control any section.

There will be 2 major reasons above all others for an attacker, and they are to gain safecoin (create all the section can and take them) OR to disrupt. Disruption is the easiest for unrelated groups to agree on, and for the safecoin one it’s unlikely any group will agree with the others (pure greed). Now there is one other major possibility, and that is the NSA style, where they want tracking of users in general or of users accessing certain chunks.

Another thing to consider for the Googles, Apples and Microsofts is that if they could be caught doing this then legally they are in deep shit for disrupting and/or taking control of a network they do not own. It only takes identifying data centres with a major portion of vaults, and/or a person in their organisation leaking the information. A leak is quite possible since more than one person in the organisation will be involved, and bragging is not beyond the realms of believability.


#138

It’s proposed all vaults will be relocated and aged on an exponentially increasing period.

RFC-0045 Node Ageing Overview - “This proposal relocates nodes using an exponentially increasing period between such relocations.”

The simulation implements the ideas from that rfc and from the topic about datachains where relocations were revisited. The current intention (to my knowledge) is to relocate to the most-in-need section, although I disagree with that idea and think it should be to a random section. Still not sure which way it’ll go.
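As a toy illustration of "exponentially increasing period", here is one possible reading in Python. It is not taken from the RFC text or from the simulation: it assumes a vault of age `a` is relocated (and aged by one) after waiting `2**a` churn events, and the function name is made up.

```python
def age_after(churn_events: int, start_age: int = 1) -> int:
    """Age of a vault after seeing `churn_events` events, ASSUMING a vault of
    age `a` waits 2**a events before its next relocation (which adds 1 to age)."""
    age = start_age
    remaining = churn_events
    while remaining >= 2 ** age:
        remaining -= 2 ** age   # events spent waiting at this age
        age += 1                # relocation happens, vault ages
    return age

for events in (0, 2, 6, 14, 30, 1000):
    print(events, age_after(events))
```

Under this reading age grows roughly with the base-2 logarithm of the number of events seen, so old vaults are rare and move seldom, while young vaults move often.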


#139

I would think that the infant criteria should be observed, so that even if a section is more needy it will be bypassed if it’s at max infants (currently 1).

That way even when adding a lot of bad actor nodes quickly the most vulnerable sections will not get swamped.
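A minimal sketch of that bypass rule in Python follows. This is not how the simulation or the real network does it. It assumes "infant" means a vault of age 1 and "most needy" means the neighbour with the fewest vaults; the `Section` class and `pick_target` function are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_INFANTS = 1  # the "currently 1" limit mentioned above

@dataclass
class Section:
    name: str
    ages: List[int]  # one entry per vault

    def infants(self) -> int:
        # Assumption of this sketch: an infant is a vault of age 1.
        return sum(1 for a in self.ages if a == 1)

def pick_target(neighbours: List[Section]) -> Optional[Section]:
    """Neediest (smallest) neighbour that is not already at max infants,
    or None if every neighbour is full."""
    eligible = [s for s in neighbours if s.infants() < MAX_INFANTS]
    return min(eligible, key=lambda s: len(s.ages), default=None)

neighbours = [
    Section("a", [5, 4, 1]),        # smallest, but already has an infant
    Section("b", [6, 5, 4, 3]),
    Section("c", [7, 6, 5, 4, 2]),
]
target = pick_target(neighbours)
print(target.name if target else "no section can take a new vault")
```

Here the smallest section is skipped because it already holds its one allowed infant, so a burst of new vaults cannot all pile into it.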

Do you honour the max infant == 1 rule?

How do you do ageing? Assume a set time? I am doing a quick sim program to simulate adding, splitting, merging, etc. to follow ageing and see if the results are similar to yours and the b’day paradox when involving actual ageing and splitting with infants restricted. This sim is only testing this issue and is not part of a complete sim.


#140

No. There’s some explanation in this datachains topic reply by me - “it was necessary to remove the disallow rule because the simulation got stuck at a single Complete Section which disallowed every new vault”

I just now tried enabling the disallow rule and found the network would not start. It needs churn to create ageing to create space for new vaults, but if vaults are only added and not removed (as my simulation does initially) there’s no chance for clearing the deadlock when all sections have vaults aged 1 (normally that deadlock is removed by a vault dropping out, causing ageing which hopefully clears space for new young vaults).
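The deadlock can be reproduced with a deliberately tiny model. The Python below is not the simulation code: it models a single section, treats an infant as a vault of age 1, and reduces a churn event to "every infant ages by one". All of those are assumptions of this sketch, as is the function name.

```python
MAX_INFANTS = 1  # disallow rule: refuse a new vault while an infant is present

def vaults_admitted(join_attempts: int, churn_every: int = 0) -> int:
    """How many vaults one section admits out of `join_attempts`.
    churn_every=0 means add-only (no churn, hence no ageing)."""
    ages = []
    for step in range(1, join_attempts + 1):
        if churn_every and step % churn_every == 0:
            # Churn event, simplified: infants age from 1 to 2.
            ages = [a + 1 if a == 1 else a for a in ages]
        if sum(1 for a in ages if a == 1) < MAX_INFANTS:
            ages.append(1)  # new vault joins as an infant
    return len(ages)

print(vaults_admitted(100))                  # add-only
print(vaults_admitted(100, churn_every=10))  # some churn
```

In the add-only run the first vault is admitted and every later attempt is refused, because nothing ever ages the lone infant. With occasional churn the section starts admitting vaults again.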

I think relocations will need a trigger beyond just vault joins / departures if the disallow rule is retained, eg relocate every Xth PUT event; see this response by dirvine: “These [other triggers] nudge the network to grow as there is a need (when trigger is put especially).”

But I feel this may be getting off topic…