If we refer only to deletion, yes, but chunks are immutable data and are self-verifying as their XOR direction is the hash of their content. So they are much more secure than MDs that could be modified without the user being able to detect it.
If the network is already creating copies - won’t it really enhance network’s durability and security to actually make it such that one section gets one chunk only? This way four chunks mean four different sections - greatly enhancing the %age of malicious nodes required to manipulate a piece of data?
If I understand correctly it does do what you are suggesting - i.e. the data is spread randomly across the machines that make up the network. The word ‘section’ is where the confusion lies. Here’s my understanding of how it works.
A section is a group of nodes that’s responsible for looking after data stored within a certain range of addresses on the network (for the sake of a simplified example, addresses beginning A or B). If the hash of a chunk of data begins with A or B then it will automatically have an address in the range A - B (because the hash is the same as the network address) so it will be managed by the section that looks after the range A - B, which will usually be around 10 - 20 nodes. A number of the nodes in this section will store copies of the chunks and most likely they’ll be widely geographically dispersed because XOR distance bears no relation to geographical distance - physical location within a section is randomised.
Just because the network address is fixed, that doesn’t mean that a chunk is only stored on a few machines for ever. The membership of this section is constantly changing, with nodes leaving and new ones joining. When a new node joins it gets copies of the data chunks whose hashes begin with A or B; when it leave it loses those chunks. So a node in London might be replaced in the section by one from Mumbai and the Mumbai node will now store all the A and B chunks. A while later that Mumbai node might leave to join another section with a new node located in Addis Ababa replacing it and getting all the A-B chunks, and so on. So the XOR addresses (and therefore the data stored at those addresses) looked after by a particular section stays the same, but the membership of the section, i.e. the machines storing copies of a particular chunk, is constantly changing.
The google attack is related to the EMP/solar flare attack thread:
“What about a catastrophic event that wipes out millions of nodes”
Random locations in disk space does not ensure geographic spread. It doesn’t matter if I have 8 redundant copies of my files if they are all in a couple 5000PB server farms in sunny CA when the big earthquake hits. After SafeCoin launches I think you will still see the same geographic concentration of servers to affluent regions unless there is something that be added to the code that allows one to know the general geographic region of a node. Perhaps there could be an incentive to become “beacon” nodes, that can determine a trusted geolocation. To preserve anonymity this geolocation does not need to be accurate… I would think that even knowing that a node was in a particular octant of the sphere/globe would help ensure the data is spread. This might be able to achieved through latency timings and known locations of servers on the internet backbone. Ensuring this geographic spread might help maintain a balance of power between a few big players and the churning masses…
These, more than unlikely, mega farm will be in an anti-earthquake building, with its own power systems and both terrestrial and satellite connections, so the possibility that your data will disappear forever is negligible and, thanks to the datachain, the data would recover.
So you want to change the complex routing crate based on an improbable assumption of an unlikely case of a very small possibility.
These what-if cases, of infinitesimal possibilities, begin to be very tiresome. I would be extremely happy that the network, with the current configuration, came out within a reasonable period of time.
Google attack stats to control one section
The google attack test should report how much work it takes to control a single section. This post uses a test which:
- establishes an honest network of a certain size and average age
- then repeatedly
- add attacking vaults
- add 1 normal vault for every 10 attacking vaults
- remove 1 normal vault for every 10 attacking vaults
- this attack pattern continues until the attacker controls a single section.
- report how many attacking vaults are required to control a single section
This test better represents the target of the attacker to reach their motive (ie to control consensus). The test includes ageing and elders etc.
Given an initial network size (col 1), how many vaults need to be added by an attacker until they control their first section (col 2-6)? And what percentage of the network does this represent (col 7, average of 2-6 as percent)?
Unexpectedly, the larger the network the smaller the proportion needed for success. Of course it’s still more total vaults to attack a larger network, but the proportion decreases as the network gets larger.
Netsize Test 1 2 3 4 5 | Avg Percent 1K 2599 1573 1964 2043 2340 | 67.4 10K 11449 19945 15004 19601 11616 | 60.0 100K 125776 98073 137955 103118 97975 | 52.7 1M 893224 983631 864215 974953 724229 | 46.9 10M - 7026273 5017128 6996670 7921890 | 40.0
The simulation is deterministic so these tests are repeatable (see commit
edf7aff). The simulations take optional flags for seed and netsize, eg
$ ./google_attack -netsize=1000 -seed=2 should give 1573 attacking vaults (row 1 test 2)
One caveat is the ‘disallow rule’ preventing multiple vaults aged 1 had to be disabled because it prevents small networks from growing.
What does it mean?
These results should be taken with a large amount of skepticism since there’s so many unknown factors. And the ageing mechanism is still being fine tuned by the maidsafe team.
The answer to ‘how many vaults are required to control consensus’ can not be any more precise than ‘it depends how big the network is’. Even when the network size is known, it constantly changes throughout the attack so the amount of resources required is very hard to know in advance. The table above at least gives some idea of the magnitudes at play as the network grows.
I’m surprised how difficult it is to make correct assumptions about this attack; most of my intuitions about it have been shown incorrect by the simulations.
It always needs to be remembered that controlling a single section doesn’t necessarily give the attacker much benefit (as happybeing pointed out above).
Also from a marketing perspective, it may be desirable to use percentage figures from before the attack. So for row 1 test 1 in the table above, 2599/3599 = 72% is the proportion of attacking nodes on the network after the attack. But it’s more impressive and marketable to use the proportion of attacking nodes required before the attack, ie 2599/1000 = 260% of the network required to perform a google attack. That’s almost triple the current network size! 260% is much more impressive than 72%, even though it’s actually the same thing.
Impact of Ageing
A typical attacked section has an age distribution like the table below (in this case taken from netsize=100K seed=4), with detail for the elders (oldest 8 vaults).
Attacked Section Age Distribution Age Attacker 7 false 5 true 5 true 5 false 5 true 5 false 5 true 5 true 5 false 5 false 5 false Age Attackers NonAttackers 4 4 6 3 15 5 2 5 2 1 1 0
Yes this is an interesting question.
The limits on how many vaults can be thrown at a network… trying to wander through the variables… they must store data, but the more vaults there are the less data each vault has to store. They must supply bandwidth for chunks, so at some point that becomes a bottleneck depending on the size of their pipe. They must be timely in their responses, so latency and bandwidth both matter there, probably also cpu for signature generation and verification. Additional labour and skill to modify the vault code for coordination between the attacker vaults… All these things cost money so ultimately budget will limit the total number of possible vaults any one entity can run and the duration they can run for. Maybe I missed some aspects?
Amended thoughts on google attack
I don’t think it’s possible to fully model a google attack because it depends on human behaviours of non-attacking participants (which are difficult to predict and model). Despite the imperfections there can be some attempt by big farmers to categorise themselves as a potential attacker vs merely a large-scale participant. Bystanders will probably only know of a google attack after it happens.
A successul google attack seems pretty far-fetched to me… but bitcoin mining ended up more centralized than people first imagined so I’m not too keen on making predictions!
Where to from here?
I’d like to look into the difficulty of an attacker controlling all copies of a chunk, ie not just controlling a single section but also controlling sections with redundant copies of a chunk (see RFC-0023 Naming Immutable Data Types although routing as currently coded only keeps chunks in a single section so I’m not sure what the plan here is). However, if there was some sort of ‘cache scavenging’ mechanism to recover chunks from temporary caches anywhere on the network, this would negate the attack.
Data loss seems to be Peter Todd’s main concern of a google attack, so for that to happen all copies of the chunk must be controlled by the attacker (which the simulation in this post models as currently implemented, but not as currently designed, pending implementation of backup and possibly also sacrificial chunks).
Excellent work again @mav and seems to be a more realistic simulation.
Considering work is being done on how nodes are added, (due to your finding the bugs) then the dev team can use these simulations to assist in making it even harder for an attacker.
I think that is enough to start with
Once the network is of a decent size (1M nodes) then the attacker has to have very large resources indeed to get control of one section.
And I’d agree with your amended thoughts.
@mav you do really great work, I like you are at the edge of the team and independent. That is important for sure, but please do post a btc (maid) address here. I think we need to be thanking you at least, but probably do a bit more than that. In any case I cannot thank you enough, these snippets are hugely helpful and extremely honest (brilliant). I am sure @nicklambert will sort out a few coins for you now though.
In terms of the post, I have not gone though it all yet in too much detail (working on a niggle/itch I have in our algorithm for secured off line chain, not a worry but there is an interesting take on it). Anyway I wonder if you can run this test (and also if @bart can do likewise in the sim) with infants created with age 4 and again restricted to one per section (all restrictions like this only happen when there is a complete group (i.e. group size elders all of age > 4)? Just for a small test if that is OK?
Thanks again chap, SAFE needs folk like you, the whole community is better for it.
Before adding this attack, could you just publish a working version of the simulation? To understand what I mean see this post.
Wonderful work, as usual. The numbers agree quite well with my rough mental calculations and reaffirm me that an attack of this kind is in practice, except in a small network, unrealizable.
In this type of attack there is also a fundamental factor that is the network effect created by the attacker himself. Malignant nodes need time to reach the necessary age, which is helping the network to grow and attract new users. The attacker enters a vicious circle that can end up helping the network without managing to control a group.
My impression is that more than a brute force attack we should consider the possibility of targeted attacks mixing a significant number of malignant nodes and waiting for certain matches to perform DDOS attacks against nodes of a specific group.
An exciting world …
I appreciate this but my approach to donations is outlined here: https://iancoleman.io/no-donations/
Since my projects are often the efforts of many people, most of which don’t appear in the obvious places like code or issues, donating to projects I work on poses significant operational difficulties.
In this case the ‘many people’ are you and the amazing maidsafe team who have given everyone such a fascinating project to explore, and I feel unable to accept donations when so many people are deserving.
I’ll look into this when I next get a chance. Does ‘infants created with age 4’ mean initializing vaults with age 4 instead of age 1? Maybe some description of the intention of the test will help clarify what this means.
There are only a couple of changes that should ‘fix’ the simulation, but these are not yet formalised in any docs anywhere
- relocate oldest possible instead of youngest possible, see choose_for_relocation
- temporarily disable (or modify) the disallow rule section.rs L198
I know this is just an indulgence, but could you outline your approach to these rough mental calculations? I’m always curious about the way people approach Fermi Problems. It’s a great skill to have.
I don’t think targeting specific sections is possible since a) vaults are relocated to a ‘random’ part of the network when they join and b) vaults must relocate at least 3 times to become an adult, and often much more than that to become an elder. The relocations are essentially random and cannot be targeted.
However, relocations are always to neighbouring sections so this does allow some better degree of targeting than purely random relocation. I’m not sure how much of an improvement it adds to the attack. Very interesting idea though!
Thanks, I will try that.
The simulation already does this because nodes_by_age returns nodes by descending age. The name is misleading, this function should be renamed in something like nodes_by_age_descending.
@mav Warms my heart to see this and the responses Ian. Well done and thank you man… again!
this is an interesting point too and one that comes up often in discussions as to what should be valid age triggers to allow just a gradual network growth rather than requiring churn to facilitate growth apart from startup conditions of requiring a complete grp and so on before this rule comes into play. Infants starting at age 4 idea was exactly new nodes starting at age 4 rather than 1, thereby not being relocated immediately at next churn event in joining a new grp and so on.
Data from this is very useful indeed and with the simulations can certainly help the guys structure the requirements here accordingly. Huge thanks for that I’m sure you’ll be hearing from them in this thread once they’re back next week (few of the guys still on their end of the year holidays)
Yes the thought is to still have short lived peers not cause hassle (we don’t record them, if they go off line we just delete their id etc.). Beginning at age 4 just means they need to hang around for approx 2^4 churn events (it will be more than that due to oldest first relocate). So we should still be ignoring people who just try a vault for a short period and give up (or try to mess with the network via switching off/on etc.) but do less work as a section by relocating the infants at all.
Hope this helps, thanks again Ian, love the donations approach BTW, very admirable, I am going to copy that approach myself. Makes a lot of sense.
Hey @mav. Thanks for all your efforts. I’ll get some BTC donations sent over to the EFF and FSF when we can do so in relative safety (re Meltdown and Spectre). While personal monetary donations are not for you (kudos for that) we would be happy to get some liquid donations your way. If this is of interest and your comfortable divulging your address please DM me. Cheers!
Ditto! You’ve earned an enormous amount of respect since you joined the community @mav. I hope I get to shake your hand and buy you a drink some day.
Great Skill??? Only too much time…
My logic tells me that:
.-If an attacker has the resources, and patience, to generate,for a certain time, more vaults than the sum of the entire network will end, inevitably, controlling a group. Node ageing delays the process but, in a permissionless network, it can not avoid it.
.-The proportion to control a group decreases, if the network is larger, because the attacker increases the chances of achieving it. The average percent begins with the consensus ratio and will be decreasing, as the network grows, until get near, but below, TotalVaults/2 or 50% with the actual parameters.
.-So the key is the consensus ratio but increasing this ratio implies lower network speed, more messages between nodes, more possibility of not reaching consensus and other atacks. Difficult choice…
This rule always worry me.
How can the network grow with this rule?
Will not it also generate enormous frustration to new farmers who can be stopped for a long time, without being able to work for the network?
I join the praises to your work. It is a pleasure that someone with such talent help in this development.
I think this is a misreading of what the figures were saying.
To control the first (just one) section, the attacker has to have more resources if the network is larger.
For a small network (1K) the attacker only needs around 2500 nodes.
As the network grows the attacker needs massively more attacking nodes to and at 1 million nodes the attacker needs 893224 Nodes.
So the larger the network the larger the number of attacking nodes needed to gain control of the first (just one) section.
While the proportion is interesting and shows a reducing %age we still see the number of attacking nodes becoming too many for any one entity on this planet when the network goes global with millions and millions of nodes.
And of course this is an area where the dev team can make it even harder for an attacker to get a foot hold with a network past the testing stage.
My English is, probably, worse than I thought.
In the post I am talking only about proportions, not global numbers. Of course, if the network is larger, you need many more nodes but the percentage decreases from 67.4%, in a network of 10K nodes, to 46.9% in a network of 1M nodes.
These numbers match my calculations where an attacker in a large network will need less than TotalVaults/2 to achieve his goal.
Although, as I said before, these calculations do not take into account the effect of attracting new users that the attacker generates, which is why I have always doubted their effectiveness.