Analysing the Google attack

OK, well then it’ll become a DDoS attack on the whole SAFE network: no one will be able to access SAFE, or at least it’ll be very, very slow if Google takes most of the bandwidth farmers can upload.

So why doesn’t Google do that with their competitors now? Answer that before assuming your Google attack would ever happen. Remember that your Google attack is also attacking all the ISPs that the nodes are connected to, so Google would very quickly be in court, sued by a lot of ISPs.

Yeah, fair point. Google itself may not, but other big firms might. Yes, it’s unlikely; I was just thinking about possible attacks. But I’m not worried about this, especially given what @oetyng has said. At worst it’ll just be a temporary DDoS while the network is tiny, just as many other websites can get DDoSed too; it will climb back up.

1 Like

I’ve been wondering - what would a malicious party be able to do if they get control of the following:

  • A Section or several groups

  • 5% of Sections or Groups?

I guess I’m just wondering at what point the attacker becomes able to delete data, or block access to data or shift safecoins from one account to another or double spend etc.

Because with blockchains I’ve observed it’s mostly all or nothing - either you can’t double spend or do much or you gain 51% and you control the network.

My impression is that with Safe Net it’s different, so at what point of network compromise (let’s ignore the debate over whether it’s possible for a moment) are we talking about major disruption to the network?

1 Like

If you control a section then you could manipulate the data it controls.

If you control multiple sections then, at minimum, you multiply the above. At most, you might be lucky enough to control the specific sections used to validate a safecoin transaction and thus steal some coin.

But gaining even one section is not as easy as it sounds, though on a small network the possibility cannot be ruled out at the moment. BUT we haven’t yet seen all the protections. I’d say there will be a point after which not even the likes of Google or China could gain control of any section.

1 Like

Agree with @neo but also worth noting the other difference you highlighted:

  • A 51% attack on a blockchain means control of the entire network. That means you could very subtly choose exactly which data you manipulate and when. So the scope of such an attack is enormous, including of course bringing down the whole network. It would better be called the God attack because its power is total. You can identify individual wallets, and possibly individuals, and target them, and in all likelihood few people would ever know. This immense power and versatility makes such control very attractive for a wide range of purposes - criminal, commercial and political - which in turn makes these attacks more likely to be attempted.

  • Controlling a section on SAFEnetwork would mean that, for the time you have control, you could do at least these things (I think):
    a) block actions on data controlled by your section (get, put, transfer SafeCoin, etc.), but only for the tiny fraction of data your section controls, not the whole network; and you can’t target individuals or content because you can’t identify the individuals or decrypt the data you are storing.
    b) delete data that is being looked after by your section. This is more serious than a), but again the scope is limited to a tiny fraction of the network’s storage, and again you can’t choose which data and can’t target individuals or particular content.
    c) steal a SafeCoin whenever one of the SafeCoins in your section is being transferred, by intercepting the transaction and modifying it. I don’t think you can just go ahead and steal all the SafeCoin in your section, at least not easily, but even if you can, it is still only an extremely tiny fraction of all SafeCoin, and again you can’t target it - the theft will be random.

Even though you can do much less, controlling a section is still very hard - a similar scale of difficulty, in terms of the cost of resources, to a 51% attack once SAFEnetwork passes a certain size. And it is much more difficult to achieve through centralised political or commercial control (whereas the number of miners needed to control 51% of bitcoin is quite small, making it very vulnerable to this kind of attack IMO).

But the big difference is that the rewards for the attacks on a SAFEnetwork section are tiny whether your aim is sabotage or theft, and this matters.

This is a great protection, because attacks are less likely when the rewards are smaller or simply don’t justify the cost of attempting them - especially if there are more profitable targets elsewhere, in which case you can pretty much rule out attacks motivated by theft.

So on SAFEnetwork we may only need to worry about sabotage for political or censorship reasons, because there will almost always be much more profitable targets for thieves. The good news is that if SAFEnetwork succeeds in decentralisation as intended, these attacks become very difficult and impractical once the network passes a certain size.

11 Likes

What makes you think this is possible?

Economics.

So now you have a distributed network of datacentres, and there goes the attack: the network survives.

Try hosting something contentious, like ISIS propaganda and see what happens.

They have to compete with all the billions of potential home vaults, which cost their owners almost nothing.

This assumption appears flawed to me: see @mav “spare resources at the consumer level probably won’t be viable as vaults”

There are billions of computers/devices on the internet, and people will run 2 or more vaults if they can.

I hope that this is the case, but it certainly sounds delusional to me. Running a 24/7 server is not something many people can actually do. Otherwise we would already see many more home servers that do simple email, for example.

Works very much against you here. If the economics of SAFE are to reward just enough to make home vaults worthwhile (they have near-zero costs), why would data centres be able to compete when they have to pay for everything and it’s never “spare”?

So the home vault, with virtually zero additional cost to run, will find just enough profit to keep running, and is always going to trump a vault running on non-spare resources costing real $$$ to run.

They have no idea which nodes contain that stuff. And would they even know it’s there, since they are not in those circles of contacts?

And that is an opinion. With 40 Mbit/s upload in Australia, a home vault needing less than 6 Mbit/s to run is easily going to work, and we will have millions of those available in Australia (within 2-4 years) if people wish to run a vault.

Europe has home links of up to 1 Gbit/s.

So no I don’t agree with @mav that only data centres can be vaults.

The datachains and ageing algorithm

I don’t think this will happen intentionally, so ageing won’t help. I, for one, will simply spin up as many instances as I can until I reach the marginal cost of producing SAFE space. Eventually, some regulation will kick in and I will hand over the keys to whatever agency asks for them. So will everybody else in reach, i.e. on the cloud.

I agree with you that spare resources at the consumer level probably won’t be viable as vaults

That could have worked maybe in 2003, in 2023 everybody will have more space on Dropbox than on their personal devices.

I think the inconvenience caused by vaults consuming bandwidth will mean most consumers won’t like it (ever had someone in your house start a torrent with no throttle?! it sucks!!). So my feeling is datacenters will be the main place vaults will reside (but probably remote individuals will be in charge of the vaults).

Exactly. I have been running home servers for most of my life, this isn’t viable for anybody else I know.

I don’t think cloud providers can be coerced easily.

You may want to consult prq.se or WikiLeaks.

I think a useful approach would be to model the resilience of the network. The 2 extremes would be Amazon/Google = 0 resilience and home hosting routed through Tor = 10.

As it stands, I would argue that 99%+ of the network will have close to 0 resilience. I think most amateurs underestimate how crazy cheap cloud storage and bandwidth (especially upload!) is in comparison to your home rack that interrupts every 24h when the IP changes. See @neo who thinks that it comes at “zero cost”.

home vaults worthwhile (they have near zero costs)

So your conjecture is that the network will be bounded by:
cost of home server < cost of AWS
AND
value of SAFE coin < cost of AWS for gaining SAFE coin

Good luck with that.

They have no idea which nodes contain that stuff. And would they even know it’s there, since they are not in those circles of contacts?

Exactly - that is why they will require every participant in such an arrangement to register and obtain a licence. All unlicensed arrangements will be illegal. Bitcoin works because only a small number of participants with limited hardware and bandwidth need to evade this to render regulation unenforceable. In the case of SAFE you need many more.

Europe is up to 1Gbits/sec home links.

Hardly a reality for 90% of Europeans.

We would only need the 10%. But you would be surprised how many have it now, and it’s increasing year by year. So when SAFE launches, even your 10% might be 15 or 20%; and while a portion of 10% is good, a portion of 15 or 20% is a lot better.

Anyhow, you are underestimating the number who now have access to Gbit/s up/down links in Europe.

Yes, but if it did become profitable for a datacentre to run vaults, then it would have become extremely profitable for home users to run vaults, and they will use bandwidth limiters (built into the node?) to ensure their vault does not overwhelm their internet usage.

So whichever way you look at it: the more profitable it is for a datacentre to run vaults, the MUCH more profitable it is for a home user to run a vault, which also helps them overcome the occasional slowdown that might occur if they don’t have a bandwidth limiter. But 6 Mbit/s out of 40 Mbit/s is not something that typically ruins the experience for the users of that link.

1 Like

Great point. It reminds me of this statement from dirvine

If people can’t make a basic cost / benefit statement on the attack they propose (not necessarily an analysis, just a simple acknowledgement) it almost certainly isn’t well thought out. I understand costs and benefits are not concrete, but they matter a lot in this network.

“I can attack the network” needs to be met with “at how much cost” and “what will you do once it happens”.

I find attacks on the network extremely interesting, but mainly because they aren’t feasible.

13 Likes

Really interesting view, happybeing.

Short follow up:

If a malicious section blocks access to a data piece - considering every data item will have 4 copies - would the network simply fetch another copy from some other section, so that the user trying to access the data is not inconvenienced? And would the network assume the malicious section’s copy is lost, and so make another copy to be controlled by some other section?

Similarly, if a malicious section deletes data, would it be safe to assume the 3 remaining copies would still be there, and the network would simply make another copy at some other section to make up for the loss?

So if that is the case, then a malicious section is severely limited in the level of disruption it can cause - even if access to data is blocked, data is deleted, or data is corrupted, the network has copies outside the section - and obviously the data itself is useless to the malicious actor as it’s encrypted and all.

I guess the only profitable scenario, according to your reply, is stealing a safecoin by intercepting and modifying a transaction handled by the section.

If we stick with Mav’s simulation results (ageing not counted) that 20% of nodes in the network are needed to capture one section - I would say it would be unfair to equate the cost of a 51% attack on a decent currency with that of 20% of Safe Net - but nevertheless it would be a sizeable expense, and as you stated, it would be irrational to incur such a big cost for very little financial gain, compared with the complete dominance of a blockchain system gained at 51%.

It does surprise me, though, that the data most at risk of being compromised if a malicious actor takes control of a section is Safecoin - I was actually under the impression that it would be the hardest to manipulate.

Wouldn’t it be really bad PR, and just a bad impression, if a user of Safe Net has their transactions manipulated and loses their safecoins? Granted, the system as a whole won’t be compromised and the amount hacked may be low, but in the world of crypto, where transactions are holy and immutable, Safe Net would suffer a bad rep if people are told anyone may lose their coins once a malicious entity gains 20% of nodes.

And if an entity can short Safecoin prices, they may try to spread fear and uncertainty by carrying out such an attack and heavily publicising the “fatal flaw” of Safe Net - benefiting not from the safecoins they stole, but from the resulting fall in Safecoin prices caused by fear in the market.

1 Like

@monty thanks for the comments, but I don’t think they are correct.

My understanding is that each chunk (a piece of a file, or a SafeCoin) is managed by the elders of the section controlling it, and that those elders effectively look after all the copies of the chunk.

So controlling a section would allow deletion etc. as I’ve described, and I don’t believe your mitigating idea (i.e. getting a copy from another section) is correct.

Also, I don’t believe that SafeCoin is more vulnerable than other data. Firstly, it is just data, so it has redundant copies and the same degree of vulnerability as a chunk. I’m not sure, but it is also possible that greater protection can be provided to SafeCoin for the reasons you mention, though this won’t be known until Maidsafe decide on the implementation.

If we refer only to deletion, yes; but chunks are immutable data and are self-verifying, as their XOR address is the hash of their content. So they are much more secure than MDs, which could be modified without the user being able to detect it.
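That self-verifying property can be sketched in a few lines (the helper names are hypothetical, and SHA-256 stands in for whatever hash the network actually uses):

```python
import hashlib


def chunk_address(content: bytes) -> bytes:
    """An immutable chunk's network (XOR) address is the hash of its content."""
    return hashlib.sha256(content).digest()


def verify_chunk(address: bytes, content: bytes) -> bool:
    """Any holder of the address can detect tampering by re-hashing the
    returned content and comparing it with the address that was requested."""
    return hashlib.sha256(content).digest() == address


content = b"some chunk data"
addr = chunk_address(content)
assert verify_chunk(addr, content)          # the genuine chunk passes
assert not verify_chunk(addr, b"tampered")  # any modification is detected
```

A mutable data item (MD) has no such content-derived address, which is why a malicious section could alter it undetected.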

3 Likes

If the network is already creating copies, wouldn’t it really enhance the network’s durability and security to ensure one section gets one copy only? This way four copies mean four different sections, greatly increasing the percentage of malicious nodes required to manipulate a piece of data.

If I understand correctly, it does do what you are suggesting - i.e. the data is spread randomly across the machines that make up the network. The word “section” is where the confusion lies. Here’s my understanding of how it works.

A section is a group of nodes that’s responsible for looking after data stored within a certain range of addresses on the network (for the sake of a simplified example, addresses beginning A or B). If the hash of a chunk of data begins with A or B then it will automatically have an address in the range A - B (because the hash is the same as the network address) so it will be managed by the section that looks after the range A - B, which will usually be around 10 - 20 nodes. A number of the nodes in this section will store copies of the chunks and most likely they’ll be widely geographically dispersed because XOR distance bears no relation to geographical distance - physical location within a section is randomised.

Just because the network address is fixed, that doesn’t mean that a chunk is only stored on a few machines for ever. The membership of this section is constantly changing, with nodes leaving and new ones joining. When a new node joins it gets copies of the data chunks whose hashes begin with A or B; when it leaves it loses those chunks. So a node in London might be replaced in the section by one from Mumbai, and the Mumbai node will now store all the A and B chunks. A while later that Mumbai node might leave to join another section, with a new node located in Addis Ababa replacing it and getting all the A-B chunks, and so on. So the XOR addresses (and therefore the data stored at those addresses) looked after by a particular section stay the same, but the membership of the section, i.e. the machines storing copies of a particular chunk, is constantly changing.
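The hash-to-section mapping described above can be sketched roughly like this (illustrative only: the prefix length and hash choice are my assumptions, and real sections use binary prefixes of varying length rather than the letters in the simplified example):

```python
import hashlib


def section_prefix(content: bytes, prefix_bits: int = 2) -> str:
    """A chunk's XOR address is the hash of its content; the section
    responsible for it is the one whose binary prefix matches the address.
    SHA-256 stands in for the network's real hash function."""
    digest = hashlib.sha256(content).digest()
    # Render the 256-bit address as a bit string, keeping leading zeros.
    bits = bin(int.from_bytes(digest, "big"))[2:].zfill(256)
    return bits[:prefix_bits]


# Chunks land in sections purely by hash - unrelated to who uploaded them
# or where the section's member machines physically sit.
for chunk in (b"chunk one", b"chunk two", b"chunk three"):
    print(chunk.decode(), "-> section", section_prefix(chunk))
```

With a 2-bit prefix there are four sections (00, 01, 10, 11); as the network grows, sections split and the prefixes lengthen.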

2 Likes

The google attack is related to the EMP/solar flare attack thread:
“What about a catastrophic event that wipes out millions of nodes”

Random locations in address space do not ensure geographic spread. It doesn’t matter if I have 8 redundant copies of my files if they are all in a couple of 5000PB server farms in sunny CA when the big earthquake hits. After SafeCoin launches, I think you will still see the same geographic concentration of servers in affluent regions, unless something can be added to the code that allows one to know the general geographic region of a node. Perhaps there could be an incentive to become a “beacon” node that can determine a trusted geolocation. To preserve anonymity this geolocation does not need to be accurate - I would think that even knowing that a node was in a particular octant of the globe would help ensure the data is spread. This might be achieved through latency timings and known locations of servers on the internet backbone. Ensuring this geographic spread might help maintain a balance of power between a few big players and the churning masses…

1 Like

These more-than-unlikely mega farms will be in anti-earthquake buildings, with their own power systems and both terrestrial and satellite connections, so the possibility that your data will disappear forever is negligible - and, thanks to the datachain, the data would recover.

So you want to change the complex routing crate based on an improbable assumption about an unlikely case with a very small possibility.

These what-if cases of infinitesimal possibility begin to be very tiresome. I would be extremely happy if the network, with the current configuration, came out within a reasonable period of time.

3 Likes

Google attack stats to control one section

The google attack test should report how much work it takes to control a single section. This post uses a test which:

  • establishes an honest network of a certain size and average age
  • then repeatedly
    • add attacking vaults
    • add 1 normal vault for every 10 attacking vaults
    • remove 1 normal vault for every 10 attacking vaults
  • this attack pattern continues until the attacker controls a single section.
  • report how many attacking vaults are required to control a single section

This test better represents the attacker’s target for achieving their motive (i.e. controlling consensus). The test includes ageing, elders, etc.
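The loop above can be restated as a toy model. To be clear, this is my own sketch, not mav’s actual simulation: sections here are plain random buckets, and ageing, elders and relocation are only crudely approximated by treating the oldest members of each bucket as its elders, so the numbers will not match the table below.

```python
import random


def attack_until_section_controlled(netsize, seed=1, churn_ratio=10,
                                    section_size=10, elder_count=8):
    """Toy version of the attack loop: keep adding attacker vaults, with
    1 honest join and 1 honest leave per `churn_ratio` attacker joins,
    until attackers hold a majority of the elders (oldest members) of
    some section. Returns the number of attacking vaults required."""
    rng = random.Random(seed)
    num_sections = max(1, netsize // section_size)
    # Each section is a list ordered oldest-first; True marks an attacker.
    sections = [[] for _ in range(num_sections)]
    for _ in range(netsize):
        sections[rng.randrange(num_sections)].append(False)

    attackers = 0
    while True:
        attackers += 1
        sections[rng.randrange(num_sections)].append(True)
        if attackers % churn_ratio == 0:
            sections[rng.randrange(num_sections)].append(False)  # honest join
            target = sections[rng.randrange(num_sections)]
            if False in target:
                target.remove(False)  # the oldest honest vault leaves
        for section in sections:
            elders = section[:elder_count]
            if elders and sum(elders) > len(elders) // 2:
                return attackers
```

Like the real simulation, it is deterministic for a given seed, so runs are repeatable.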

Results

Given an initial network size (col 1), how many vaults need to be added by an attacker until they control their first section (col 2-6)? And what percentage of the network does this represent (col 7, average of 2-6 as percent)?

Unexpectedly, the larger the network the smaller the proportion needed for success. Of course it’s still more total vaults to attack a larger network, but the proportion decreases as the network gets larger.

Netsize   Test 1       2       3       4       5 | Avg Percent
     1K     2599    1573    1964    2043    2340 | 67.4
    10K    11449   19945   15004   19601   11616 | 60.0
   100K   125776   98073  137955  103118   97975 | 52.7
     1M   893224  983631  864215  974953  724229 | 46.9
    10M        - 7026273 5017128 6996670 7921890 | 40.0

The simulation is deterministic so these tests are repeatable (see commit edf7aff). The simulations take optional flags for seed and netsize, eg $ ./google_attack -netsize=1000 -seed=2 should give 1573 attacking vaults (row 1 test 2)

One caveat: the ‘disallow rule’ preventing multiple vaults aged 1 had to be disabled, because it prevents small networks from growing.

What does it mean?

These results should be taken with a large amount of skepticism, since there are so many unknown factors, and the ageing mechanism is still being fine-tuned by the Maidsafe team.

The answer to ‘how many vaults are required to control consensus’ cannot be any more precise than ‘it depends how big the network is’. Even when the network size is known, it constantly changes throughout the attack, so the amount of resources required is very hard to know in advance. The table above at least gives some idea of the magnitudes at play as the network grows.

I’m surprised how difficult it is to make correct assumptions about this attack; most of my intuitions about it have been shown incorrect by the simulations.

It always needs to be remembered that controlling a single section doesn’t necessarily give the attacker much benefit (as happybeing pointed out above).

Also, from a marketing perspective, it may be desirable to use percentage figures from before the attack. For row 1, test 1 in the table above, 2599/3599 = 72% is the proportion of attacking nodes on the network after the attack. But it’s more impressive and marketable to use the proportion of attacking nodes relative to the network before the attack, i.e. 2599/1000 = 260% of the network required to perform a google attack. That’s almost triple the original network size! 260% is much more impressive than 72%, even though it’s actually the same thing.
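The two framings differ only in the denominator:

```python
# Row 1, test 1: 2599 attacking vaults against an initial network of 1000.
attackers, initial_netsize = 2599, 1000

# Share of the network once all attacking vaults have joined.
after = attackers / (initial_netsize + attackers)
# Same attack expressed relative to the pre-attack network size.
before = attackers / initial_netsize

print(f"{after:.0%} of the post-attack network")   # prints "72% ..."
print(f"{before:.0%} of the pre-attack network")   # prints "260% ..."
```

Same attack, same 2599 vaults; only the choice of denominator changes the headline number.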

Impact of Ageing

A typical attacked section has an age distribution like the table below (in this case taken from netsize=100K seed=4), with detail for the elders (oldest 8 vaults).

Attacked Section Age Distribution

Age Attacker
7   false
5   true
5   true
5   false
5   true
5   false
5   true
5   true
5   false
5   false
5   false
Age Attackers NonAttackers
4   4         6
3   15        5
2   5         2
1   1         0

Resource bottlenecks

Yes this is an interesting question.

The limits on how many vaults can be thrown at a network… trying to wander through the variables… They must store data, but the more vaults there are, the less data each vault has to store. They must supply bandwidth for chunks, so at some point that becomes a bottleneck, depending on the size of their pipe. They must be timely in their responses, so latency and bandwidth both matter there, and probably also CPU for signature generation and verification. Then there’s additional labour and skill to modify the vault code for coordination between the attacker vaults… All these things cost money, so ultimately budget will limit the total number of vaults any one entity can run and the duration they can run for. Maybe I missed some aspects?

Amended thoughts on google attack

I don’t think it’s possible to fully model a google attack, because it depends on the behaviour of non-attacking human participants (which is difficult to predict and model). Despite the imperfections, there can be some attempt by big farmers to categorise themselves as a potential attacker vs merely a large-scale participant. Bystanders will probably only know of a google attack after it happens.

A successful google attack seems pretty far-fetched to me… but bitcoin mining ended up more centralised than people first imagined, so I’m not too keen on making predictions!

Where to from here?

I’d like to look into the difficulty of an attacker controlling all copies of a chunk - i.e. not just controlling a single section, but also controlling the sections holding redundant copies of a chunk (see RFC-0023 Naming Immutable Data Types; although routing as currently coded only keeps chunks in a single section, so I’m not sure what the plan here is). However, if there were some sort of ‘cache scavenging’ mechanism to recover chunks from temporary caches anywhere on the network, this would negate the attack.

Data loss seems to be Peter Todd’s main concern about a google attack; for that to happen, all copies of the chunk must be controlled by the attacker (which the simulation in this post models as currently implemented, but not as currently designed, pending implementation of backup and possibly also sacrificial chunks).

35 Likes