Security features of SAFE - a summary

On learning about PARSEC, a few people have jumped in to question its vulnerability to a Sybil attack. Those in the know have pointed out that this would indeed be an issue were it not for other features, such as node ageing, which make this sort of attack much more difficult. Security is maximised by having an interlocking set of features each covering the others’ vulnerabilities – strength in depth. I thought it might be helpful to list all the security features of SAFE and how they combine to ward off the main sorts of attacks. Not really my area of expertise so feel free to chip in.

Features

Encryption - All data on the SAFE Network is protected by several layers of encryption. Even public data is encrypted but in this case the keys are shared to allow others to decrypt it.

Self-encryption - files stored on the network are encrypted then broken into chunks. These chunks are themselves encrypted using the hash of the previous chunk, then hashed and stored at geographically random locations on the network (the location is the hash of the encrypted chunk), with a number of copies retained for redundancy.
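To make the chunk-and-hash idea concrete, here is a toy sketch in Python. It is not the real self-encryption code: the chunk size, the SHA-256 hash, the XOR keystream standing in for a proper cipher, and all function names are assumptions for illustration only. It follows the description above, where each chunk is keyed by the hash of the previous chunk and stored under the hash of its encrypted form.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and tiny, so the example is easy to follow


def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` bytes (toy construction, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def self_encrypt(data: bytes):
    """Return (data_map, store): the map stays with the owner, the store is 'the network'."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    plain_hashes = [hashlib.sha256(c).digest() for c in chunks]
    store = {}
    data_map = []
    for i, chunk in enumerate(chunks):
        # Key is the hash of the previous chunk; index -1 wraps round for the first chunk.
        key = plain_hashes[i - 1]
        encrypted = xor_bytes(chunk, keystream(key, len(chunk)))
        address = hashlib.sha256(encrypted).digest()  # address = hash of encrypted chunk
        store[address] = encrypted
        data_map.append((address, plain_hashes[i]))
    return data_map, store


def self_decrypt(data_map, store) -> bytes:
    """Rebuild the file; only possible with the data map."""
    parts = []
    for i, (address, _) in enumerate(data_map):
        key = data_map[i - 1][1]  # plaintext hash of the previous chunk
        encrypted = store[address]
        parts.append(xor_bytes(encrypted, keystream(key, len(encrypted))))
    return b"".join(parts)
```

The point of the sketch is the split of knowledge: the store only ever sees encrypted chunks filed under their own hashes, while the addresses and keys needed to reassemble the file live in the data map.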

XOR networking - randomises the geographical distribution of the chunks. Only someone in possession of the datamap (ie the owner) can find the chunks and piece them together again to recreate the file. Anyone trying to fake a chunk could not do so as its hash - and therefore its address on the network - would be different.
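A rough sketch of how XOR addressing behaves, assuming 256-bit SHA-256 addresses and made-up node names (both are illustrative assumptions, not the network's actual parameters). It shows two things from the paragraph above: the nodes "closest" to a chunk are chosen by XOR distance rather than geography, and a faked chunk fails the check because its hash no longer matches the address.

```python
import hashlib


def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two equal-length addresses, as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(target: bytes, node_ids, count: int):
    """Return the `count` node ids nearest to `target` in XOR space."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:count]


def is_valid_chunk(address: bytes, content: bytes) -> bool:
    """A chunk is only valid at an address equal to the hash of its content."""
    return hashlib.sha256(content).digest() == address


# Hypothetical node ids, derived from made-up names.
node_ids = [hashlib.sha256(f"node-{i}".encode()).digest() for i in range(20)]

chunk = b"some encrypted chunk"
address = hashlib.sha256(chunk).digest()
holders = closest_nodes(address, node_ids, 4)  # 4 copies is an arbitrary choice
```

Because node ids and chunk addresses are both hash outputs, which nodes end up holding a chunk has nothing to do with where those machines physically sit.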

Self-authentication - a user can create an account and log into the decentralised SAFE Network securely and anonymously without requiring any central server to mediate the login process or any trusted third party to store and manage users’ credentials.

Proxy node - To retain anonymity, the identity of a client connecting to the network must be obfuscated from the nodes that comprise it. For this reason connections between clients and vaults in the SAFE Network always occur via a proxy node.

Disjoint sections - addresses on the SAFE network are grouped into sections with each section managed by a small number of nodes. Those nodes know everything about the section they are responsible for but very little about the rest of the network. Moreover, the membership of a section is constantly changing and sections will frequently split or merge. So even if an attacker could control a section his potential for damage would be limited.

Datachain - All events occurring in a section are stored in a ledger - a datachain. All section members hold a copy. Elements of it are also shared with nearby (in XOR terms) groups. Because other sections are able to audit their neighbours it becomes harder for an attacker to benefit.
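As a very rough illustration of why a shared ledger is auditable, here is a toy hash-linked chain. This is an assumption-laden sketch, not the datachain design: signatures from section members are left out entirely, and the block layout and function names are invented for the example.

```python
import hashlib
import json


def block_hash(prev_hash: str, event: dict) -> str:
    """Hash a block's event together with the hash of the block before it."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append_event(chain: list, event: dict) -> None:
    """Add an event to the end of the chain, linking it to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    chain.append({"prev": prev_hash, "event": event, "hash": block_hash(prev_hash, event)})


def audit(chain: list) -> bool:
    """Recompute every link; any edited or reordered block breaks the chain."""
    prev_hash = "genesis"
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(prev_hash, block["event"]):
            return False
        prev_hash = block["hash"]
    return True
```

A neighbour holding a copy can run something like `audit` on what it is shown, so rewriting history quietly means rewriting every later block as well.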

PARSEC - The new PARSEC consensus algorithm provides a quick and efficient way to be sure of the true order of events happening within a section, and by extension in the whole network, even when the section is changing rapidly with nodes leaving and joining.

Node ageing - only nodes that have proved their worth over time (elders) are allowed to vote on the validity of events in a section. Nodes that do not pull their weight or act as they should will be expelled and/or their node age reset to a lower value.

Membership rules - there are rules about how many new nodes (low node age) can join a section.

Churn - nodes are constantly joining or leaving sections. Membership is fluid (how fluid I guess remains to be seen).

Defence against common attacks

Sybil attack - In a Sybil attack, the attacker subverts the reputation system of a peer-to-peer network by creating a large number of pseudonymous identities, using them to gain a disproportionately large influence.

This could be possible if PARSEC were used alone as an attacker owning more than one third of the nodes could effectively take control and manipulate events. However, a combination of node ageing, datachains, churn, rules on joining sections and sections splitting and merging would make this massively more difficult (and presumably very expensive).

Google Attack - when a large company (such as Google) owns a significant portion of the vaults on the network. On blockchains I think this is called a 51% attack - anyone who controls 51% of the nodes wins.

An attacker owning a large number of nodes could potentially control individual sections and block actions happening to data in that section (get, put, transfer Safecoin) but only in the fraction of data the section controls, not the whole network. In addition, disrupting an individual’s data would be impossible - you cannot know where it is stored. However, someone with enough nodes could bring the network down, potentially. This gets harder as the network grows.

Phishing, keylogging etc - these could still work. Any attack on the endpoint that revealed your credentials could allow an attacker to access your data. But only your data (and that others have allowed you to see). Using that as a springboard for a wider attack on a database or whatever would not be possible. For an attacker it would be of dubious value.

Man in the middle attacks - should be impossible. (?)

DDoS - very difficult as there is not one single point to attack. The network will simply reroute around any nodes that are taken down.

Quantum computing - who knows? The encryption would be shot but the decentralisation would provide additional barriers.

Ransomware - nah - nothing to lock

Any more?

28 Likes

Good list. My comments (please correct if wrong):

  • Parsec requires 2/3rds of nodes to agree for consensus to be achieved. So for the entity with lots of nodes, I think they’d need 67% of the network, not 51%, to call all the shots. But I guess once the entity got over 33% they could throw a wrench into things by not letting others reach consensus. Maybe at that point they’d get the boot? Interesting to think about.
  • An aspect of the Google attack that wasn’t mentioned (edit: actually you did have a sentence about it) is what happens if a big part of the network’s capacity suddenly disappears. There is a good thread on it: “Analysing the google attack”. The network dynamically adjusts the farming rate to try to keep a certain amount of spare capacity available, but at some point a big loss becomes too much to overcome. However, network restart capability can help bring things back up if the loss wasn’t malicious but rather due to, for example, disruption of internet communications.
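The thresholds in the first comment can be sketched as below. This assumes the reading given there, that a decision needs strictly more than 2/3 of the voters; the function name and the three labels are invented for illustration.

```python
def attacker_capability(section_voters: int, attacker_voters: int) -> str:
    """Classify what an attacker can do under a strictly-more-than-2/3 voting rule.

    Integer arithmetic is used so that exactly 1/3 and 2/3 are handled
    without floating point rounding.
    """
    if 3 * attacker_voters > 2 * section_voters:
        return "decide"  # attacker alone holds a supermajority
    if 3 * attacker_voters >= section_voters:
        return "stall"   # honest voters alone can no longer exceed 2/3
    return "none"


for k in range(0, 10):
    print(k, attacker_capability(9, k))
```

Note the gap between the two thresholds: with a third of the votes an attacker can only block, and it takes more than two thirds to actually call the shots.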
9 Likes

My understanding is that only a majority is required to carry a vote (not 2/3), and that the 2/3 figure is the proportion of good nodes required to avoid a successful attack.

8 Likes

There is one encryption at this time. It is self encryption where each chunk is encrypted using the hashes of the 2 chunks around it. Without the data map these files cannot be decrypted.

The main first defence is that a new node cannot set its own address and thus which section it joins. This prevents the ability to target a section to add bad nodes to it. Thus on a random distribution you would need to control approximately 1/3 of all nodes in the safe network. The chance of adding a relatively small number of nodes (many thousands) and being able to attack any one section with greater than 1/3 nodes is extremely low and the larger the network the lower that becomes.
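To put rough numbers on that, here is a back-of-envelope sketch. It assumes each member of a section is an attacker node independently with probability equal to the attacker's share of the whole network (a binomial approximation), and the section size of 20 is a made-up figure. It ignores node ageing and the one-new-node-at-a-time rule, both of which make things harder still for the attacker.

```python
from math import comb

SECTION_SIZE = 20  # hypothetical section size, for illustration only


def prob_section_compromised(section_size: int, attacker_fraction: float) -> float:
    """Chance that strictly more than 1/3 of one random section is attacker nodes."""
    threshold = section_size // 3 + 1  # smallest count strictly greater than 1/3
    return sum(
        comb(section_size, k)
        * attacker_fraction ** k
        * (1 - attacker_fraction) ** (section_size - k)
        for k in range(threshold, section_size + 1)
    )


def prob_any_section_compromised(sections: int, section_size: int, attacker_fraction: float) -> float:
    """Chance that at least one of `sections` independent sections is compromised."""
    p = prob_section_compromised(section_size, attacker_fraction)
    return 1 - (1 - p) ** sections


if __name__ == "__main__":
    attacker_nodes = 5_000  # "many thousands" of bad nodes
    for network_size in (20_000, 100_000, 1_000_000):
        fraction = attacker_nodes / network_size
        print(network_size, prob_section_compromised(SECTION_SIZE, fraction))
```

With the number of bad nodes held fixed, the attacker's share shrinks as the network grows, and the chance of holding more than 1/3 of any given section falls with it.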

In addition, each section only accepts a very small number of new nodes at any one time. One new node at this time. This turns an attack using a smallish number of bad nodes (many thousands) into one of long-term patience, and it requires the luck virus from Red Dwarf.

Now if an attacker were able to get 1/3 of the nodes in the group then the node ageing means that they have to behave perfectly for a long time to get enough age in order to cause damage.

These 3 things work together to help secure the sections from attack.

Because the encryption does not rely on web certificates, a man in the middle only sees encrypted data and has no access to the keys. A man-in-the-middle attack relies on unencrypted data or on getting your browser to accept its certificate instead of the website’s certificate.

I do believe that the encryption used is quantum resistant. Also, the sheer volume of encrypted packets means that there could only be targeted attempts at decryption, and decentralisation makes targeting difficult.

Your files are stored in immutable data and cannot be changed, so the ransomware cannot encrypt your data. But it could attempt to delete your datamaps or hide them.

12 Likes

In a situation like this groups are still protected. All nodes that are not already elders with voting rights will be immediately relocated upon ageing. This would likely mean greater than 1/3 of all nodes on the network would be needed. Unless all of the attacker’s nodes age at the same rate.

1 Like

Not relocated, but ignored so those nodes have to disconnect and try again.

Thus these bad nodes have to behave till they become an elder. Long time for an attack and becomes costly.

1 Like

I think what I meant is that these young nodes, even if well behaved, will be relocated once they reach an age of authority. So if an attacker reaches the point of constituting 1.1/3 of a section’s population, it would matter little if they are not already in a position to influence decision making or consensus. IIUC.

2 Likes

Basically once the network is out of baby stage and has 1000 or more sections it is going to be one difficult job to take over a (non specific) section now.

1 Like

Rather surprised by how comprehensive and easy to comprehend the initial post has been laid out, especially for one claiming to not have expertise in it! Just made a tweet linking it and saying, “Actually pretty good forum post summarizing the insurmountably impeccable security aspects[/totality] for our future Internet.” Good to have everything put into perspective and with a particular energy and flow that brings it together—or “all” together, with continued replies. Particularly: quantum defense is a common topic argued by the most long-term of skeptics.

4 Likes

Thanks. I’m really not an IT security expert but doing the background research for the Primer was a good learning experience. It’s a good idea to put #maidsafe or #safenetwork in tweets btw so they get found by people who aren’t following you.

2 Likes

s/the distributed hash table/data map/

No one can own the distributed hash table.

OK so wrong terminology then. I should have said data map, yes?

1 Like

The DHT is the overlay network used to locate network objects on the network => the DHT is the structure of the network.

1 Like

As much as I want Safenetwork to work, I can’t help but point out the truth: it all depends on the proof of resource and ageing. Relying on HDD and bandwidth protects the network from rival PoW mining operations but makes it massively vulnerable to a data centre attack.

2 Likes

How likely the Google / data center attack is remains to be seen, but if in a few years it turns out that there are any data centers which run a dangerously high percentage of vaults, something should be done.

What I can think of for now is to add another type of farmer. You’d have the storage farmers like now and then you’d have wireless farmers. A wireless farmer wouldn’t store data but would instead be an open wifi access point or mobile hotspot. Then you would have mobile clients that would sign some package to the network to verify that a certain wireless farmer served them some bandwidth, and the wireless farmer would then increase in rank. Then there could be a rule that each section has to have a number of wireless farmers in addition to storage farmers to help verify consensus and mitigate attacks if the storage farmers of a section were taken over.

Anyway, this idea isn’t very fleshed out and I don’t know if it would work, but I think it’s nice to at least think a bit about possible directions if farmer centralization were to happen. Hopefully it won’t, but we can’t really be 100% certain until the network has been live for a good while.

1 Like

Silly question here… what’s the difference between a Sybil attack and the “Google attack”? They sort of sound like the same thing to me.

The Google attack is a kind of Sybil attack.

1 Like

Great summary. I think it would benefit many if this stuff was a bit more accessible. Maybe something to add to the primer in the future?

4 Likes

Yeah I was thinking along the same lines. Need to get all the facts straight then I’ll see what I can do.

1 Like