Close group security

I have read the system docs, and it seems that a lot of SAFE’s security relies on the hardness to enter other nodes’ close groups at will by means of sybil attacks.

It isn’t quite clear to me why this is hard. Can anybody explain it to me?

I read here that to create a vault ID you need two pairs of keys. One revocation key (Rpriv, Rpub) and one regular (Kpriv, Kpub). The vault ID is then H(Kpub + SigRpriv(Kpub)). Generating new vault IDs that are close to arbitrary other vault IDs seems like it would be easy to me. What am I misunderstanding?

1 Like

You apply to the network for a name. What I mean is the network will deterministically (based on a changing network) give you an ID, you cannot supply your own, you can give keys and ask for them to be used to create your ID. The ID is created by the group closest to the hash of your keys by adding the ID of the 2 nodes closest to your keys at that time. They then send you back the name and send the keys plus derived name to the group now closest to the derived name.

7 Likes

I see. Thank you for the quick response. If you don’t mind, I have three followup questions.

So the ID/name is Hash((Rpub + Kpub) xor nearbyNodeID1 xor nearbyNodeID2) where nearbyNodeIDn are the two nodes closest in xor space to Hash(Rpub + Kpub)?

How do the first and second nodes to ever join the network get their IDs?

Is it possible to generate/find/derive the list of nodes/node IDs currently participating in the network?

I am trying to build a simple simulator for further study, so getting the details right is important. :smile:

2 Likes

These nodes will first self name and then relocate the others. As network grows they get excluded and need to re-bootstrap.

No, you could try searching, but there is no search capabilty, no node will answer this. As the network grows any such result would be temporary as churn is likely high. We are aiming for as little possible network state as possible, to allow for very short lived sessions and very high churn (they go hand in hand).

Cool, you can check out the maidsafe-archive and grab the address_space_tool form the common lib there. It’s a decent start at simulating some of the security assumptions of xor addressing. At least a start. Large scale simulation would be amazing. The hard part is simulating the human use, i.e. churn and node resource limitations etc. but a great project to start. We also have a simulations repo, that has a bunch of omnet and oversim kademlia (mostly) simulations https://github.com/maidsafe/Simulation
Hope that helps

6 Likes

Welcome tot the forum!

That’s really great. Would love to see that.

2 Likes

I should add here, don’t get focussed on only close group security. By that I mean client data cannot be decrypted by the network, it has no idea how to. Then client data is signed by clients meaning only they can update it. So there is base crypto primitives there as well as group security. We consider a break in group security to be a potential vandalism attack at most.

Worth taking all that into account, but if you simulate groups etc. then we can add all this in, rather than a mammoth discussion just now that would complicate it all.

2 Likes

Thanks a lot, I’ll look into that. :smile:

Okay. There’s a lot of details to understand, so I’ll just get started on this part and see where it goes.

5 Likes

I’ve been studying this explanation, and I think I can see a vector attack.

I’m assuming this scenario:

  • You generate a Key (‘K1’) and the hash of this public key is ‘HashPublicK1’
  • The closest group to ‘HashPublicK1’ receives a request to create a new ID (‘ID1’).
  • The closest group to ‘HashPublicK1’ generates then ‘ID1’(your new name) using Hash(‘PublicK1’ + ‘NameClose1’ + ‘NameClose2’) (deterministic)
  • You receive ‘ID1’ as your new name for join the network.
  • The closest group to ‘HashPublicK1’ sends the public key ‘PublicKey1’ to the closest group to ‘ID1’

If I want to create N nodes for monopolize a group consensus(create N nodes with their ID next to each other) , then I only I have to do this:

  • I join the network twice using the procedure above, and my current names are ‘IDOrigin’ and ‘IDDestination’
  • I use a sniffer in my vault at ‘IDOrigin’ (by modifying the source code of the vault) and I can see the two closest nodes to me. They are ‘OriginClose1’ and ‘OriginClose2’
  • I use another sniffer in my vault at ‘IDDestination’ and I can see the closest node to me.
  • Now, I’m going to generate 32 new pair of Keys near to ‘IDDestination’ using brute force, due to I know each ‘brute_force_pair_key’, ‘OriginClose1’, ‘OriginClose2’ and my ‘IDDestination’.
while (num_nodes_generated < 32) {
   loop {
      let pair_key = new_pair_keys();
      let idDestination = closest_node_to_IDDestination();
      let originClosest = two_closest_node_to_IDOrigin();
      let new_name = Hash(pair_key.public_key + originClosest.0  + originClosest.1);
      if (distance_xor('IDDestination', new_name) < distance_xor('IDDestination', idDestination)) {
         join_the_network_using(pair_key.public_key) ;  // implicitly to new_name
         num_nodes_generated ++;
         break;
      }
   }
}

When you reach the majority of 28 of 32 nodes, you can attack the network. I know the network is changing all time and you don’t keep the majority for a lot of time, but only a few seconds can be enough to make an attack.

2 Likes

I am not sure where these sniffers will exist (you cannot sniff a remote group). But say you get 2 vaults in two groups, which is what I think you are saying. Then watch traffic there (including what you think are the close group to some node. Then if we take it form there to see what can be done.

This part seems where it falls down, even a tiny fraction of a change here will send each of the generated ID’s to very different parts of the network.

If you can explain this part a bit more it will help, from my position all you could do is “brute force” (which is exponentially more difficult over time) keys that will go to a particular group to start the transformation process. IF there not identical then they disperse very far, if they were identical then only the first one would work.

I am not sure I am not missing what you are saying though?

2 Likes

Maybe he sniff the group he his currently in.

Even if it’s not identical. If the network count 10000 nodes total has an example. The brute_force will not be able to join the current group fast by generating 32 closest node to form the group?

To that there is role that he can’t choose, all of his nodes have to play the same role for it to work?

Yes, I’m saying that. I have two tainted vaults. I have modified the source code.

The explanation is the pseudo-code that I’ve included.

That’s why I look for the closest group using brute force.

1 Like

Ah in that case then I think what you are doing is almost the exact same as just creating multiple nodes and joining looking for a close group. This is exactly what it is designed against in reality and why we chose the group and quorum sizes. If you check the ``address_space_tool` in the maidsafe-archive/common repo you will find a way to run the simulation that matches this assumption at very high speed and prints out attempt numbers etc.

That will help I think. Getting much closer to xor distance and just how very different it is from a linear distance. This is the part that becomes very hard to keep in mind. Every single person (I mean 100% of them I have met) uses linear approximations almost by default, it makes it imaginable to see such thinkgs being an issue, but it is not when you measure it then the measurements leads to deeper analysis.

I think the address_space_tool will show you that and introduce some more tweaks you can make to try even harder to get a close group.

If you expand the python example (and please do not use linear math, you must measure distance by xor point A against point B WRT point C) The with respect to the 3rd point is critical and alters the tests dramatically (i.e. this kind of security is not possible with a linear addressing scheme)

1 Like

It’s like this, the security of the groups exponentially increases with population, so very small network is less secure. For a network of 200 you need to add approx 700 nodes to get a group, for 1000 then we can run the tool to get the number, I think it is around 4900 nodes. I have the figures somewhere, I think also in my blog, in any case the tool can show this attack, but assumes you are a singular attacker, there are no others adding nodes etc. i.e. if you cannot add nodes faster than the whole network (changes) then no matter how many nodes you add it does not get a close group. This is because the attack is greater than 50% so if you cannot add faster than 50% (a node at a time) then you can never reach the 50% mark. What I mean if 2 nodes are added and one is not you then the attack fails even with all the security turned down and making the worst possible assumptions from a security perspective (means imagine attacker matches every assumption in his/her favour)

2 Likes

I was talking around 32 nodes. Not much more.

Ok, I’ll give it a look.
I comment later.

That’s not really what I meant. Looking at @bcndanos codes, he keep only close distance xor name and discard the far one. Be doing so he join only the group he want but do not create nodes for far distance.

I’ll look too to the address_space_tool and try to understand better.

1 Like

No worries, I can assure you that you can (and we do) take months (>6) to get used to this. So there can be a million forum posts on it for sure. It’s hugely interesting but very confusing at the same time. I have spent a huge number of hours on whiteboards explaining it. I don’t mean to cast dispersions, but this xor DHT / kad stuff is really very under researched, but seems simple (that’s when you know its hard :wink: ) Lot’s of hidden surprises, you will enjoy it for sure.

2 Likes

I’ve looked at the test that you said. It’s really interesting and I agree with you that a brute-force attack to get the control of a consensus group is a huge task. I understand perfectly well the concept. In fact, I made an Node.js - ZMQ implementation of a DHT using XOR spaces just for fun early this year.

But, I wasn’t talking about that, I was talking about forging a name (id) to join the network inside a specific consensus group, using a force-brute algorithm.

As @Ghaunt said before I only join the network if I forge a public key that resolves in an specific group.

1 Like

Cool, it is interesting, (I hope to help here) we actually help out uni’s who work with DHT’s and XOR to help. I promise many folks implement DHT’s and don’t look deep to find the relationships well. Things like no DHT has a delete that can be accurate or allow edits of data (without duplication of varying results) So the first step is knowing why

  1. Distance is symmetric and unique
  2. Closeness is asymmetric (I think I am close to you, but you don’t think you are close to me, so imbalance happens)

These are key parts and from there then it gets hugely encouraging (this is where Eric started to notice some under researched parts as well I think).

I think this is where the difficulty is, you will need to get all the results back to know what groups are there and they may not accept your connection request. [EDIT, I must add we are removing get_group from the library) If they do you will require to connect and settle in before getting to know the whole group, so this would slow down such an attack considerably. The group won’t be known till they accept your connection and connect back, if you try hundreds of parallel connect attempts then you will be forced on different IP addresses (physically run out of resources). If you fail to connect or drop a churn will not let you see the rest of the group (you wont know closest) so you need to stay the course to get the closest and then wonder if that’s what you were after, if that makes sense.

I should add though, after the sprint we are planning just now which will clean up code and some logic that’s hanging then a test to show this will be a great thing to do. We won’t have plugged in sentinel so opens up a huge array of ways in (perhaps). This would be a great test of the network and gets some folks in attack mode, which is a great thing to have and to grow out, as many scenarios tested as possible. Especially of we create an attack_routing repo or similar.

4 Likes

I’m not sure about that with IPv6. Just poking at you.