Convergent encryption ...?

There are a few attacks being discussed here. As for the one in the OP: no, we are in the clear. The network will respond with data if you ask for it, but you cannot know who holds it. Imagine the DataManager as the line up to which the client can see. Across that line are the node managers and nodes. The client does not see them.

In terms of scanning your own store for data: we did have a scheme in the C++ code, not yet reimplemented, where the data is encrypted (obfuscated) again before storing, so the node does not know what it holds. This is for other types of attacks.

All of these need you to have a node in the network that the data passes through, which you may have. The nodes it passes through do not know who gave it or who asked for it. There will be an XOR delivery address, and this can be a throwaway address. The method I described hides that content from the only node that knows the IP address of the client with the throwaway address.

I am not sure of an attack at the moment that de-anonymises the client or allows the attack in the OP. It is very different on a p2p network as opposed to a server based network. The coverage area is pretty large. So maybe I am missing something in the question? I feel I am.

@dirvine, an attacker can use the content of the file to associate it with the holder.

For example, assume that a company creates an XML document with an invoice and stores it on the SAFE network. This XML document could have a fairly rigid structure and known values (for issuer, recipient, etc.). However, just the fact that such a document exists on the SAFE network associates it with specific entities from the real world. There’s no need to associate the document with the entity at the protocol level.

Someone on the XRP forum claims a mod on Reddit deleted this question from the MAIDSAFE sub. Any Reddit mods here have a comment?

@happybeing comment: I think @dirvine and I are the only active reddit mods and neither of us would delete something like that. David found it in the mod queue because reddit thought it looked like spam. It is now approved. Thanks for alerting us to it. :slight_smile:

Here are some graphics to help explain my understanding of the attack and potential solutions:

POTENTIAL UPLOAD SOLUTION:

POTENTIAL DOWNLOAD SOLUTION:

It seems that file requests alone are also vulnerable to this attack. One way to solve it would be to have the Client encrypt the file request with the Close Group public key. I’m not entirely sure, but it might be possible for the attacker to obtain the Close Group public key, enabling the attacker to carry out the very same fingerprint/probe attack on the file request. If the Client were to couple the request with some random data, this attack could be mitigated. Any good?

Maybe I’m not understanding this correctly. I’ll take a shot anyway.

If the company were to store this XML file you speak of on the SAFE network, it would make sense that they would store it privately. Nobody other than the one who uploaded it would know that it exists on the network. The uploader’s credentials would be necessary to retrieve the file and decrypt it. If the company stored a sensitive file as public data on the network, the last thing they should be worried about is correlation. Mistakes like this could be averted by a confirmation prompt for all uploads destined to enter the public domain.

Brute forcing other people’s private data stored on your vault is the only way you’ll get a tiny glimpse of what is stored on the network as a whole. Having data broken into many pieces makes this especially difficult, as you would first need to gather all of the related chunks.

Files smaller than 1MB are bundled with the user’s datamap of all of his/her files. Those I guess would be the easiest to brute force, if at all possible with current and near-future processing power. Once that’s achieved, from there it should be fairly straightforward to gather the chunks specified in the datamap and use the brute-forced key to decrypt them. All of this is very difficult and likely impractical. Hope I helped to clarify. :relaxed:

Welcome to the forum! :slight_smile:

Hasn’t David already mentioned this fix?

The second-to-last hop is the relay, a.k.a. the attacker. It doesn’t matter if he follows the rules and encrypts the chunk with his own public key. He will still be able to see the chunk before he encrypts it to the Client.

The issue here is that the relay will see the chunk encrypted only with the client’s public key. So it’s a self-encrypted chunk wrapped in the client’s public key encryption. All the attacker would have to do is fingerprint all of the chunks passing through his machine encrypted with the client’s public key, then compare the fingerprints of the chunks he observes with the ones he has in his database of “illegal” files. This would of course only be possible if the attacker first encrypts his own gathered chunks (the ones he found publicly available on the SAFE network) with the client’s public key.

Please reread and reread and reread the information provided in the first graphic. I don’t know how else to further simplify this. I might be missing something, but I’ve yet to be given clarifying information. Until then, I can only continue seeing this as a real attack vector with serious implications.

If what David meant is that the chunk is encrypted with the Close Group’s public key before passing it to the relay, then I’d be happier. Although, I wonder if the attacker can get the Close Group’s public key. It’s unlikely that the attacker would be using the same Close Group as the Client he is relaying traffic for. So I suppose the attacker wouldn’t have the privilege of knowing the public key of the client’s Close Group. I don’t know. Please clarify.
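To make the fingerprinting step concrete, here is a rough Python sketch of what the relay would be doing. It only works under the assumption this post makes, namely that wrapping a chunk for the client is deterministic (same key and same chunk always give the same bytes). `deterministic_wrap`, the key bytes and the chunk names are all hypothetical stand-ins, not anything from the SAFE code.

```python
import hashlib

def deterministic_wrap(client_pubkey: bytes, chunk: bytes) -> bytes:
    """Toy stand-in for a deterministic 'encrypt to client' step (assumption)."""
    return hashlib.sha256(client_pubkey + chunk).digest()

def fingerprint(data: bytes) -> str:
    """Fingerprint of whatever bytes the relay sees on the wire."""
    return hashlib.sha256(data).hexdigest()

def build_watchlist(client_pubkey: bytes, known_chunks: dict) -> dict:
    """Attacker pre-wraps every chunk he already holds with the client's key."""
    return {
        fingerprint(deterministic_wrap(client_pubkey, chunk)): name
        for name, chunk in known_chunks.items()
    }

def match_traffic(watchlist: dict, observed_packets: list) -> list:
    """Return the names of watched chunks seen passing through the relay."""
    return [watchlist[fingerprint(p)] for p in observed_packets
            if fingerprint(p) in watchlist]

# Hypothetical data: two chunks the attacker gathered from public files.
client_pubkey = b"client-public-key (hypothetical)"
known = {
    "flagged-file/chunk-0": b"chunk bytes A",
    "flagged-file/chunk-1": b"chunk bytes B",
}
watchlist = build_watchlist(client_pubkey, known)

# Traffic relayed to the client: one watched chunk, one unrelated chunk.
traffic = [
    deterministic_wrap(client_pubkey, b"chunk bytes B"),
    deterministic_wrap(client_pubkey, b"unrelated chunk"),
]
matches = match_traffic(watchlist, traffic)
print(matches)
```

If the wrapping were randomized per packet, or done with a key the relay cannot use, the watchlist lookup would never hit, which is really what the rest of this discussion turns on.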

Your diagram says the relay node is

SAFE ---> relay ---> user

David says

SAFE ---> 2nd to last hop  --> last hop (relay) --> user
          encrypts packets --> encrypted        --> user decrypts packets then decrypts chunk

So the relay does not know the key to decrypt the chunk passing through him to the client, thus it doesn’t matter if he knows the keys to the chunk being watched for. All chunks he relays are encrypted with a key he does not know.

The client then decrypts the packets as they come, using a key the relay does not know, and then uses the self-encryption keys to decrypt the chunk.

Isn’t the attack that the relay node knows the self-encryption keys of the chunks being watched for, and when those chunks pass through the relay, it sees them and catches the user? David suggests, as I thought you did too, encrypting the packets so that the relay node cannot decrypt them, and the watching is defeated.
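Here is a small Python sketch of that flow, just to show why the relay's watching fails. The cipher is a toy (a SHA-256 keystream XORed over the chunk) and is not secure or taken from SAFE; `session_key` is a hypothetical key known to the second-to-last hop and the client but not to the relay. How such a key would actually be agreed is not covered in this thread.

```python
import hashlib
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def hop_encrypt(session_key: bytes, chunk: bytes) -> bytes:
    """Second-to-last hop: wrap the chunk with a fresh random nonce."""
    nonce = os.urandom(16)
    stream = keystream(session_key, nonce, len(chunk))
    return nonce + bytes(a ^ b for a, b in zip(chunk, stream))

def hop_decrypt(session_key: bytes, packet: bytes) -> bytes:
    """Client: strip the nonce and undo the XOR."""
    nonce, body = packet[:16], packet[16:]
    stream = keystream(session_key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

session_key = os.urandom(32)  # hypothetical: the relay never learns this
watched_chunk = b"self-encrypted chunk the relay is watching for"

# The same chunk sent twice looks different on the wire each time.
packet_1 = hop_encrypt(session_key, watched_chunk)
packet_2 = hop_encrypt(session_key, watched_chunk)

# The relay compares what it sees against the fingerprint it is watching for.
relay_sees_match = (
    hashlib.sha256(packet_1).digest() == hashlib.sha256(watched_chunk).digest()
)

print(relay_sees_match)
print(hop_decrypt(session_key, packet_1) == watched_chunk)
```

The relay only ever handles `packet_1` and `packet_2`, so knowing the self-encryption keys of the watched chunk does not help it recognise the chunk in transit.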

Or am I missing something???

I hope you’re not. I assume this means that the last hop (relay) has no way of determining the public key of the 2nd to last hop. If this is so, then I’m clear.

Now what about uploads and file requests? Are they too encrypted with the public key of the 2nd to last node? Can you answer with certainty?

What cases might it not be enough?

What is to be mitigated? What else can be done for this higher level of protection to be achieved? I’m very curious. :open_mouth:

Say you calculate there is a 1 in 10,000,000 chance that an attacker is in the route to a chunk you want (based on network density) and may know the content. In such cases a salt will provide the additional change to the file to make it unknown. So this just changes possibly known data to all unknown.

Really what I described. Much of security is not about the impossible, but the highly improbable or infeasible (like factoring the product of two large primes or guessing a private key).

If we are talking about snooping for known data, then the best way to prevent it completely is to not have known data, and using an agreed salt-type mechanism between folks is best. Or encrypt in an app with an agreed group key, etc.
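A quick Python sketch of the salt idea, under the assumption that a chunk's name is derived purely from its content (hashing stands in for the real self-encryption here). `salted_chunk_name` and `group_salt` are made-up names for illustration.

```python
import hashlib
import os

def salted_chunk_name(plaintext: bytes, salt: bytes = b"") -> str:
    """Content-derived name; an agreed salt changes both key and name."""
    key = hashlib.sha256(salt + plaintext).digest()
    return hashlib.sha256(key + plaintext).hexdigest()

known_file = b"publicly known content"
group_salt = os.urandom(32)  # hypothetical: agreed privately between folks

# What a snooper can compute from the known content alone.
public_name = salted_chunk_name(known_file)

# What actually goes on the network when the group salts its data.
salted_name = salted_chunk_name(known_file, group_salt)

print(public_name == salted_name)
print(salted_chunk_name(known_file, group_salt) == salted_name)
```

Anyone holding the salt still derives the same name, so sharing within the group keeps working, while a snooper who only knows the content is looking for a name that was never stored.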

I think though digging in here as we move forward is going to be good as we can point to the code and detail exactly what happens. At the moment there are pretty fast moving changes in over 20 libs so we need to get to a beta release and really poke at these parts. But it will always be easier to counter such snooping with non publicly knowable data.

TL;DR Say we imagine such a snoop: it would mean targeting a single node (keep in mind these nodes change keys every session) and may mean several thousand computers being used to try and get to a hop close to that node in the route of a known piece of data, hoping it requests it. So we think OK, this is an effort costing several million or hundreds of millions of dollars. Then we need to prove it was requested by that node (a different route) etc. etc. The end result here is that, with the resources, a snooper could say this XOR address was sent a piece of data we know is bad (using anonymous encryption, to defend the person sending its address), and so on.

If we want to be even further away from being detected then we can salt all our data we share amongst folk, or use private sharing instead.

I don’t understand why you’re all focusing on finding the owner of the data on the protocol level. This is not the point of the “learn the remaining information attack”.

The point of the “learn the remaining information attack” is that the set of possible plaintexts can actually be very small. The owner of the data can be known in advance. If one can derive the content of the encrypted chunks from the plaintext alone AND check whether that ciphertext exists on the SAFE network, then this attack is viable.
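A minimal Python sketch of the attack as I understand it, using the invoice example. Hashing stands in for convergent encryption, the "network" is just a set, and the invoice template, the party names and the `exists` check are all hypothetical.

```python
import hashlib

def chunk_name(plaintext: bytes) -> str:
    """Stand-in for convergent encryption: key and name depend only on content."""
    key = hashlib.sha256(plaintext).digest()
    return hashlib.sha256(key + plaintext).hexdigest()

def render_invoice(issuer: str, recipient: str, amount: int) -> bytes:
    """Rigid, guessable document structure (hypothetical template)."""
    return (
        f"<invoice><issuer>{issuer}</issuer>"
        f"<recipient>{recipient}</recipient>"
        f"<amount>{amount}</amount></invoice>"
    ).encode()

def find_stored_invoices(issuer, recipient, amounts, exists):
    """Try every candidate plaintext and keep the ones the network holds."""
    return [
        amount for amount in amounts
        if exists(chunk_name(render_invoice(issuer, recipient, amount)))
    ]

# The victim stores one invoice; the set stands in for the network.
network = {chunk_name(render_invoice("Acme", "Bob Ltd", 4200))}

def exists(name: str) -> bool:
    """Hypothetical 'is this ciphertext on the network?' probe."""
    return name in network

# The attacker knows both parties and only has to guess the amount.
hits = find_stored_invoices("Acme", "Bob Ltd", range(10000), exists)
print(hits)
```

The attacker never needs to know who holds the chunk or who asked for it. Ten thousand probes are enough to recover the one unknown field.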

Does this confirm “check if ciphertext exists on the SAFE network” part?

Yes, at least once.
One of the nice things about new forum members here is they never seem to hesitate to rehash an old topic.

@neo Excuse my ignorance here but you reference “fix” which demonstrates something required fixing. Is a fix required? or is the fix in? Or was a fix never required?

Seems the Storj boyz have their own feelings about this.

I would say something that can take the scheme further if needed. We can add a huge amount of entropy with various methods. So a roadmap of these is always good to keep at hand.

The difference between the Storj model and SAFE is pretty vast: we don’t and should not know who stores what or who asks for it. So, like all security elements, I think it cannot be compared like that with any honesty really. A secure algorithm is insecure if used incorrectly. This is why NaCl exists, as well as other crypto libs like the excellent cryptopp in C++. It’s how they are used and in what context that matters, as well as being implemented correctly at the algorithm level. So half the story is selecting the algo and the other half is where and how you will use it. Then the comparisons are valid. Well, I feel so anyway.

Thanks David for taking the time. I have no idea how to interpret that response but will assume it exceeds the expectations of those looking for answers. Your efforts are not wasted on me though as these posts will surely make it to other venues.

What do you mean by storing it privately? Not storing it on the SAFE network? Or using another encryption key to encrypt them before uploading onto the SAFE network?

What do you mean by “public data”? Does the SAFE network support unencrypted files?

I was thinking about use cases which would need encrypted and distributed version of DropBox.

Yes, I think it went to the moderation queue because the poster had a low rank (new user perhaps).

It was a fix for Tonda’s issue, rather than a fix to SAFE. A fix to the problem presented.

The size of the network would make the attack very difficult. Even just becoming the relay node for whoever you want to investigate would be extremely difficult.
