Self Encryption on the SAFE Network


If for the moment we disregard XOR’ing, it looks like access to all of the encrypted chunks and knowledge of just 1 key (i.e., the hash of any given chunk) will allow its decryption – key i will allow decryption of chunk i-1 which then allows us to compute key i-1 and decrypt chunk i-2 (and so on until the entire file is decrypted). Am I missing something?

Along with this, to encrypt a single chunk c_i, why not use a hash of the entire file or a hash of c_i as the key? I don’t see why this complex chaining process is needed.


I think so. It is the PRE-encryption hash that decrypts another chunk, not the post-encryption hash (the chunk name). The pre-encryption hash is a complete mystery to anyone who does not have the original data.


But isn’t it still true that, with knowledge of the plaintext of just 1 chunk (or alternatively the pre-encryption hash of that chunk) and the ciphertexts of all chunks, one can decrypt the entire file due to the chaining nature of the encryption?


No, I cannot see what you mean here. The ciphertext is not chained to create new encryption keys.


Right, so the ciphertext isn’t chained, but the hash of the previous block is used as a key to encrypt the next block. So, if you know the plaintext of any block b_i, then you can:
a) find the key for the next block b_{i+1} and
b) decrypt that next block if you have that next block’s ciphertext
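
The chaining argument in (a) and (b) can be sketched as follows. This is only an illustration of the simplified scheme under discussion (no XOR step), not the actual self-encryption code. A SHA-256 keystream stands in for AES so the example runs with the standard library alone, and the wrap-around keying of block 0 plus the names `keystream_xor`, `encrypt_chain` and `unravel` are assumptions made for the sketch.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for AES: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], pad))
    return bytes(out)

def encrypt_chain(blocks):
    """Encrypt block i under the hash of plaintext block i-1.

    Assumption: block 0 is keyed by the last block so every block has a neighbour.
    """
    n = len(blocks)
    return [
        keystream_xor(hashlib.sha256(blocks[(i - 1) % n]).digest(), blocks[i])
        for i in range(n)
    ]

def unravel(ciphertexts, known_index, known_plaintext):
    """From one known plaintext block, derive each next key and decrypt in turn."""
    n = len(ciphertexts)
    recovered = {known_index: known_plaintext}
    i = known_index
    for _ in range(n - 1):
        nxt = (i + 1) % n
        key = hashlib.sha256(recovered[i]).digest()             # step (a)
        recovered[nxt] = keystream_xor(key, ciphertexts[nxt])   # step (b)
        i = nxt
    return [recovered[j] for j in range(n)]

blocks = [b"alpha block", b"bravo block", b"charlie block", b"delta block"]
ciphertexts = encrypt_chain(blocks)
recovered_blocks = unravel(ciphertexts, 2, blocks[2])
```

In this toy version, knowing `blocks[2]` and all ciphertexts is enough to recover every block, which is the behaviour the question describes for the scheme without its XOR step.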


Yes (we are still ignoring XOR here though). So if we never used the whole algorithm then this would work as you say. If you read the paper on vaults and attacks you can see a description of related key type attacks if this is what you are looking for anyway. This may help a bit


Ah, thanks for the link – it does describe something similar (regarding known plaintext attacks).

It raises the question, though: why not include block b_i’s own hash when computing its AES key? This would essentially mean that to decrypt a block, you would have to know its contents, rather than (in the current situation) just having to know its neighboring blocks.

Going even further, why not just encrypt a block with its own hash or a hash of the entire file?


This is essentially where the XOR part comes in: it introduces the obfuscation, and the current chunk’s hash is also used there. It provides a similar approach but prevents any link from the hash of the content to AES, although we do not know of any such link in today’s maths.


Side question: is deduplication happening on the level of files or chunks?


Continuing on the XOR discussion:

  1. I don’t see where the current chunk’s hash is used to encrypt or “obfuscate” in the video or the link provided by janitor
  2. At best, the XOR is a defense against some future (and unlikely) vulnerability in AES. At worst, it may introduce its own vulnerabilities. In any case, the security of the system shouldn’t rest on the XOR, especially given the fact that the inputs to the XOR may all have low entropy.

Regardless, potential interaction (which, as you’re saying, is not publicly known) between AES and SHA can be resolved in simpler ways than resorting to a complicated neighboring hash/XOR scheme.


Chunks :smile:


1: It’s an overview video so not 100% accurate. The code is 100% accurate :wink:

2: XOR is actually what gives perfect secrecy in a one-time pad implementation, if you XOR the data with a single-use pad of truly random data. Not to be overlooked, IMHO

The neighbor HASH mechanism is not really all that complex I feel. Of course there may be simpler ways and always happy to look at them.

  1. I’ll give you the benefit of the doubt (even though the system doc should reflect the actual high-level algorithm)
  2. XOR gives perfect secrecy if the key is random and used only 1 time, neither of which is true in this scheme. It could well be the opposite, that the XOR introduces a new vulnerability (because you’re reusing the keys to the “pad”, which can have catastrophic consequences)
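
For readers unfamiliar with the pad-reuse hazard mentioned in point 2, here is a minimal generic sketch. It shows what goes wrong when any XOR pad is used twice; whether the self-encryption scheme actually reuses a pad is exactly what is disputed in this thread, so this is not a demonstrated attack on it. The pad value and messages are made up for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

pad = bytes.fromhex("9f3c17a4e05b6d28c1f2")  # hypothetical 10-byte pad, reused below
p1 = b"attack now"
p2 = b"retreat!!!"

c1 = xor_bytes(p1, pad)
c2 = xor_bytes(p2, pad)

# The pad cancels out: XOR of the ciphertexts equals XOR of the plaintexts.
leak = xor_bytes(c1, c2)

# Anyone who knows (or guesses) p1 can now read p2 without ever seeing the pad.
recovered_p2 = xor_bytes(leak, p1)
```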

Regarding a less complex scheme, here’s some ideas:

  1. To encrypt chunk c_i, use SHA(c_i) as the AES key.
  2. If you’re worried about interaction between SHA and AES, use [IV XOR SHA(c_i)] as the key where IV is a nonce that will remain with the ciphertext in the clear. That way, the key to AES is random, but obtainable iff someone knows SHA(c_i). [#2 is a scheme I just thought up, but I see no problems with it]
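
The two key derivations proposed in this list can be sketched as below. This only shows how the keys would be computed, not a full encryption scheme, and it makes no claim about their security. SHA-256 is assumed for "SHA" (the post does not say which variant), the IV is assumed to be 32 random bytes so it matches the digest length, and the function names are hypothetical.

```python
import hashlib
import os

def key_from_chunk(chunk: bytes) -> bytes:
    """Proposal 1: the AES key is simply SHA(c_i)."""
    return hashlib.sha256(chunk).digest()

def key_from_chunk_and_iv(chunk: bytes, iv: bytes) -> bytes:
    """Proposal 2: the AES key is IV XOR SHA(c_i); the IV is stored in the clear."""
    digest = hashlib.sha256(chunk).digest()
    if len(iv) != len(digest):
        raise ValueError("IV must be the same length as the digest")
    return bytes(a ^ b for a, b in zip(iv, digest))

chunk = b"example chunk contents"
iv = os.urandom(32)  # nonce kept alongside the ciphertext

k1 = key_from_chunk(chunk)
k2 = key_from_chunk_and_iv(chunk, iv)
```

Given the public IV, `k2` can be recomputed by anyone who knows SHA(c_i) and by no one else, which is the "obtainable iff" property the proposal describes.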

More generally, the burden of proof is on the creator of a new scheme to show it is secure. To even believe that XORing the output doesn’t introduce a new vulnerability (let alone achieves the security goals it claims), I need some sort of proof. I say it’s complex because I don’t see a trivial proof of security.


I may be picking you up incorrectly here so apologies, but I am pushed for time as usual.

We all do; this is why algorithms are trusted over time. It’s good you just thought up some ideas, I am sure they will be of use to people as another mechanism to check and perhaps implement.

In terms of the key not being one time, then perhaps look again, it is one time (the key is the data, that is why the AES happens). This is explained in the system docs.

In terms of random, then it’s an infinity, zero, edge-of-the-universe thing: what is random, and is there any such thing? Who knows. It’s hard to prove; you can use chi-squared tests etc., but even then true random is like infinity perhaps. As I say, who knows. Glad you are looking though.

We should do some reward for decrypting a chunk perhaps ? Like the DES competitions etc. it all helps.

In terms of burden of proof, let’s see how that goes. Are you saying I need to answer every single person who does what you just did and who does not want detail? The math for hash, XOR and AES is readily available.

I do not need to explain the metallurgy of a nail for you to pop it one with a big hammer. I think that is a burden somebody else can carry :slight_smile: I am kinda busy right now, but I doubt I have found a way to make AES weaker :wink:

We are working on systemdocs so perhaps that will help, but cheers for the benefit of the doubt comma but :wink:


The goal of a high level algorithm (like self-encryption) should be to reduce its security to crypto primitives. It’s true that you need to take a leap of faith for the primitives, but that’s because we can’t do any better.

Okay, looking at the system doc, the key is one time (in the video it was though…). That said, the keys are not random and are related (since common seeding is going into each), both of which rule out their use in a one-time pad.

Your comments about randomness are misdirected. I’m saying that using a chunk of user data as seed to generate a key is a bad idea because that chunk can have very low entropy. I’m not saying anything about randomness being unattainable, just that user data is horrible for the job and that there are much better sources.
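
The low-entropy concern can be illustrated with a small sketch. It assumes the key is derived as SHA-256 of the chunk (an assumption for the example) and that the attacker has some way to test a guess; the `guess_is_right` oracle below is a hypothetical stand-in for trial decryption. The chunk contents and candidate format are invented.

```python
import hashlib

def derive_key(chunk: bytes) -> bytes:
    """Assumed derivation for this example: key = SHA-256(chunk)."""
    return hashlib.sha256(chunk).digest()

secret_chunk = b"PIN=4821"           # made-up, highly guessable content
real_key = derive_key(secret_chunk)

def guess_is_right(candidate_key: bytes) -> bool:
    """Hypothetical oracle standing in for 'trial decryption succeeded'."""
    return candidate_key == real_key

def search():
    """Enumerate every 4-digit PIN chunk until the derived key checks out."""
    for attempts, pin in enumerate(range(10_000), start=1):
        guess = f"PIN={pin:04d}".encode()
        if guess_is_right(derive_key(guess)):
            return guess, attempts
    return None, 10_000

found, attempts = search()
```

The key is 256 bits long, but the search space here is only 10,000 candidates, because the key can be no harder to guess than the data that seeds it.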

As I said from the beginning, I’m not talking about the security of crypto primitives or accepted standards, I’m talking about the self encryption scheme you’re using, specifically where it strays from those standards. The implicit assumption through all of this is that AES, SHA, etc. are secure (and if anyone is doubting that these are secure, it’s you ;).

Your comments make me think that you are deliberately misrepresenting my concerns and making straw men for yourself. It’s a pity because I’m just trying to help.


No, it’s not. I think you are missing the point here somewhere.


Can you re-read your first sentence then :smiley:

I just found how to quote replies there now, cool. I am trying very hard to answer a question that wants neither high-level, simple nor detailed answers. You must see that it is very difficult to answer exactly at the level you are looking for.

If you are trying to help then please do, but your tone is a little abrasive and a little condescending. It’s much better to be more accurate and state exactly what you are trying to achieve. If you think it is broken then show where.

If you think it can be more efficient then show us.

If you think it just does not work, then please shout.

There have been many, many very experienced people, including professors of cryptography, looking at and questioning this library, and all appear to agree it is extremely robust. You seem to be stating you don’t really know what it is, and then, without much thought (as you say, not that it’s wrong), you create another version etc. It is hard to see how that helps. If you can see improvements, then please provide a formal spec for them, if that is how you like to work. I may learn by reading that (I am not being facetious, I might).

So probably best to drop the straw man accusation approach to a discussion and actually get involved. If anyone thinks we have 100% complete, 100% efficient and 100% secure algorithms they would be wrong. There is always room for improvement. The approach needs to be correct though.


Uh, the diagram here shows that for chunk n, the output of its AES is XORed with some function of chunks n-1, n, and n+1. For chunk n+1, it would then be n, n+1, and n+2. That’s common seeding to me…


I’m not sure what sentence you’re talking about. Do you mean:

Okay, I mentioned crypto primitives only to say that I’m not concerning myself with their security and taking it for granted. I fail to see your point.


It’s a sliding window. If you consider the data the pad, as stated, then it would not matter if the hashes were all exactly the same if AES produced random data. I do not think you are getting this algorithm at all. Chunk n for all n will have a different n-1, if that helps.
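
The window being argued over in the last few posts can be sketched with chunk indices alone. This only models which neighbours feed the XOR step for each chunk as described from the diagram (n-1, n, n+1); the wrap-around at the ends and the `window` helper are assumptions made for the example, and nothing here models the hashing or AES.

```python
def window(n: int, total: int) -> tuple:
    """Indices whose hashes feed the XOR step for chunk n (wrap-around assumed)."""
    return ((n - 1) % total, n, (n + 1) % total)

total = 6
windows = [window(n, total) for n in range(total)]

# Inputs shared between each chunk's window and the next chunk's window.
shared = [
    sorted(set(windows[n]) & set(windows[n + 1]))
    for n in range(total - 1)
]

# True when no two chunks use exactly the same window.
all_windows_distinct = len(set(windows)) == total
```

Both observations from the thread show up here: adjacent windows overlap in two of their three inputs (the "common seeding" point), and yet no two chunks have an identical window (the "sliding window" point).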