Self Encryption on the SAFE Network

joas · September 14, 2014, 6:06pm

If for the moment we disregard XOR’ing, it looks like access to all of the encrypted chunks and knowledge of just 1 key (i.e., the hash of any given chunk) will allow its decryption – key i will allow decryption of chunk i-1 which then allows us to compute key i-1 and decrypt chunk i-2 (and so on until the entire file is decrypted). Am I missing something?

Along with this, to encrypt a single chunk c_i, why not use a hash of the entire file or a hash of c_i as the key? I don’t see why this complex chaining process is needed.

dirvine · September 14, 2014, 6:22pm

I think so. The PRE encryption hash decrypts another chunk, not the post encryption (the chunk name). The pre encryption hash is a complete mystery to anyone who does not have the original data.

joas · September 14, 2014, 6:46pm

But isn’t it still true that, with knowledge of the plaintext of just 1 chunk (or alternatively the pre-encryption hash of that chunk) and the ciphertexts of all chunks, one can decrypt the entire file due to the chaining nature of the encryption?

dirvine · September 14, 2014, 6:57pm

No, I cannot see what you mean here. The ciphertext is not chained to create new encryption keys.

joas · September 15, 2014, 1:04am

Right, so the ciphertext isn’t chained, but the hash of the previous block is used as a key to encrypt the next block. So, if you know the plaintext of any block b_i, then you can:
a) find the key for the next block b_i+1 and
b) decrypt that next block if you have that next block’s ciphertext
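
The walk in (a) and (b) can be sketched as follows. This is an illustration under stated assumptions, not the self_encryption code: the XOR step is ignored (as in the discussion so far), each chunk is assumed to be keyed by the SHA-256 hash of the previous chunk's plaintext with the first chunk wrapping around to the last, and `keystream_xor` is a hypothetical stand-in for AES built from SHA-256 so that the example needs only the standard library.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for AES (assumption): XOR data with a SHA-256 counter keystream.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt_chunks(chunks):
    # Assumed rule: chunk i is keyed by the pre-encryption hash of chunk i-1.
    # For i == 0, index -1 wraps around to the last chunk (also an assumption).
    return [
        keystream_xor(hashlib.sha256(chunks[i - 1]).digest(), chunks[i])
        for i in range(len(chunks))
    ]


def recover_from_one_plaintext(ciphertexts, known_index, known_plaintext):
    """Start from one known plaintext chunk and decrypt each following chunk."""
    total = len(ciphertexts)
    recovered = {known_index: known_plaintext}
    current = known_index
    for _ in range(total - 1):
        following = (current + 1) % total
        key = hashlib.sha256(recovered[current]).digest()
        recovered[following] = keystream_xor(key, ciphertexts[following])
        current = following
    return [recovered[i] for i in range(total)]


chunks = [b"chunk zero data", b"chunk one data!", b"chunk two data.", b"chunk three"]
ciphertexts = encrypt_chunks(chunks)
recovered = recover_from_one_plaintext(ciphertexts, 1, chunks[1])
```

Under these assumptions, one known plaintext chunk plus all ciphertexts is enough to rebuild every chunk. Whether the full algorithm, with the XOR step included, behaves the same way is the question the rest of the thread argues about.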

dirvine · September 15, 2014, 1:10am

Yes (we are still ignoring XOR here though). So if we never used the whole algorithm then this would work as you say. If you read the paper on vaults and attacks you can see a description of related-key type attacks, if this is what you are looking for anyway. This may help a bit: University of Strathclyde

joas · September 15, 2014, 1:35am

Ah, thanks for the link – it does describe something similar (regarding known plaintext attacks).

It begs the question though, why not include block b_i's own hash when computing its AES key? This would essentially mean that to decrypt a block, you would have to know its contents, rather than (in the current situation) just having to know its neighboring blocks.

Going even further, why not just encrypt a block with its own hash or a hash of the entire file?

dirvine · September 15, 2014, 1:37am

This is essentially where the XOR part comes in: it introduces the obfuscation, and the current chunk’s hash is also used there. It provides a similar approach but prevents any link from hash of content to AES, although we do not know of any such links in today’s maths.

joas · September 15, 2014, 2:35am

Side question: is deduplication happening on the level of files or chunks?

joas · September 15, 2014, 2:41am

Continuing on the XOR discussion:

1. I don’t see where the current chunk’s hash is used to encrypt or “obfuscate” in the video or the link provided by janitor.
2. At best, the XOR is a defense against some future (and unlikely) vulnerability in AES. At worst, it may introduce its own vulnerabilities. In any case, the security of the system shouldn’t rest on the XOR, especially given the fact that the inputs to the XOR may all have low entropy.

Regardless, potential interaction (which, as you’re saying, is not publicly known) between AES and SHA can be resolved in simpler ways than resorting to a complicated neighboring hash/XOR scheme.

dirvine · September 15, 2014, 8:34am

Chunks

dirvine · September 15, 2014, 8:38am

1: It’s an overview video, so not 100% accurate. The code is 100% accurate.

2: XOR is actually what gives perfect secrecy in a One Time Pad implementation, if you XOR data with a one time use PAD of true random data, not to be overlooked IMHO

The neighbor HASH mechanism is not really all that complex I feel. Of course there may be simpler ways and always happy to look at them.

joas · September 15, 2014, 5:59pm

1. I’ll give you the benefit of the doubt (even though the system doc should reflect the actual high-level algorithm).
2. XOR gives perfect secrecy if the key is random and used only 1 time, neither of which is true in this scheme. It could well be the opposite: the XOR may introduce a new vulnerability (because you’re reusing the keys to the “pad”, which can have catastrophic consequences).
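
The general pad-reuse property mentioned here can be shown in a few lines. This is a generic one-time-pad sketch, not a claim about what self-encryption does: the messages and the helper `xor_bytes` are made up for illustration.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
pad = os.urandom(len(p1))  # truly random pad, same length as the messages

c1 = xor_bytes(p1, pad)
c2 = xor_bytes(p2, pad)  # the same pad used a second time

# The pad cancels out: an eavesdropper holding only c1 and c2 learns p1 XOR p2.
leak = xor_bytes(c1, c2)
```

With `leak` in hand, knowing or guessing either message reveals the other, which is why the pad has to be both random and used exactly once.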

Regarding a less complex scheme, here’s some ideas:

1. To encrypt chunk c_i, use SHA(c_i) as the AES key.
2. If you’re worried about interaction between SHA and AES, use [IV XOR SHA(c_i)] as the key, where IV is a nonce that will remain with the ciphertext in the clear. That way, the key to AES is random, but obtainable iff someone knows SHA(c_i). [#2 is a scheme I just thought up, but I see no problems with it.]
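
A minimal sketch of the two key derivations proposed above, assuming SHA-256 for "SHA" and a 32-byte nonce so it lines up with the digest. The function names are hypothetical, and only the key derivation is shown, not the AES step.

```python
import hashlib
import os


def key_from_chunk(chunk: bytes) -> bytes:
    """Proposal 1: the key is the hash of the chunk itself."""
    return hashlib.sha256(chunk).digest()


def key_from_chunk_and_nonce(chunk: bytes, iv: bytes) -> bytes:
    """Proposal 2: the key is IV XOR SHA(c_i); iv is stored in the clear."""
    digest = hashlib.sha256(chunk).digest()
    return bytes(d ^ v for d, v in zip(digest, iv))


chunk = b"example chunk contents"
iv = os.urandom(32)  # nonce kept alongside the ciphertext

key_a = key_from_chunk(chunk)
key_b = key_from_chunk_and_nonce(chunk, iv)

# Someone who knows only SHA(c_i) and reads the public iv can rebuild key_b.
known_hash = hashlib.sha256(chunk).digest()
rebuilt = bytes(d ^ v for d, v in zip(known_hash, iv))
```

One design consequence worth noting: with a fresh random nonce per upload, two copies of the same chunk get different keys and so different ciphertexts, which would get in the way of deduplicating by ciphertext unless the nonce were chosen deterministically.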

More generally, the burden of proof is on the creator of a new scheme to show it is secure. To even believe that XORing the output doesn’t introduce a new vulnerability (let alone achieves the security goals it claims), I need some sort of proof. I say it’s complex because I don’t see a trivial proof of security.

dirvine · September 15, 2014, 6:14pm

I may be picking you up incorrectly here so apologies, but I am pushed for time as usual.

We all do; this is why algorithms are trusted over time. It’s good you just thought up some ideas, I am sure they will be of use to people as another mechanism to check and perhaps implement.

In terms of the key not being one time, then perhaps look again, it is one time (the key is the data, that is why the AES happens). This is explained in the system docs.

In terms of random, then it’s an infinity, zero, edge-of-universe thing: what is random, and is there any such thing? Who knows, it’s hard to prove, you can use chi squared tests etc. but even then true random is like infinity perhaps. As I say, who knows. Glad you are looking though.

We should do some reward for decrypting a chunk perhaps? Like the DES competitions etc. It all helps.

In terms of burden of proof, let’s see how that goes. Are you saying I need to answer every single person who does what you just did and who does not want detail? The math for hash, XOR and AES is readily available.

I do not need to explain the metallurgy of a nail for you to pop it one with a big hammer. I think that is a burden somebody else can carry, I am kinda busy right now, but I doubt I have found a way to make AES weaker.

We are working on systemdocs so perhaps that will help, but cheers for the benefit of the doubt comma but

joas · September 15, 2014, 7:16pm

The goal of a high level algorithm (like self-encryption) should be to reduce its security to crypto primitives. It’s true that you need to take a leap of faith for the primitives, but that’s because we can’t do any better.

Okay, looking at the system doc, the key is one time (in the video it wasn’t, though…). That said, the keys are not random and are related (since common seeding is going into each), both of which rule out their use in a one-time pad.

Your comments about randomness are misdirected. I’m saying that using a chunk of user data as seed to generate a key is a bad idea because that chunk can have very low entropy. I’m not saying anything about randomness being unattainable, just that user data is horrible for the job and that there are much better sources.
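
To make the low-entropy concern concrete, here is a sketch of a guess-and-confirm attack against a generic scheme where the key depends only on the chunk's own contents. It is not the self-encryption algorithm: `toy_encrypt` is a hypothetical SHA-256 keystream standing in for AES, and the "PIN record" chunk format is invented for the example.

```python
import hashlib


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Stand-in for a real cipher (assumption): XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt_with_content_key(chunk: bytes) -> bytes:
    # The key is derived from the chunk alone, so it has no more entropy than the chunk.
    return toy_encrypt(hashlib.sha256(chunk).digest(), chunk)


def brute_force(observed: bytes):
    """Try every candidate chunk and keep the one that encrypts to the observed bytes."""
    for pin in range(10_000):
        candidate = f"PIN={pin:04d}".encode()
        if encrypt_with_content_key(candidate) == observed:
            return candidate
    return None


secret = b"PIN=4821"  # only 10,000 possible values
observed = encrypt_with_content_key(secret)
guess = brute_force(observed)
```

The cipher strength is irrelevant here: the search space is the set of plausible chunks, not the key space. A key drawn from a proper random source would not have this problem, at the cost of no longer being derivable from the data.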

As I said from the beginning, I’m not talking about the security of crypto primitives or accepted standards, I’m talking about the self encryption scheme you’re using, specifically where it strays from those standards. The implicit assumption through all of this is that AES, SHA, etc. are secure (and if anyone is doubting that these are secure, it’s you ;).

Your comments make me think that you are deliberately misrepresenting my concerns and making straw men for yourself. It’s a pity because I’m just trying to help.

dirvine · September 15, 2014, 7:35pm

No, it’s not. I think you are missing the point here somewhere.

dirvine · September 15, 2014, 7:36pm

Can you re-read your first sentence then

I just found how to quote replies there now, cool. I am trying very hard to answer a question that wishes neither high level, simple nor detailed answers. You must see it is very difficult to answer exactly at the level you are looking for.

If you are trying to help then please do, but your tone is a little abrasive and a little condescending. It’s much better to be more accurate and state exactly what you are trying to achieve. If you think it broken then show where.

If you think it can be more efficient then show us.

If you think it just does not work, then please shout.

There have been many, many very experienced people, including professors of cryptography, looking at and questioning this library, and all appear to agree it is extremely robust. You seem to be stating you don’t really know what it is, then without much thought (you say, not that it’s wrong) create another version etc. It is hard to see how that helps. If you can see improvements then please provide a formal spec for improvements, if that is how you like to work. I may learn by reading that (I am not being facetious, I might).

So probably best to drop the straw man accusation approach to a discussion and actually get involved. If anyone thinks we have 100% complete, 100% efficient and 100% secure algorithms they would be wrong. There is always room for improvement. The approach needs to be correct though.

joas · September 15, 2014, 7:41pm

Uh, the diagram here shows that for chunk n, the output of its AES is XORed with some function of chunks n-1, n, and n+1. For chunk n+1, it would then be n, n+1, and n+2. That’s common seeding to me…
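
The overlap being described can be listed explicitly. This sketch only tracks which chunk indices feed the XOR input for each chunk, as read from the description above; the wrap-around at the ends and the helper name `window` are assumptions.

```python
def window(n: int, total: int) -> set:
    """Chunk indices assumed to feed the XOR input for chunk n: n-1, n, n+1."""
    return {(n - 1) % total, n % total, (n + 1) % total}


total = 6

# For each chunk n, the indices its window shares with the window of chunk n+1.
shared = {
    n: window(n, total) & window((n + 1) % total, total)
    for n in range(total)
}
```

For example, `window(2, 6)` is `{1, 2, 3}` and `window(3, 6)` is `{2, 3, 4}`, so the two share chunks 2 and 3. Each window is still distinct as a whole, which is the sliding-window point made in the reply that follows.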

joas · September 15, 2014, 7:46pm

I’m not sure what sentence you’re talking about. Do you mean:

Okay, I mentioned crypto primitives only to say that I’m not concerning myself with their security and taking it for granted. I fail to see your point.

dirvine · September 15, 2014, 7:47pm

It’s a sliding window. If you consider the data the pad, as stated, then it would not matter if the hashes were all exactly the same, if AES produced random data. I do not think you are getting this algorithm at all. Chunk n for all n will have a different n-1, if that helps.
