A node could use header information, a self-encryption signature, and Shannon’s entropy equation to statistically determine whether a chunk is random encrypted data. If it’s not, then a good node would reject it with a message like, “encrypt it pal, and then try again.” I agree that it is good practice for the nodes to encrypt the chunk once again regardless. The reason for rejecting a chunk that has been detected as unencrypted is to train clients in proper upload practices. This way, the chance that a “bad node” would ever come across unencrypted data is very low to nonexistent. This also protects against potential malware that disables encryption in the client without the user knowing.
I was under the same impression, until a few years back when we tested this.
Unfortunately, the entropy of an encrypted file and of a JPEG, for instance, is almost the same. As data formats have become more efficient, files have gotten smaller with higher entropy.
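A quick sketch shows why this test struggles in practice. Below, uniformly random bytes stand in for a good cipher’s output, and zlib output stands in for an efficient modern format like JPEG (an assumption for illustration, not a claim about any specific codec) — their Shannon entropies come out nearly identical.

```python
import math
import os
import zlib
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: H = -sum(p_i * log2(p_i))."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Stand-in for encrypted data: a good cipher's output is
# indistinguishable from uniformly random bytes.
encrypted_like = os.urandom(64 * 1024)

# Stand-in for an efficient format: deflate output of mostly
# incompressible input is also packed near 8 bits per byte.
compressed = zlib.compress(os.urandom(32 * 1024) * 2, 9)

h_enc = shannon_entropy(encrypted_like)
h_cmp = shannon_entropy(compressed)
# Both land near the 8 bits/byte maximum; a simple entropy
# threshold cannot reliably tell them apart.
```

A node thresholding on `shannon_entropy` would reject or accept both inputs together, which is the problem the thread is describing.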
The final conclusions of this study raise many doubts about the possibility of distinguishing whether a chunk is encrypted or not and, in the end, it confirms what David is saying.
From the research performed and the results achieved, it is clear that the use of pure mathematical techniques such as Shannon entropy, Chi-square and Monte Carlo calculations as an indicator for crypto-ransomware encrypted files is not ideal. Especially when applying these calculations to a broad and modern data set, such as NapierOne, it can be seen that these calculations struggle to differentiate consistently between encrypted files and other high entropy files such as compressed or archived files.
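The chi-square test the quote mentions runs into the same wall. A minimal sketch (again using random bytes and deflate output as stand-ins for encrypted and efficiently compressed data): both samples produce a statistic in the same ballpark, so a fixed cutoff can’t separate them.

```python
import os
import zlib
from collections import Counter

def chi_square_uniform(data: bytes) -> float:
    """Chi-square statistic of byte counts against a uniform distribution
    over 256 symbols. For truly random bytes this hovers around 255
    (the degrees of freedom)."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(256))

random_like = os.urandom(64 * 1024)                       # cipher-output stand-in
compressed = zlib.compress(os.urandom(32 * 1024) * 2, 9)  # efficient-format stand-in

chi_random = chi_square_uniform(random_like)
chi_compressed = chi_square_uniform(compressed)
# Both statistics sit in the same region, so neither "looks more
# encrypted" than the other to this test.
```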
Nice. This article gives a good example of such tests. There is a difference in entropy, though a small one. I agree it’s a “not ideal” method, though.
I suppose we interpreted these conclusions differently. To me, “not ideal” is not the same as “not possible”.
Regardless, the situation for uploading to the safe network is a bit different. One could place the burden on the client to provide a “proof of encryption”. Perhaps a zero-knowledge proof of some sort, or some other mechanism, could be crafted. I may ruminate on this a bit more.
I found this discussion on a stackexchange that discussed exactly what we were talking about. One of the answers describes how a zero knowledge proof could be used to prove that a ciphertext was encrypted with a particular public key. This would allow nodes to enforce that all chunks are encrypted by the client before upload.
Now if I encrypt the message (maybe a jpg or binary file) with my encrypt key, then decrypt it with my decrypt key, will the test see the message as encrypted if I send the original with the decrypt key (which I call my public key)? What then?
The ZK thing is interesting for sure. However, a malicious client could store lots of “bad” content all over the network and then publish the keys and chunk identifiers far and wide, including immutably on the safe network.
As a result, those chunks have effectively been blacklisted, and node operators could plausibly know their contents. This then provides a hook for authorities to demand that node operators include code that filters out such blacklisted chunks. Precedent, slippery slope, bad PR, etc.
Still waiting for them to ban storing the full bitcoin database, with the illegal files reportedly stored in the data areas of its transactions.
Not sure. You are so crafty, neo.
We’ve discussed how chunks can be encrypted again (was it by elders back then?) prior to being sent to the nodes. The network architecture has changed now, so who knows… it might not protect against your scenario though. Seems like the more that clients are required to do, the better. It places responsibility on the user/client rather than the node operator.
Hi neo, thinking about this more, I think we could defend against the scenario you described. It would require two ZK proofs:
- Client provides an encrypted chunk, a public key, and a zk proof that the chunk was encrypted with that key.
- Client provides a second proof that the public key was derived from a suitable private key (DSA, ECDSA, etc.).
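As a rough sketch of what those two proofs could look like, here is a toy built from schoolbook ElGamal and Fiat–Shamir Schnorr proofs. Both proofs are instances of the same Sigma protocol: proof of knowledge of an exponent. Everything here (the Mersenne-prime modulus, the base, SHA-256 as the random oracle) is illustrative only, not a secure or production parameter choice, and proving that a real symmetric cipher was applied would need much heavier machinery.

```python
import hashlib
import secrets

# Toy parameters -- illustrative only, far from a secure choice.
# Exponents live modulo P - 1 (Fermat's little theorem).
P = 2**127 - 1   # a Mersenne prime modulus
Q = P - 1        # exponent modulus
G = 3            # fixed public base

def _challenge(*parts: int) -> int:
    """Fiat-Shamir challenge: hash the transcript into an exponent."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.to_bytes(32, "big"))
    return int.from_bytes(h.digest(), "big") % Q

def schnorr_prove(secret: int, public: int) -> tuple[int, int]:
    """Prove knowledge of `secret` where public = G**secret mod P."""
    k = secrets.randbelow(Q)
    commitment = pow(G, k, P)
    e = _challenge(public, commitment)
    s = (k + e * secret) % Q
    return commitment, s

def schnorr_verify(public: int, proof: tuple[int, int]) -> bool:
    commitment, s = proof
    e = _challenge(public, commitment)
    return pow(G, s, P) == (commitment * pow(public, e, P)) % P

# Proof 2: the uploader knows the private key behind `pk`.
sk = secrets.randbelow(Q)
pk = pow(G, sk, P)
key_proof = schnorr_prove(sk, pk)

# Proof 1: the chunk is a well-formed ElGamal ciphertext under `pk`.
# ElGamal: c1 = G**r, c2 = m * pk**r.  Proving knowledge of r for c1
# shows the ciphertext was honestly formed; the node never learns m.
m = int.from_bytes(b"chunk bytes here", "big") % P
r = secrets.randbelow(Q)
c1 = pow(G, r, P)
c2 = (m * pow(pk, r, P)) % P
enc_proof = schnorr_prove(r, c1)

# A node would accept the upload only if both proofs verify.
ok = schnorr_verify(pk, key_proof) and schnorr_verify(c1, enc_proof)
```

Verification is just two modular exponentiations per proof, so the node-side cost stays small even though the encryption itself is asymmetric.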
But this requires the use of asymmetric encryption, which is much slower than symmetric encryption.
The benefit of not requiring nodes to encrypt anything, while also having proof that anything which enters the network has already been encrypted, would be worth it. Presuming the cost of proof verification is small… again, it’s the safe network, not the speed network.