What about nodes sniffing data?

Because of the de-duplication design, an attacker who wanted to identify people who retrieved a known document could calculate the hash of each chunk in that document. Any request for the document should involve a request for the hash of each chunk. If the attacker could map an IP to a GET request containing the hash of a single chunk, the attacker could prove the IP requested the entire document. This is more likely to happen with public or shared documents, and could be an issue for those wishing to create “whistleblower” functionality on the network.
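
To make that attack concrete, here is a minimal sketch. It assumes fixed-size chunks that are named by the SHA-512 of their contents, which is a simplification (I have not checked how the real chunking and naming work), and `CHUNK_SIZE`, `chunk_hashes` and `ips_requesting_document` are names I made up for illustration:

```python
import hashlib

# Hypothetical chunk size; the real network's chunking rules may differ.
CHUNK_SIZE = 1024 * 1024


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> set:
    """Return the set of hex digests, one per fixed-size chunk of `data`.

    Assumption: a chunk is addressed by the SHA-512 of its bytes.
    """
    return {
        hashlib.sha512(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def ips_requesting_document(observed, known_hashes) -> set:
    """Return the IPs seen requesting any chunk of the known document.

    `observed` is an iterable of (ip, requested_hash) pairs, i.e. exactly
    the IP <--> GET mapping the attacker would need to obtain.
    """
    return {ip for ip, requested in observed if requested in known_hashes}
```

The point is that the hash calculation is trivial for anyone holding the document. The hard part for the attacker is getting the `observed` pairs in the first place.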

I think it’s also worth mentioning that nearly all messages between peers are encrypted, including GET requests for documents. I think some messages in the initial connection phase cannot be encrypted, but that should be the only exception (I haven’t scoured the initiation code much at all). Encrypting all messages between peers prevents attackers from sniffing on the edge network in routers, etc., like @Blindsite2k mentioned. So an attacker in that position cannot map GET <--> safe_id or safe_id <--> IP, since the messages are opaque encrypted blobs.

The next question is whether any SAFE node could easily map IP <--> GET request. Based on the design, the GET messages could be encrypted to the DMs, which means any intermediate nodes (including nodes with direct connections to the client making the request) would not see the GET request. Since the DM is also unlikely to have a direct connection to the client, it is unlikely to know the IP. Thus an attacker would have to control both a peer of the targeted client and a DM of the file being requested. This would be harder to pull off, but the probability of success should increase as the file gets larger, because more chunks are stored and each chunk is another chance for the attacker to control its DM. Unfortunately, the code appears to be sending GET requests directly to the closest match in its routing table, which means the DM is directly connected to the client, allowing it to map IP <--> GET. However, I’ve only grokked a small portion of the code thus far, so I could’ve easily missed something.
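
As a rough illustration of why larger files help the attacker, here is a toy model. It assumes the attacker controls a fraction `f` of all nodes, that each chunk is handled by `group_size` DMs, and that every DM slot is an independent draw from the network. None of those numbers or names come from the code; they are assumptions for the sketch:

```python
def p_controls_some_dm(f: float, n_chunks: int, group_size: int = 4) -> float:
    """Probability the attacker controls at least one DM of at least one chunk.

    Toy model: each of the n_chunks * group_size DM slots is independently
    attacker-controlled with probability f. group_size=4 is a placeholder,
    not a value taken from the network.
    """
    return 1.0 - (1.0 - f) ** (n_chunks * group_size)


def p_controls_peer_and_dm(f: float, n_chunks: int, n_peers: int,
                           group_size: int = 4) -> float:
    """Probability of controlling both a direct peer of the client and a DM.

    Treats the two events as independent, which is itself an assumption.
    """
    p_peer = 1.0 - (1.0 - f) ** n_peers
    return p_peer * p_controls_some_dm(f, n_chunks, group_size)
```

Under this model the DM term climbs toward 1 as `n_chunks` grows, so for a large file the limiting factor becomes whether the attacker also holds one of the client’s direct peers.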

This network should be able to provide the privacy and anonymization that Tor provides. More comparative analysis will need to be done, though, so that issues Tor had to correct aren’t duplicated in this network.