A discussion on the Privacy of the SAFE Network

privacy

#1

Hello everyone! I am currently in the midst of writing my dissertation for the SAFE Wiki project and in a small section I am discussing user privacy.

The privacy of users is something that has been on my mind and I would like to start a discussion with you all to hear your insights and comments. As far as I understand it, user privacy is a feature of the network. Through Routing/Crust, the only nodes that ever receive your IP address are the Proxy Nodes. Deeper into the network, all vaults know is the XOR Address of the account accessing/manipulating the data. Please correct me if I am wrong.

This abstraction does provide some level of privacy to a user. However, if you manage to link an XOR Account Address with an IP address (through malware or a Proxy Node acting as a ‘bad party’, etc.) then I assume the anonymity of that individual starts to break down. Now obviously, through the use of a VPN or similar service, their ‘real’ IP address could be obfuscated; for the casual user, though, this wouldn’t be the case.

With that in mind, have there been any ideas thrown around on how this could be addressed? My gut feeling is that some level of integration with Tor could be used to mitigate this. Or perhaps Tor’s system of onion routing could be baked into the Proxy Nodes themselves? Having this be a feature (or at least an option) for people to use could help increase the utility of the network.


#2

I would say SAFE already does have something like onion routing in place internally through randomised routing and the process of churn which ensures that addresses are constantly changing. I’m not sure how malware within the network could ever work backwards to find the IP address. Not saying it’s impossible but I can’t see how it would be done. If the user really wanted a belt-and-braces approach they could use Tor or a VPN to disguise their real IP address from the proxy node though.

100% privacy, like 100% security, is impossible, but I’m pretty sure it will be much better on SAFE than it is on TOR, where exit nodes can be compromised, the last hop is unencrypted and traffic can be profiled.


#3

As far as private data is concerned, even if your IP was linked to an XOR address by a bad actor, it wouldn’t matter, because all data is encrypted client side, not to mention that all connection info is encrypted in Crust. Vaults simply don’t know what they’re storing either. I would say good luck wasting your life to anyone trying to de-anonymize someone on the network without using social engineering. Targeting someone is very difficult, and that is the whole point of routing’s disjoint sections. Nodes are constantly shuffled about as the network dynamically changes, with vaults coming online or going offline.

TOR is useless now imo. The govt can find the servers and shut them down and run as many exit nodes as they want and sniff the data as it comes out BEFORE it even gets to you. How stupid is that?! It worked for a while and it’ll always work decently as long as there are brave souls with routing/exit nodes, but the world needs better. Proxy nodes are basically TOR built into the network as I understand it. I’m sure there are others here that could give more detailed reasoning but I thought I’d chime in.


#4

I’ll let more skilled persons answer the OP, but I wanted to point out that privacy is not the same as anonymity.

Privacy is achieved when a 3rd person doesn’t know what the message content is, even if they know who is talking to whom. Think of paper mail in a closed envelope. The postman can see the addresses on the envelope, yet he doesn’t know what is written inside.

Anonymity is achieved when you don’t know who is talking / listening, even if you can read the message. Think of words painted on a wall with a spray can. You can read the text, yet you don’t know who wrote it. Or think of someone listening to FM radio. Anyone can hear the broadcast, yet you can’t figure out who is listening.

Privacy and anonymity are not exclusive: a mix can exist, or you can have one but not the other.

I think it would make sense for your analysis to look at how the SAFE Network behaves for both aspects.


#5

Once again, I have to roll out this excellent post from @polpolrene from some time ago. It is dated in a couple of implementation details but still gives the picture: All the encryption layers for SAFEnet


#6

Sorry, I should have been more specific: I meant malware on a user’s computer (client side).

Now that is sadly very true haha!

Really good points you bring up, thank you for sharing.

Fabulous read! Thanks for sharing. I really was not aware of the concept of identities…

If this works the way I think it does, then could this be a way to help increase anonymity?

I suppose this really does get to the crux of the matter. As @nice said above privacy is not the same as anonymity. The SAFE Network could definitely be said to ensure privacy, certainly when encrypted data is concerned.

What I am really concerned about is websites like Wikileaks etc. being hosted on the network. These will be made up of ‘public’ files on the network, and their addresses are known to anyone who can access the site. For the people uploading the data though, their XOR Account Address (as far as I am aware) is tied to that data. So for example, if a user uploads a cat picture (that’s ‘public’ and I know the address), I can see the exact XOR Account Address that uploaded that data. So if I, as an organisation performing an investigation into a website like Wikileaks, can somehow act as a ‘bad party’ Proxy Node, I could figure out the IP addresses that correspond to particular XOR Account Addresses. That is how I understand it anyway. So doesn’t this really hurt the usage of the network by people that need anonymity?


#7

Even if something is posted as public and human readable, it is still encrypted in transit. The person trying to catch someone would be fishing for a long time, even if in high numbers, because nodes don’t get to choose where they go, who they serve, etc. in the network. I’m not saying it’s impossible, it just seems unlikely. First you have to either put malware on the person’s computer manually or have them download it, then link the specific data to the XOR address. I would like to hear someone more technical speak on this though.


#8

@Nigel I totally hear what you are saying. The avenue I think is worrying is where, for example, through some means, someone identifies an individual by their XOR Account Address. From this they could simply visit resources like Wikileaks and see which pieces of data ‘belong’ to that account. As far as I understand it, they could do this completely after the fact, with no need to decrypt data after transmission. They would simply need to know the XOR Address that the individual has.

  1. Someone uploads incriminating data to the network (government leaks etc)
  2. Government/whoever can see exactly which account uploaded that data
  3. Government manages to tie that account to an IP address (through bad-party Proxy Node/malware/etc)
  4. Individual responsible can then be deduced (perhaps)

That is the flow I am ‘worried’ about. I am curious to know if there is anything built into the network to mitigate against this kind of threat.
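
To make the flow above concrete, here is a minimal sketch of it as a two-step lookup. Everything in it is hypothetical: the function name, the two dictionaries, and the example values are made up for illustration, and whether the network exposes anything like the `uploader_of` mapping at all is exactly the open question.

```python
from typing import Dict, Optional


def deanonymize(data_address: str,
                uploader_of: Dict[str, str],
                ip_of: Dict[str, str]) -> Optional[str]:
    """Follow data -> account -> IP and return the IP, or None if a link is missing.

    uploader_of: hypothetical map from a public data address to the account
                 address that stored it (step 2 of the flow).
    ip_of:       hypothetical map from an account address to an observed IP,
                 e.g. collected by a malicious proxy node (step 3).
    """
    account = uploader_of.get(data_address)
    if account is None:
        return None  # step 2 fails: uploads are not attributable to an account
    return ip_of.get(account)  # step 3: None unless the attacker saw this account


if __name__ == "__main__":
    uploader_of = {"data-abc": "account-123"}
    ip_of = {"account-123": "203.0.113.7"}  # documentation-only IP range
    print(deanonymize("data-abc", uploader_of, ip_of))
    print(deanonymize("data-abc", {}, ip_of))
```

The attack only works if both lookups succeed, so removing either link (no uploader recorded against the data, or no account-to-IP observation) breaks the chain.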


#9

Immutable data on the network is not “owned” and its creation is not traceable to any account. Only if you have the data map (which is what would be available on Wikileaks) can you access the data.

Additionally, data can be stored and/or retrieved via one-off IDs.

Is it possible to pierce all this (and more) in some small degree? Theoretically maybe, but what does one have to do to find out anything actionable? Remember, each file is chunked and self-encrypted before being sent to the network to store. From the network’s perspective, each chunk is solo, i.e., has no attachment or dependency on any other chunk.
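
For anyone who wants to see the chunk-and-encrypt idea in code, here is a toy sketch. It is not the network's actual self-encryption algorithm: the chunk size, the key derivation (a hash of the chunk's own content), and the SHA-256 based keystream are all simplifying assumptions chosen so that it runs with only the standard library, and it should not be used for real security.

```python
import hashlib

CHUNK_SIZE = 1024  # illustrative value, not the network's real chunk size


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. For illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def self_encrypt(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into chunks and encrypt each one with a content-derived key.

    Returns (store, data_map):
      store    - chunk name -> ciphertext, i.e. all the network would hold
      data_map - ordered list of (chunk name, key), kept by the client
    """
    store = {}
    data_map = []
    for start in range(0, len(data), chunk_size):
        plain = data[start:start + chunk_size]
        key = hashlib.sha256(plain).digest()
        cipher = _xor(plain, _keystream(key, len(plain)))
        name = hashlib.sha256(cipher).hexdigest()  # chunk is named by its ciphertext
        store[name] = cipher
        data_map.append((name, key))
    return store, data_map


def self_decrypt(store, data_map) -> bytes:
    """Rebuild the original data; impossible without the data map."""
    parts = []
    for name, key in data_map:
        cipher = store[name]
        parts.append(_xor(cipher, _keystream(key, len(cipher))))
    return b"".join(parts)


if __name__ == "__main__":
    secret = b"leaked report " * 200
    store, data_map = self_encrypt(secret)
    assert self_decrypt(store, data_map) == secret
    print(len(store), "chunks stored;", len(data_map), "entries in the data map")
```

Note that `store` contains no ordering, no file name, and no link between chunks; all of that lives in the data map, which never has to leave the client unless the owner publishes it.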

The relationship between real space and XOR space is fluid and always changing. There has been a lot of discussion on the forum in the past about “if one has the XOR name of a chunk known to belong to an offending file, can’t one trace it back to some storing node at least, and thus start to unwind the ball of yarn?” Might be worth searching those out and exploring the different questions, answers and scenarios related to all this. It’s an interesting exercise in theoretical detective work against a developing network architecture.
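
As a rough illustration of why the mapping from a chunk's XOR name to the nodes holding it keeps shifting, here is a small sketch using the usual Kademlia-style XOR distance. The way addresses are derived (`xor_name`), the group size of 4, and the churn step are assumptions made up for the example, not the network's real parameters.

```python
import hashlib
from typing import List, Sequence


def xor_name(label: bytes) -> int:
    """Map arbitrary bytes to a 256-bit address (hypothetical naming scheme)."""
    return int.from_bytes(hashlib.sha256(label).digest(), "big")


def closest_nodes(target: int, nodes: Sequence[int], count: int) -> List[int]:
    """Return the `count` node addresses closest to `target` by XOR distance."""
    return sorted(nodes, key=lambda node: node ^ target)[:count]


if __name__ == "__main__":
    nodes = [xor_name(f"node-{i}".encode()) for i in range(50)]
    chunk = xor_name(b"chunk of an offending file")
    before = closest_nodes(chunk, nodes, 4)

    # Simulate churn: the closest holder leaves and ten new nodes join.
    nodes = [n for n in nodes if n != before[0]]
    nodes += [xor_name(f"newcomer-{i}".encode()) for i in range(10)]
    after = closest_nodes(chunk, nodes, 4)

    print("holders changed:", before != after)
```

The set of "closest" nodes is a function of who is currently online, so any answer to "which node stores this chunk" is only valid until the next join or leave.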


#10

If the government has malware on the uploader’s computer this is true, but then they can see you’ve got the files already on your computer, so I am not sure this is what you are suggesting. There would be no issue related to SAFE network accounts.

So what I assume you are thinking is…

If I upload to a public ID of my own, obviously anyone can see that the file is associated with that public ID. In which case the government wants to discover which account owns that public ID - is that correct?

If so, I don’t see how this can be discovered unless they have malware on the uploader’s computer. No other node would know which computer owns the public ID because the only record is encrypted, and only ever decrypted on the owner’s device.

So I can’t see a way to tie an account to the public IDs it owns, except by snooping on the device being used to operate that account.


#11

I can’t say I’d agree that privacy had been achieved in this example. Meta data is private data.

The difference with anonymous communication is that the recipient of a message would know what the message content is but not the identity of the sender.


#12

This is an interesting point of view, making even more sense in the case of electronic communication, where massive and fast analysis of metadata can tell a lot about someone.
I suppose my conception is rooted in the era of paper mail, and in the idea of privacy as in ‘inside home walls’.
A quick search shows me that the concept is slowly evolving in courts and laws around the world.
Thank you for bringing the nuance out.


#13

Sorry to take so long to reply to all this!

I think there are some fabulous points being made. The way I view it is that interacting with the SAFE Network is not ‘anonymous’. If we take the definition of anonymous from Wikipedia…

Anonymity, adjective “anonymous”, is derived from the Greek word ἀνωνυμία, anonymia, meaning “without a name” or “namelessness”. In colloquial use, “anonymous” is used to describe situations where the acting person’s name is unknown.

When vaults serve chunks of data, they know the ID of the account that is being used to retrieve that data. Through whatever means, that is an identifiable piece of information that could be used to tie network access to an individual. Now it is up for debate whether the ID that is being used could be considered a ‘name’, but it is something tied to an individual account and hence not truly ‘anonymous’, if you see what I mean?

From the points lots of you have made, the network does ensure ‘privacy’ and does that very well…

Privacy is the ability of an individual or group to seclude themselves, or information about themselves, and thereby express themselves selectively.

That definition of privacy is really interesting to me. I agree that…

Is an example of privacy. To give another example, imagine I had a safety deposit box in a bank. The objects contained within it are ‘private’ and only I have access to them and know what they are, but the people that work there will know my identity, similar to the postman. I think for metadata to be truly private, this would inherently mean anonymity.

The two are not mutually exclusive of each other as many have said. You can have anonymity without privacy.

Then to the other extreme you can have privacy and anonymity mixed together, think the cryptocurrency Monero for instance.

Thanks to everyone for the more technical discussion too, still learning about how the core of the SAFE Network works every single day. :+1:


#14

Vaults don’t know this. No node in the network knows the ID, or anything associated with the user/account accessing the data.

Some nodes (probably not the vaults but the data managers) know the temporary random xor address that an encrypted chunk is being sent to, but you should check as I don’t know the details of the implementation.

Also, note that a chunk is just an encrypted piece of information, with no easy way to determine anything about it unless you already have the original of the file to which it relates (and I’m not sure which nodes would/would not be in a position to check this due to the different layers of encryption involved - again, check the details if you need to know).


#15


From the SAFE Primer.

Unless I am interpreting this wrong, the SAFE Network primer says that they know its public key and XOR address. It then reiterates that vaults to which the user is connected might know a little about what the user is doing on the network, but they can only identify the user by their XOR address and not their IP. To me this implies they do know the XOR address of the user’s account, or at least know an address that is linked to an account.

Creating a type of ‘honey pot’ by doing this is an approach I can think of that could be used to trace activity back to an account if vaults can indeed only identify the user by their XOR address.

Possibly I am misinterpreting what the SAFE Primer is saying or not understanding it fully.


#16

I don’t know if vaults know the public key of the requestor - I don’t see why they would so that needs checking.

I agree vaults might know the xor address, although this needs checking also as I said, because I don’t see why they need this. I would think only the data managers need it, although it might be more efficient to let the vaults see this too.

However, the xor address is random and temporary and so not associated with an account beyond a certain time (I don’t know the limits - but no more than a session at the most).

So there is a very small window for any vault to collect data about a user, and any vault is quite likely to see that same xor address only once anyway because one vault holds such a tiny fraction of all the data any single user will access during a session.

The question then is who does see the public key. I don’t see why vaults need it, so this may be an error - the wording you quote mentions it in the first instance but drops it from the second.

Could be a good catch for a documentation error, and also an issue to explore more - who can see a public key and how could this be used to gather data on an account.

There is a limit even here though. If you are one vault who can see the public keys of every user accessing the chunks you hold, how much information can you gather on a particular account? Very, very little, because you hold such a tiny fraction of all the data they might access.

So to do any significant snooping an attacker would have to collate the logs from many different nodes.
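
To put rough numbers on "a tiny fraction", here is a back-of-the-envelope sketch. It assumes chunks are placed uniformly at random, that each chunk is held by a fixed number of replicas, and that any vault holding a copy could observe the request (a worst case). The figures in the example are invented, not measurements of the network.

```python
def expected_chunks_seen(chunks_accessed: int, total_vaults: int, replicas: int) -> float:
    """Expected number of one user's chunk requests observable by one given vault."""
    return chunks_accessed * replicas / total_vaults


def prob_vault_sees_any(chunks_accessed: int, total_vaults: int, replicas: int) -> float:
    """Probability that one given vault observes at least one of the user's requests."""
    per_chunk = replicas / total_vaults  # chance this vault holds a given chunk
    return 1.0 - (1.0 - per_chunk) ** chunks_accessed


if __name__ == "__main__":
    # Hypothetical session: 1,000 chunks fetched, 100,000 vaults, 8 copies per chunk.
    print(expected_chunks_seen(1_000, 100_000, 8))
    print(prob_vault_sees_any(1_000, 100_000, 8))
```

With those made-up figures a single vault expects to see well under one of the user's requests per session, which is why an attacker would need logs from a large share of vaults before any pattern emerges.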

Also, they would have to gather and store all that information - because they have no way to focus on a particular public key, until they have linked it to an account.

So, it needs clarifying, but even the worst case is not as serious as making the network non-anonymous.

Can anyone clarify what public keys of an account are exposed and for how long? I have a feeling these are only session keys - so thrown away like the xor address after a relatively short period - but I can’t confirm that.

Good digging @DaBrown95, don’t stop :slight_smile:


#17

I’d agree with the first definition of privacy, but not the second.

Communications are only really private when messages and their metadata (the when, where, who etc.) of the communication are known only to the sender and receiver of the message, and not to any third party; like the postman, for example.

People using a communication service may perceive degrees of privacy, and perhaps consider that when they use a communication service that allows a trusted 3rd party to have a degree of information about a message and its metadata, they are afforded privacy; but it falls short of what we should all consider private. Goodness knows how many people have just had this realisation thanks to Facebook and Cambridge Analytica.

A considerable amount of information about an individual and their lives, and the lives of those they communicate with, can be gathered from metadata.

The classic example would be call records that show someone regularly dialling a number late at night which is not their spouse’s, followed one day by a call to an STI clinic, and then by one to the Samaritans. This information being accessed or processed by a 3rd party should rightly be considered an egregious violation of privacy.

Anonymity is related to privacy, but I’d consider it orthogonal.

I could, perhaps, have a truly private conversation with my wife on the dark side of the moon, but it wouldn’t be anonymous.

Likewise two individuals could have a publicly broadcast conversation, without their identities being revealed. Not private, but still anonymous.

It seems to me that the SAFE Network will provide, by default, private and pseudonymised communications (email or messaging for example), with both fully anonymous and public options available should they be desired.

Jim


#18

This older post should be of assistance in getting the perspective. I’ve found it very useful. All the encryption layers for SAFEnet Some of the exact details may have changed, but not meaningfully in this context.


#19

One question: since we would have control of our own data and no one else would be able to delete it, what about the illegal and immoral stuff (the really serious kind)? If someone uploads it, it would be there forever. Will it be possible to delete it in any way?


#20

Sure you can delete them. You just need a weapon able to destroy the Earth. This is the point of the whole network. Nobody can delete what you upload.