Attack Vector against relay nodes from client machines run by Investigators

I don’t think this is necessarily true everywhere.
In practice, your telco’s (or “investigating organization’s”) logs alone could be enough. Your interpretation of what might constitute evidence seems a bit too casual (see example below).

The judge authorised handover of IP records for three purposes:

  1. seeking to identify end-users using BitTorrent to download the movie
  2. suing end-users for infringement
  3. negotiating with end-users regarding their liability for infringement.

(source - http://www.lifehacker.com.au/2015/04/what-evidence-does-using-bittorrent-leave-behind/)

That’s apples and pears. In BitTorrent you share chunks with others who download the same file. So you have to be downloading/uploading the file and know that you’re doing that. Another point to make is that on BitTorrent the chunks aren’t encrypted. So you just pass around unencrypted chunks from a file you are downloading/uploading as well.

There is a protection for this, now that I have had a wee while to think about it (nice investigating @neo). The routing layer will know when it is delivering to a bootstrap/relay node. It can tell this from the address of the client, which contains (NodeAddress, Client_public_key), so the node that is connected to the relay node will know it is delivering to the last hop. On seeing this, the routing node can encrypt the whole message to the client. This essentially means the relay node (bootstrap node) cannot tell what the client is receiving, and the relay is then protected.

This is pretty easy to achieve and we can give this more thought. The reason not to end to end encrypt the whole message is to allow caching. So this is for ImmutableData only, all the other data types are encrypted end to end anyway, making this attack not possible there.


With SAFE you also know you’re downloading the file (I thought also uploading if you’re caching it).

Below your comment David said they currently aren’t.

Of course, you’re getting the chunks you want to reconstruct the file. The hops cache for you, so it is the last hop that gave you the file. I don’t think you’re caching it yourself as well, but I don’t know for sure.

I agree, so you could see the chunks come by now. But the chunks are still part of self_encryption. So compared to BitTorrent you almost know nothing.

see the edit in the OP. I made a mistake.

It is the IP address of the node sending the packet that needs to be removed/faked, if that is at all possible.

The court case in Australia showed that this is not a defense as that is exactly the situation with the TOR exit node where everything was encrypted, even the packet exiting the node was encrypted and yet a conviction was obtained against an innocent person.

It was the fact that the PC was used in the communications. For copyright infringement civil cases this would apply equally. The only thing that mattered was that the PC sent the info; it was a “supplier”.

Ignorance is not a defense in these cases. Yes, it shocked those in IT who heard about it. Even the current PM supported using TOR to defeat geo-boundaries, but then the police in Australia successfully convicted someone running a TOR encrypted exit node. These are unintended consequences of laws, but they also set a precedent for Intellectual Property Trolls (investigative companies) to use the same approach to get (extort) money as compensation for copyright infringement (in Australia & Germany at least).


Guys, guys…this:

So, is it possible to send a chunk to a client from a relay node without revealing the sender’s IP address? (Mr. Network Engineer - @dirvine - let me know if I get anything wrong, I’m still in the process of studying for my Network+ cert) Let’s find out:

MTU

A quick note about Maximum Transmission Units and why we need packets at all.

The maximum transmission unit, or MTU, is the single largest frame or packet of data that can be transmitted across a network. The exact nature of the maximum transmission unit will be determined by the configuration of the network and what type of protocols is in place for the transmission of data.
WiseGeek

While a chunk on the SAFE Network may be 100kB (SD) or 1MB (ImmD) the packets that are used to transfer these chunks point to point may not necessarily be that large. Therefore, there exists the real probability that these chunks being sent across the network will need to be split up into packets.
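To put rough numbers on that, here is a minimal sketch. The 1500-byte Ethernet MTU and the 20-byte IPv4 and TCP headers (no options) are my assumptions, not figures from the SAFE docs, and so is treating 100kB and 1MB as 100 × 1024 and 1024 × 1024 bytes. `packets_needed` is just a hypothetical helper name.

```python
# Assumed values (not from the SAFE documentation): a typical Ethernet MTU
# and minimal IPv4 / TCP headers without options.
ETHERNET_MTU = 1500
IPV4_HEADER = 20
TCP_HEADER = 20


def packets_needed(chunk_bytes, mtu=ETHERNET_MTU, overhead=IPV4_HEADER + TCP_HEADER):
    """Return how many packets are needed to carry chunk_bytes of payload."""
    payload = mtu - overhead  # bytes of chunk data that fit in one packet
    if payload <= 0:
        raise ValueError("MTU must be larger than the header overhead")
    # Ceiling division: a partly filled final packet still counts as a packet.
    return (chunk_bytes + payload - 1) // payload


sd_packets = packets_needed(100 * 1024)     # 100kB chunk (SD)
immd_packets = packets_needed(1024 * 1024)  # 1MB chunk (ImmD)
print(sd_packets, immd_packets)
```

Under those assumptions a chunk never fits in a single packet, so every packet of a chunk carries its own IP header with its own source address.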

IP Packet Construction

Each datagram [AKA packet] has two components: a header and a payload. The IP header is tagged with the source IP address, the destination IP address, and other meta-data needed to route and deliver the datagram.
Datagram Construction - Wikipedia

That means that there is necessarily an IP address in the packet. Need it be the relay’s though?
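For anyone who wants to see where that address lives, here is a rough sketch that packs a minimal 20-byte IPv4 header and reads the source address back out of it. Nothing is sent on the wire. The addresses are documentation-range placeholders, the checksum is left at zero for brevity, and `build_ipv4_header` / `source_address` are hypothetical names of my own.

```python
import socket
import struct


def build_ipv4_header(src, dst, payload_len, proto=6):
    """Pack a minimal IPv4 header (no options). Checksum is left as 0 here."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of 5 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,                      # type of service
        20 + payload_len,       # total length: header plus payload
        0,                      # identification
        0,                      # flags and fragment offset
        64,                     # time to live
        proto,                  # 6 = TCP, 17 = UDP
        0,                      # header checksum (not computed in this sketch)
        socket.inet_aton(src),  # source address, bytes 12..15
        socket.inet_aton(dst),  # destination address, bytes 16..19
    )


def source_address(header):
    """Read the source address field back out of a packed header."""
    return socket.inet_ntoa(header[12:16])


header = build_ipv4_header("192.0.2.10", "198.51.100.7", payload_len=1460)
print(len(header), source_address(header))
```

The source address is just four bytes at a fixed offset, which is why whoever builds the header decides what goes there.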

Spoofing

IP address spoofing can be defined as the intentional misrepresentation of the source IP address in an IP packet in order to conceal the identity of the sender…
David Whyte - Carleton University

And as for how:

These IP packets have the proper source and destination addresses for reliable exchange of data between two applications. The IP stack in the operating system takes care of the header for the IP datagram. However, you can override this function by inserting a custom header and informing the operating system that the packet does not need any headers.
Cisco.com

So by forging the header, the relay can hide itself from recognition on the client’s side. Discussion of the potential ramifications of just whose IP address is inserted there will be left alone for the time being. But does this mess with the protocols at all?

TCP vs UDP

UDP vs TCP Spoofing - one of the most significant reasons TCP is more secure than UDP is the difficulty in spoofing TCP communications. UDP spoofing is trivial since there is no notion of connection. Trying to [utilize] an established TCP session, however, is very difficult if the [relay] is unable to see the packets flow on the wire. This is because the 32-bit sequence number must be guessed by the [relay].
InfoCellar.com

So spoofing packets would not work for TCP, where it certainly would with UDP.
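A quick back-of-the-envelope sketch of why blind guessing is impractical, assuming the simplest model: the spoofer has to hit one exact 32-bit value and never repeats a guess. Real stacks differ in what they accept, so treat this as an illustration only; `blind_guess_chance` is a hypothetical name.

```python
SEQ_SPACE = 2 ** 32  # number of possible 32-bit sequence numbers


def blind_guess_chance(distinct_guesses):
    """Chance of hitting one exact sequence number with n distinct guesses."""
    if distinct_guesses < 0:
        raise ValueError("number of guesses cannot be negative")
    return min(distinct_guesses, SEQ_SPACE) / SEQ_SPACE


print(blind_guess_chance(1))        # a single spoofed packet
print(blind_guess_chance(2 ** 31))  # half of the whole space
```

In this model even a coin-flip chance takes about two billion spoofed packets, whereas UDP has no sequence number to guess at all.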

SAFE Network Protocols

TCP connections are always favoured as these will be by default direct connected (until tcp hole punching can be tested). TCP is also a known reliable protocol. Reliable UDP is the fallback protocol and very effective.
Maidsafe.net - SystemDocs

So, where the network will be transmitting packets, it will use either TCP or the Reliable User Datagram Protocol (rUDP) to do so. This, as mentioned elsewhere on these forums, will allow many features such as end-to-end encrypted communications, reliable reception of packets, and NAT traversal (AKA hole-punching).

However, since these are connection-oriented[1] protocols, the packets that are sent back to the sender as either confirmations or resend requests must go to the sender’s IP address in order for the sender to receive the information and act accordingly. This necessitates that the correct address be included in the header.

Conclusion

While hiding the sender’s (relay’s) IP address may be possible using IP spoofing, this would not be feasible on the network due to its connection-oriented approach between the sender (relay) and the recipient (client).

If the network were to implement a purely UDP-based approach, spoofing would be feasible, but the network would lose many of the features and the reliability that come with the connection-oriented approach (not to mention that restricting it to one type of protocol would make profiling of network traffic easier).

Therefore, any solution to this problem would need to be implemented higher up in the Maidsafe stack, such as in Crust, Routing & Sentinel, or some other aspect of the network.

[1] Note that “connection-oriented” is not mutually exclusive with “stateless”, as connection-oriented is implemented at the transport layer, and stateless is implemented higher-up in the Maidsafe stack.

P.S. This was so much fun to research! (I’m such a nerd)

P.P.S.

The link in the OP was for a case in Austria, not Australia (although you do state that the Aussies have a similar law as well). Sorry, that was bugging me!


I edited the OP and I am ashamed for only skim reading the actual article used as a reference in an Australian news story. Although the copyright issue is an issue in a few countries at least and needs discussing.

We had a project in Australia when network filtering at the ISP level was to be legislated for. We developed a 1/2 proxy where the request went to the 1/2 proxy, and the proxy would forward the URL to the server with the original IP as the return IP address. This meant the proxy only handled the packet in the request, and the rest of the traffic went between the server and the originator.

There are methods, say, to have the relay node send the packet to the client with a “sender IP address” belonging to one of the client’s group nodes, and then any errors are handled by the client’s group. This would require the client’s group to handle the ACK/NAKs from the client receiving the packets, and would be a pain to implement.

OR we can have the relay node give a fake IP address (127.0.0.1 might work) and the client requests a resend of the required packet from the client’s group nodes (or data managers) on errors. (IE chunk bytes xxxxxx-yyyyyy)

This way the sender’s IP address from the relay node is faked and the system still works, but slower on errors. The other nodes do not fake their source IP address since there is no problem to solve.

By implementing one of the above, it can be made to work for TCP.

Another solution

The relay node uses a VPN when sending packets to a client. All other traffic can be through the ISP. This way the client never sees the real IP address and only the VPN.

Obviously this would only be used by people who need/want to.

The majority of traffic to/from a PC will be through the normal ISP connection, and only when it’s acting as the last node sending a chunk to a client can it go through a VPN.

I considered this, and figured that it was the relay’s responsibility, even though it was the client’s decision to download the data. So by that reasoning, all relays should implement anything solving this problem just in case they were used by a malicious client.

Even though this seems like the ideal choice, it may increase centralization. Although thinking on those lines, there may be an IP address designated for client → relay responses - like a pool of these responses that are sent to all relays. The relay knows which one is intended for it and can then filter the rest and act accordingly. That would be difficult though, if not impossible using IP. But that’s leaning towards your “sender IP address” theory.

We’re mixing layers here. The packet transfer is a layer 4 connection. Above that you’re starting new connections, and any other machine contacted wouldn’t know about the failed packet transfer. Only the relay would know that. And only during the connection. (also 127.0.0.1 does not exist outside of your own machine - it’s a bogon)
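On the 127.0.0.1 point, Python’s standard `ipaddress` module makes this easy to check. A small sketch: `usable_as_remote_source` is a hypothetical helper of mine, and using `is_global` as the test is my own simplification of what a firewall’s bogon filter would do.

```python
import ipaddress


def usable_as_remote_source(addr):
    """Rough check: could this address plausibly appear as a source on the internet?"""
    # is_global is False for loopback, private, link-local and other
    # special-purpose ranges that should never arrive from a remote host.
    return ipaddress.ip_address(addr).is_global


print(ipaddress.ip_address("127.0.0.1").is_loopback)  # loopback: local machine only
print(usable_as_remote_source("127.0.0.1"))
print(usable_as_remote_source("8.8.8.8"))
```

A packet arriving from outside with a loopback source fails this kind of check, which is the sort of thing firewalls drop.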

Accepting spoofed addresses is a security risk. Most firewalls drop those packets. Proxies aren’t workable either; who’s going to run them and how do we know those aren’t the very investigators that we need to protect ourselves from? Why not use Tor in that case? At least Tor doesn’t have to be trusted.

Having to use a VPN would be comical. The other day when someone suggested using this (topic below) I almost laughed at the idea thinking how the need to do something like that would mean the SAFE network couldn’t do what it’s supposed to. But just a few days later that is precisely what is being proposed.


Really, this would only be needed by people in certain countries, and it seems people in countries like the USA would not need it.

Think spoofing IP packets.

The client knows which parts of the chunk failed to be received and can request the failed ones again. Even if it does not know the relay node.
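A minimal sketch of the bookkeeping the client would need for that, assuming it records each piece it receives as a half-open byte range `(start, end)` within the chunk. `missing_ranges` is a hypothetical helper, not anything from the SAFE code.

```python
def missing_ranges(chunk_len, received):
    """Return the byte ranges of a chunk that still need to be requested.

    received is a list of half-open (start, end) ranges already held;
    ranges may arrive out of order and may overlap.
    """
    for start, end in received:
        if not 0 <= start <= end <= chunk_len:
            raise ValueError("range outside the chunk: %r" % ((start, end),))

    gaps = []
    cursor = 0  # everything before this offset is already accounted for
    for start, end in sorted(received):
        if start > cursor:
            gaps.append((cursor, start))  # hole before this piece
        cursor = max(cursor, end)
    if cursor < chunk_len:
        gaps.append((cursor, chunk_len))  # tail of the chunk is missing
    return gaps


# A 10-byte "chunk" with two pieces received and two holes left to re-request.
print(missing_ranges(10, [(5, 8), (0, 3)]))
```

The client can then ask for exactly those byte ranges again from whichever node it chooses, without needing to know which relay sent the original packets.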


Ok, second point granted.

And I may be going full tin-hat conspiratarded here but I would be very surprised if this isn’t implemented everywhere. I mean hell, look at the freakin’ TPP…


No one suggested a proxy. It was an example of an ISP filtering issue and how packets can be redirected and still work.

Not quite, when it’s only for one function of the node. Most of the operation of the node does not use it. In AU many people are using proxies to get around much of the surveillance being brought in. The speeds are so much better than years ago.

Also for uploading the speed is the same as native ISP connection. And that is the only way it is being used for this. The issue of more centralization of relay node uploading might be an issue. This assumes that not everyone is using VPNs.

If VPNs were to be used for ALL the Node’s operation then it would be silly.

Also this is just one thing to consider and hopefully spark an idea we had not yet thought of.

AND NOTE, I would not be happy if I had to use a VPN, and I think that the above solutions are not really great.

Still, to require VPN + SAFE to solve the same problem that was solved by VPN + BitTorrent combo would look funny. And $40-60/year.

Address spoofing would require changes on home firewalls of those who want to use workarounds based on that, and then represent a new risk. Imagine a botnet Sybil attack where they all are telling the downloading nodes they should ask the SAFE seed nodes or www.whitehouse.gov to deliver you the last chunk or packet.


Just a bit of a thicko question, but doesn’t each relay node only have part of a file rather than a whole file, so any accusations about having an encrypted copyrighted file could be defended against? Sorry if this has been addressed, I couldn’t follow the thread as it’s a bit too technical. :smiley:

The question was answered in the OP: the file is public and therefore not encrypted.

Copyright includes part of the work. It only specifies a significant part. It used to be like anything over 10 seconds is significant, but that may have changed with the new trade agreements. And the USA has fair use, but countries like Australia do not.

And as Janitor says, it’s a public file, so the keys are made public; they are in the public file’s data map.


I think public data is also encrypted. I’m not sure how the decryption works in this case, but I think it is impossible to tell what data you are hosting, public or private. If that’s incorrect, it raises some other issues! :smile:

It is, but the data map is not encrypted. So the chunks are encrypted, but on unwrapping the directory the chunk names are in the clear.
