Vault routing burden

The uploader would have to prepare the videos specifically for the SAFE network. That can hardly be considered a fix.

You’d have to define what “works” means. As long as I can download the shit I need (at pretty much any speed, 10 KB/s is just fine), it’d work for me.
But then there’s this group of “sharing economy” fanboys who don’t want to spend any money and expect Netflix-like QoS. They’ll likely be very vocal about the inconvenience of having to wait till their video downloads.

Yes, by easy I mean that the SAFE network need not be burdened by the video-standard limitations that web browsers face. It will still require a lot of development, and I’m wondering about download speed. A good video streaming experience may take a couple of years to appear. Fortunately, information technology progresses exponentially.

Chunks can be requested in parallel, so I’m pretty sure streaming HD videos will be possible from the start. The only limitation is your own connection.
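To illustrate the point about parallel chunk requests, here is a minimal sketch in Python. The `fetch_chunk` function is a hypothetical stand-in for a network GET (the real client's API may look nothing like this); the key idea is that all chunk requests are issued concurrently, while the results still come back in datamap order:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(chunk_hash):
    # Hypothetical network GET; simulated here with a deterministic payload.
    return b"data-for-" + chunk_hash.encode()

def fetch_file(datamap_hashes, workers=8):
    # Issue every chunk request in parallel. pool.map returns results in
    # datamap order even though the network may complete them out of order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_chunk, datamap_hashes))

chunks = fetch_file(["h1", "h2", "h3"])
```

With enough workers, total download time is bounded by the slowest single chunk plus your own connection, not by the sum of all chunk latencies.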


That’s great! Is it also possible to stream a single file, or must the whole file first be downloaded and then decrypted?

EDIT: Oh, already answered: “I’m pretty sure that’s not true, you only need the datamap (all the hashes) and then it’s possible to decrypt every chunk independently of the others.”

Almost like random access of files.

Ah, you’re right. And I’m actually glad about that.

In the previous sections, we described the process of self-encrypting data. However, it did leave an important question unanswered: how do we reverse this process to retrieve the plain-text from the cipher-text chunks? The answer is data maps. In the II-A steps 1, 3 & 7 we collected important data. [This data alone is enough to reverse the encryption process] and this is stored in a structure we refer to as a data map. This is described in the following table.

http://maidsafe.net/Whitepapers/pdf/SelfEncryptingData.pdf
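The property the whitepaper describes can be sketched with a toy model: if each chunk's key is derived only from the pre-encryption hashes stored in the data map, then any single chunk can be decrypted on its own. This is a simplified illustration, not MaidSafe's actual scheme (the real key derivation and cipher differ; the XOR keystream below merely stands in for a real symmetric cipher):

```python
import hashlib

def derive_key(name_map, i):
    # Toy key derivation: the key for chunk i comes only from the
    # pre-encryption hashes of its neighbours, all of which live in
    # the data map. Details differ in the real self-encryption scheme.
    n = len(name_map)
    material = name_map[(i - 1) % n] + name_map[(i + 1) % n]
    return hashlib.sha256(material).digest()

def xor_crypt(data, key):
    # Toy symmetric cipher (XOR keystream); encryption and decryption
    # are the same operation.
    return bytes(b ^ key[j % len(key)] for j, b in enumerate(data))

plain_chunks = [b"chunk-one", b"chunk-two", b"chunk-three"]
name_map = [hashlib.sha256(c).digest() for c in plain_chunks]  # data map entries
cipher_chunks = [xor_crypt(c, derive_key(name_map, i))
                 for i, c in enumerate(plain_chunks)]

# Any single chunk can be decrypted with the data map alone, without
# touching the other cipher-text chunks:
middle = xor_crypt(cipher_chunks[1], derive_key(name_map, 1))
```

This is why holding the data map gives you effectively random access to the file: decrypting chunk *i* never requires downloading chunks *i-1* or *i+1* first.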


Yes, that’s basically what I said before.

I’m pretty sure that chunks will not necessarily be delivered in the requested order.

That depends on the client software. I haven’t looked, but since it’s open source it should be possible to change it to GET them in the proper order. Or it could be done through an app.


I’m trying to highlight the fact that your request is just that.
You don’t get the file until the vault and intermediate nodes pass it on to you.

But to your point, you could download larger chunks with breaks in between, assuming your client software had a player or plugin to do that.

Your client doesn’t request a file, it requests chunks. It has to work like that, or else it’d be impossible to download private files (encrypted datamap). So for streaming your client should first request the datamap, and then gradually request chunks as you watch the video to keep the buffer a few seconds ahead.
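The streaming loop described here — fetch the datamap first, then keep the buffer a few chunks ahead of the playhead — can be sketched as follows. The `fetch` and `play` callables are hypothetical placeholders for the real client's chunk GET and the player's decode step:

```python
def stream(datamap, fetch, play, buffer_ahead=3):
    # Gradually request chunks as the video plays, keeping the buffer
    # `buffer_ahead` chunks in front of the playhead.
    buffered = {}
    next_to_fetch = 0
    for playhead in range(len(datamap)):
        # Top up the buffer before playing the current chunk.
        while next_to_fetch < min(playhead + 1 + buffer_ahead, len(datamap)):
            buffered[next_to_fetch] = fetch(datamap[next_to_fetch])
            next_to_fetch += 1
        play(buffered.pop(playhead))

played = []
stream(["h0", "h1", "h2", "h3", "h4"],
       fetch=lambda h: "data:" + h,
       play=played.append,
       buffer_ahead=2)
```

A real client would fetch asynchronously and size the buffer in seconds of video rather than chunk counts, but the shape is the same: the datamap drives which chunk to request next, and nothing is downloaded that the viewer never reaches.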


I know that, I was talking about the scenario where a video is cut into many files. Yes, the client should work like that, I agree.

I’m really curious how big the caching aspect will be in routing. Is it just cached chunks in RAM, up to say a GB? Or will there be a TEMP directory on disk as well, which could add some extra on top of that, maybe up to 5 GB? Maybe @qi_ma or @anon86652309 know? I think it would make a great difference. And the chunks in the Vaults themselves are non-persistent, but maybe someone could reuse files that are in cache? I mean, these chunks aren’t the 4 that are closest to the XOR address… There’s no incentive for people to do so, other than to prevent others from Farming in the hope the price could go up, and to help the network. This is how it could work:

  • You start your Vault; chunks that are close to you in XOR space will come your way (until you disconnect from the network).
  • You cache chunks that “come by” in RAM. They stay encrypted, with the key to unlock them held in RAM as well.
  • On your hard drive you can provide extra cache in a TEMP directory. You’re free to make it as big as you want; it’s extra on top of the normal cache. The TEMP directory stays on your computer when you disconnect from the network, and its chunks are added back to the cache once you reconnect. So within 5 minutes of connecting to the network you could serve a lot of chunks from cache, which would make the network faster.
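The two-tier scheme proposed above (RAM cache that spills into a persistent TEMP directory on disk) could look roughly like this. This is purely a sketch of the proposal, not anything in the actual vault code; the TEMP tier is modelled with an in-memory dict for brevity where a real vault would write files:

```python
from collections import OrderedDict

class TwoTierCache:
    """RAM cache with LRU eviction that spills into a bounded TEMP tier."""

    def __init__(self, ram_slots, temp_slots):
        self.ram = OrderedDict()   # chunk name -> chunk data (fast tier)
        self.temp = OrderedDict()  # disk TEMP tier; survives reconnects
        self.ram_slots = ram_slots
        self.temp_slots = temp_slots

    def put(self, name, chunk):
        self.ram[name] = chunk
        self.ram.move_to_end(name)
        if len(self.ram) > self.ram_slots:
            # Evict the least recently used RAM chunk into TEMP
            # instead of dropping it outright.
            old_name, old_chunk = self.ram.popitem(last=False)
            self.temp[old_name] = old_chunk
            if len(self.temp) > self.temp_slots:
                self.temp.popitem(last=False)  # oldest TEMP chunk is dropped

    def get(self, name):
        if name in self.ram:
            self.ram.move_to_end(name)  # refresh LRU position
            return self.ram[name]
        return self.temp.get(name)

cache = TwoTierCache(ram_slots=2, temp_slots=2)
for i in range(4):
    cache.put(f"c{i}", f"data{i}")
```

After the four puts, the two newest chunks sit in RAM and the two older ones have spilled to TEMP, so all four are still servable — which is exactly the benefit claimed: chunks that would otherwise be lost on eviction remain available after a reconnect.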

Yes, if that’s possible then that’s much better. Then the player could play ordinary MP4 files etc.

I would expect that if this were the case then changes would HAVE to be made.

One change could be that vaults can be of two types.

  • vaults that just store data and give up that data on request
  • vaults that are nodes and can pass the data on. (like the current vaults)

This way we still have the massive data store and still have the security of many hops between data store and requester. The slow links are just data stores and simply farm for rewards. All the other nodes also farm and do the other functions of being a vault.

Also, as time progresses this problem will reduce in magnitude as the world’s internet speeds improve.

That would be wow, but then: “Wow, look, the security/anonymity is largely lost.”

I expect that at least one, if not two, intermediate nodes are needed: to ensure that no node has access to both the source and destination IP addresses, and to prevent small-scale surveillance from capturing traffic from nodes within its net and gathering enough of it to determine source/destination for any significant amount of SAFE traffic.

There have to be some dynamic limits to all this because of your vault’s (in)ability to pump data out to the network. If it’s a 1 Mbit/s upload link, then your cache can only grow so far before chunks are tossed out due to inactivity.
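This limit can be put in rough numbers. Assuming a fixed chunk size and an inactivity window after which unserved chunks are evicted (both figures below are illustrative, not anything specified by the network), a cache larger than what the uplink can actually serve within that window is wasted:

```python
def useful_cache_chunks(upload_bits_per_s, chunk_bytes, retention_s):
    # Rough upper bound on how many cached chunks a node can actually
    # serve before inactivity eviction, given its upload link.
    chunks_per_second = upload_bits_per_s / (chunk_bytes * 8)
    return int(chunks_per_second * retention_s)

# A 1 Mbit/s uplink serving 1 MB chunks over a 1-hour retention window:
limit = useful_cache_chunks(1_000_000, 1_000_000, 3600)
```

At those (assumed) numbers the link can push out one chunk every 8 seconds, so only about 450 chunks (~450 MB) of cache can do useful work per hour; anything beyond that just sits until it's evicted.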

And just as caching in operating-system software is an area that is reviewed constantly because of its ability to affect performance, so it will be in the SAFE software.

You know what, I think human nature will play a big part in what this figure will actually be.

If it worked out like you suggested, then the network will be deemed slow for a new network. Although with 1 million clients, it would have also been deemed worthwhile for its security, anonymity, etc. So people will then change their usage patterns. It would change from using it as their main storage to something like

  • use it for backups. Write once, read rarely.
  • use it for document-type files — small files up to say 5 MB.
  • not use it for streaming, but download HD movies hours ahead to watch later (I guess like bittorrent/NNTP users do now).
  • not use it to instantly access large files.
  • work out ways to access only the sections of a file they really want. Apps will appear that facilitate segmented access to files: grab only the parts currently being displayed/used, and use the local hard disk to buffer the parts already read.
  • plus more changes to usage.

Basically we would regress to the days when mass storage on mainframes was on 9-track TAPE. It took time to move the tape to the point where the file resided, only the portion needed at the time was read, and then the tape was released to be used by another task.

Operating systems and PEOPLE using tape systems had a completely different mindset about accessing data than modern-day people with PCs and “instant” access to data for reading/streaming/writing. Programs were written to use the data in a different manner when interfacing with people, and written differently to optimise processing/data flows.

But hopefully, not requiring the whole file to be read at once will reduce the problem. Ask yourself: how many times do you look at just the start of a video/movie and decide against it? The mere fact that a file only needs to be read a bit at a time for many applications (and SAFE does this normally) will reduce the estimated figure for global GETs per day.

Congrats, you noticed it and explained why the idea about optimizing the path has its limits.


Yes. And why it would be tricky to implement. How do you know which nodes to skip when trying for every 2nd or 3rd or whatever, and yet make sure there are enough hops? And how do you do this without carrying enough state between nodes to compromise security? Maybe just a bit that flips state between nodes on the way to (or maybe from) the vault, indicating whether a node is just a relay node or a node that accepts the chunk and passes it on.

But if it can be done simply & securely then it would be a worthy idea to implement when the net grows large. A lot of “excess” bandwidth would be saved.

One of the questions asked before was what percentage of vaults would have to be controlled by an attacker for them to (not always, but sooner or later) find themselves closest to both the vault and the user at the same time.

I envision a smallish (node-count-wise) network of several tens of thousands of nodes, so in that kind of network an attacker with 5–10 VMs could do a good amount of information gathering. But if the network becomes larger, as Seneca proposes, it will be harder to do.

Does the GET fulfillment go through the same channel that the request goes through?

If node(A) requests a chunk that happens to be on node(Z):

  • the closest node that (A) knows about is (C)
  • the closest node that (C) knows is (Y)
  • (Y) knows (Z)
  • (Z) has the chunk
  • (Z) knows (B) who is closer to (A) than (Y), so sends the chunk to (B)
  • (B) forwards chunk to (A)

In that case, the route taken back to the requesting node will always be the same length, if not shorter. And shorter is what would ease the strain on the network’s bandwidth.
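The walk-through above can be sketched as greedy XOR-distance routing. This is a simplified model of Kademlia-style forwarding, not the actual SAFE routing code; the integer IDs and peer table below are made up to mirror the A → C → Y → Z example:

```python
def route(start, target, peers_of):
    # Greedy XOR routing: each node forwards to whichever of its known
    # peers is closest (by XOR distance) to the target, stopping when
    # no peer is closer than the current node.
    path = [start]
    current = start
    while current != target:
        nxt = min(peers_of[current], key=lambda p: p ^ target)
        if (nxt ^ target) >= (current ^ target):
            break  # no closer peer known; routing stalls here
        path.append(nxt)
        current = nxt
    return path

# Tiny made-up topology: A=1 knows C=3, C knows Y=6, Y knows Z=7.
peers = {1: [3], 3: [6, 1], 6: [7, 3], 7: [6]}
path = route(1, 7, peers)
```

Because every hop strictly decreases the XOR distance to the target, the return path (routed the same way toward the requester's address) can never be longer than necessary, which is the property the post relies on.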

True, so whatever method is used to create relay nodes would also have to ensure there are enough non-relay nodes to prevent this from happening as much as possible. I’d guess that in a very large network a few non-relay nodes would be all that’s needed, while in small networks every node would have to be a non-relay node. But then again, even relay nodes add to the security against the issue you mentioned, maybe just not as much.

Oh well, some hard figures would have to be analysed before it could be considered a viable possibility, but it’s one to remember if this becomes a problem for large networks.
