Traffic sizes on the SAFE network

The smallest overhead for files managed by NFS is 1 chunk.

If the datamap itself exceeds 3 KB, then the self-encryptor will split it into 3 chunks, which makes an overhead of 4 chunks (the datamap of the datamap plus the 3 chunks). This happens when the original file size is n MB, where n is the number of entries that makes the datamap cross the 3 KB limit (but I don’t know the concrete value of n).
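For what it’s worth, here is a minimal Rust sketch of that rule as I read it. I’m assuming “3 KB” means 3072 bytes, and the function name is just mine, not anything from the self_encryptor API.

```rust
/// Chunk overhead implied by the rule above: a datamap that fits in 3 KB is
/// stored as a single chunk; a bigger one is self-encrypted into 3 chunks
/// plus one chunk for the datamap of the datamap.
fn nfs_overhead_chunks(datamap_size_bytes: u64) -> u64 {
    const THREE_KB: u64 = 3 * 1024; // assuming "3 KB" means 3072 bytes
    if datamap_size_bytes <= THREE_KB { 1 } else { 4 }
}

fn main() {
    assert_eq!(nfs_overhead_chunks(288), 1);  // small datamap, one chunk
    assert_eq!(nfs_overhead_chunks(3140), 4); // datamap over 3 KB, 3 + 1 chunks
}
```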

3 Likes

I think we need a table because I’m still struggling to understand this.

The leftmost column would be file sizes, starting with <3KB for example, and increasing with each row.

Columns:

  • 1: file size as a range, starting at <3 KB, then 3 KB–1 MB, etc.
  • 2: file chunks
  • 3: approximate datamap size as a range based on file chunks (starting with 0–3 KB)
  • 4: datamap chunks
  • 5: total chunks (i.e. datamap chunks + file chunks)

I know there may be some question about the exact size thresholds, but that’s fine.

I think for small files the content is in the datamap, so I can imagine the smallest total might be 1 chunk for both file and datamap, but above 3 KB I’m still unsure.

1 Like

Using the basic_encryptor example program from the self_encryptor crate, I empirically found that the formula giving the datamap size in bytes is 12 + 92*n, where n is the file size in MB.

So the datamap crosses the 3 KB threshold at a file size of 34 MB (taking 3 KB as 3072 bytes: 12 + 92*33 = 3048 bytes still fits, whereas 12 + 92*34 = 3140 bytes does not).
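For anyone who wants to check the arithmetic, here is a small Rust sketch using that empirical formula. The 12 + 92*n fit is the one measured above, not something read from the self_encryptor source, and “3 KB” is assumed to mean 3072 bytes.

```rust
/// Empirical datamap size in bytes for a file that splits into `n` chunks
/// (the 12 + 92*n fit measured with basic_encryptor).
fn datamap_size(n: u64) -> u64 {
    12 + 92 * n
}

fn main() {
    const THREE_KB: u64 = 3 * 1024; // assuming "3 KB" means 3072 bytes

    // Smallest chunk count whose datamap no longer fits in 3 KB.
    let n = (1u64..).find(|&n| datamap_size(n) > THREE_KB).unwrap();
    println!("datamap exceeds 3 KB at n = {} chunks (~{} MB file)", n, n);
    println!("n = {}: {} bytes, n = {}: {} bytes",
             n - 1, datamap_size(n - 1), n, datamap_size(n));
    // Prints: datamap exceeds 3 KB at n = 34 chunks (~34 MB file)
    //         n = 33: 3048 bytes, n = 34: 3140 bytes
}
```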

Then we can derive the following table:

| File size | File chunks | Datamap size (bytes) | Datamap chunks | Total chunks |
| --- | --- | --- | --- | --- |
| 0–3 KB | 1 | 0 | 0 | 1 |
| 3 KB–3 MB | 3 | 288 | 1 | 4 |
| 4 MB | 4 | 380 | 1 | 5 |
| n MB (n ≤ 33) | n | 12 + 92*n | 1 | n + 1 |
| 33 MB | 33 | 3048 | 1 | 34 |
| 34 MB | 34 | 3140 | 4 | 38 |
| n MB (n ≥ 34) | n | 12 + 92*n | 4 | n + 4 |
| 367 GB | 376119 | ≈33 MB | 4 | 376123 |

It is valid for files up to 367 GB; above that level the datamap of the datamap crosses the 3 KB threshold, which creates a third-level datamap (and I need to watch Inception again).
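And here is a sketch that reproduces the bookkeeping behind the table, under the same assumptions: a 1 MB chunk taken as 1 MiB, “3 KB” taken as 3072 bytes, at least 3 chunks for anything over 3 KB, and the measured 12 + 92*n datamap size. It is an illustration of the counting, not the actual self_encryptor code.

```rust
const MIB: u64 = 1024 * 1024;   // assuming a 1 MB chunk means 1 MiB
const THREE_KB: u64 = 3 * 1024; // assuming "3 KB" means 3072 bytes

/// Empirical datamap size for a file split into `n` chunks (12 + 92*n fit).
fn datamap_size(n: u64) -> u64 {
    12 + 92 * n
}

/// Chunks needed to self-encrypt `size` bytes: 1 MiB each, at least 3.
fn chunks_for(size: u64) -> u64 {
    ((size + MIB - 1) / MIB).max(3)
}

/// (file chunks, datamap bytes, datamap chunks, total chunks), following the
/// counting used in the table: content under 3 KB lives in the datamap, and
/// a datamap over 3 KB is itself self-encrypted plus one chunk for the
/// datamap of the datamap.
fn breakdown(file_size: u64) -> (u64, u64, u64, u64) {
    if file_size <= THREE_KB {
        return (1, 0, 0, 1);
    }
    let file_chunks = chunks_for(file_size);
    let dm_size = datamap_size(file_chunks);
    let dm_chunks = if dm_size <= THREE_KB { 1 } else { chunks_for(dm_size) + 1 };
    (file_chunks, dm_size, dm_chunks, file_chunks + dm_chunks)
}

fn main() {
    for &size in &[1_024, 2 * MIB, 4 * MIB, 34 * MIB] {
        println!("{:>10} bytes -> {:?}", size, breakdown(size));
    }
    // 1 KiB   -> (1, 0, 0, 1)
    // 2 MiB   -> (3, 288, 1, 4)
    // 4 MiB   -> (4, 380, 1, 5)
    // 34 MiB  -> (34, 3140, 4, 38)
}
```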

14 Likes

That’s marvelous @tfa. Thank you very much, and I think Inception is a good choice, but I prefer The Adjustment Bureau :wink:

2 Likes

I agree; this issue will probably be ‘self-solving’.

But if someone had said back in 2010 that ‘the double SHA-256 algorithm for Bitcoin mining is possibly not the best because it can be slightly optimised, à la ASICBoost’, people would have said ‘I doubt that will actually be a problem’. But then it turned out to be quite major, for several reasons.

So I’m not ready to totally concede latency as a non-issue just yet. But I do greatly appreciate the balance brought to the topic from different perspectives.

To me it’s also not about whether it’s enough to ‘justify the cost’ but whether it’s enough to ‘cause departures’ by the victims. This may become self-sustaining after a while, since the dropouts further strengthen the surviving vaults. And to some degree that’s desirable, so that under-capacity resources don’t slow the whole network down. It’s a balancing act.

3 Likes

Nor am I; it is an important issue. And if it is a problem in the future, then we may need to consider new ways to reward farmers that are not totally reliant on latency. Actually, I’d prefer a better-balanced system that does not merely reward the fastest.

For instance, the fastest to respond may not be the fastest to deliver the whole chunk: low latency does not necessarily mean a fast link. Fast links often have better latency, but not always.

If we really want home users in preference to data centres, then we need to consider latency as only one factor in deciding who supplies the chunk and gets rewarded.

Just thought a bit of deeper consideration might suggest some other insights.

2 Likes

Yes, this is how SAFE sites should be implemented, but this isn’t what I observe in a local network. I think this is a bug, so I have raised an issue.

3 Likes