Statistics on the latency between nodes for different network sizes

Tim87 · March 7, 2017, 8:33pm

I compiled some statistics about the distribution of one-way minimum latencies across the network for different network sizes in hope that it may serve as a helpful reference when certain design decisions are considered.

SAFE, with 8 close-node connections, forms a random 8-regular graph, and from this recent paper and these slides it seems that 1) the right way to compute the average hop count for such a network of size n is x = log(n)/log(8), and that 2) it will be more and more narrowly centered around x for sufficiently large networks (i.e. most nodes are almost exactly the same distance away from each other) so we can model the distribution of the hop counts very simply.

Methodology. I distributed one million nodes across the globe based on population (I used the quarter-degree grid from here), then I measured the distance between ten million randomly selected pairs to acquire a sufficiently large sample of distances to draw from. Please note that those one million nodes are not about the network size.

After this, I generated ten million samples for each network sizes: for each sample, I took a normally distributed random number with mean set to the average hop count and standard deviation set to 1, rounded it, and then I summed up this many random samples from my previously generated 10 million random distances.

I may update this post with proper density graphs later.

Caveats:

I didn’t account for differences in access to technology (i.e. wealth)
I pretended there was a direct telecommunication line between any two nodes, the shortest possible around the globe, which I treated as a perfect sphere.
I set the speed to the speed of light in vacuum, so I worked with the lowest latencies that are physically possible without crossing the earth’s body. For realistic values, these numbers must be multiplied with the velocity factor of copper or fiber optics (the latter of which is especially low, around 0.7), and IP routing delays must also be taken into account.

The average hop distance calculation only works for larger networks, so the values for smaller ones (under a few hundred thousands maybe?) are not to be trusted much.

          1,000                2,500                5,000   
  ----------------     ----------------     ----------------
  hop:    3.322        hop:    3.763        hop:    4.096   
  min:    0.000 ms     min:    0.000 ms     min:    0.000 ms
  05%:   22.660 ms     05%:   30.565 ms     05%:   36.636 ms
  10%:   32.340 ms     10%:   40.971 ms     10%:   47.533 ms
  med:   75.177 ms     med:   85.701 ms     med:   93.662 ms
  avg:   78.245 ms     avg:   88.634 ms     avg:   96.518 ms
  90%:  127.885 ms     90%:  139.927 ms     90%:  149.046 ms
  95%:  144.227 ms     95%:  156.679 ms     95%:  166.048 ms
  97%:  155.098 ms     97%:  167.812 ms     97%:  177.427 ms
  99%:  176.191 ms     99%:  189.340 ms     99%:  199.358 ms
  max:  334.499 ms     max:  359.320 ms     max:  371.453 ms
  std:   37.107 ms     std:   38.470 ms     std:   39.472 ms
                                                            
         10,000               25,000               50,000   
  ----------------     ----------------     ----------------
  hop:    4.429        hop:    4.870        hop:    5.203   
  min:    0.000 ms     min:    0.000 ms     min:    0.000 ms
  05%:   42.754 ms     05%:   50.913 ms     05%:   57.205 ms
  10%:   54.093 ms     10%:   62.851 ms     10%:   69.516 ms
  med:  101.563 ms     med:  111.980 ms     med:  119.939 ms
  avg:  104.337 ms     avg:  114.700 ms     avg:  122.579 ms
  90%:  158.072 ms     90%:  169.980 ms     90%:  178.985 ms
  95%:  175.388 ms     95%:  187.689 ms     95%:  196.997 ms
  97%:  186.946 ms     97%:  199.435 ms     97%:  208.927 ms
  99%:  209.149 ms     99%:  222.139 ms     99%:  231.969 ms
  max:  348.279 ms     max:  377.982 ms     max:  389.120 ms
  std:   40.433 ms     std:   41.684 ms     std:   42.604 ms
                                                            
        100,000              250,000              500,000   
  ----------------     ----------------     ----------------
  hop:    5.537        hop:    5.977        hop:    6.311   
  min:    0.000 ms     min:    0.383 ms     min:    2.609 ms
  05%:   63.392 ms     05%:   71.767 ms     05%:   78.097 ms
  10%:   76.118 ms     10%:   84.995 ms     10%:   91.715 ms
  med:  127.805 ms     med:  138.273 ms     med:  146.143 ms
  avg:  130.397 ms     avg:  140.819 ms     avg:  148.645 ms
  90%:  187.989 ms     90%:  199.868 ms     90%:  208.735 ms
  95%:  206.237 ms     95%:  218.468 ms     95%:  227.720 ms
  97%:  218.358 ms     97%:  230.836 ms     97%:  240.195 ms
  99%:  241.756 ms     99%:  254.605 ms     99%:  264.315 ms
  max:  424.117 ms     max:  413.486 ms     max:  416.820 ms
  std:   43.526 ms     std:   44.687 ms     std:   45.549 ms
                                                            
      1 million           2.5 million           5 million   
  ----------------     ----------------     ----------------
  hop:    6.644        hop:    7.084        hop:    7.418   
  min:    4.094 ms     min:    8.333 ms     min:   10.930 ms
  05%:   84.533 ms     05%:   92.949 ms     05%:   99.410 ms
  10%:   98.478 ms     10%:  107.367 ms     10%:  114.139 ms
  med:  154.081 ms     med:  164.477 ms     med:  172.324 ms
  avg:  156.531 ms     avg:  166.891 ms     avg:  174.716 ms
  90%:  217.665 ms     90%:  229.456 ms     90%:  238.314 ms
  95%:  236.881 ms     95%:  249.055 ms     95%:  258.164 ms
  97%:  249.553 ms     97%:  262.029 ms     97%:  271.274 ms
  99%:  274.005 ms     99%:  286.877 ms     99%:  296.459 ms
  max:  477.599 ms     max:  449.907 ms     max:  473.298 ms
  std:   46.382 ms     std:   47.513 ms     std:   48.315 ms
                                                            
     10 million           25 million           50 million   
  ----------------     ----------------     ----------------
  hop:    7.751        hop:    8.192        hop:    8.525   
  min:   13.118 ms     min:   14.255 ms     min:   20.720 ms
  05%:  105.935 ms     05%:  114.545 ms     05%:  121.092 ms
  10%:  121.035 ms     10%:  130.041 ms     10%:  136.894 ms
  med:  180.303 ms     med:  190.642 ms     med:  198.541 ms
  avg:  182.630 ms     avg:  192.960 ms     avg:  200.830 ms
  90%:  247.259 ms     90%:  258.833 ms     90%:  267.686 ms
  95%:  267.321 ms     95%:  279.270 ms     95%:  288.401 ms
  97%:  280.607 ms     97%:  292.761 ms     97%:  302.036 ms
  99%:  306.161 ms     99%:  318.727 ms     99%:  328.219 ms
  max:  485.252 ms     max:  490.940 ms     max:  513.454 ms
  std:   49.125 ms     std:   50.130 ms     std:   50.907 ms
                                                            
     100 million          250 million          500 million  
  ----------------     ----------------     ----------------
  hop:    8.858        hop:    9.299        hop:    9.632   
  min:   20.335 ms     min:   12.439 ms     min:   27.559 ms
  05%:  127.585 ms     05%:  136.357 ms     05%:  142.971 ms
  10%:  143.732 ms     10%:  152.845 ms     10%:  159.746 ms
  med:  206.431 ms     med:  216.840 ms     med:  224.698 ms
  avg:  208.677 ms     avg:  219.068 ms     avg:  226.901 ms
  90%:  276.473 ms     90%:  288.148 ms     90%:  296.859 ms
  95%:  297.393 ms     95%:  309.409 ms     95%:  318.388 ms
  97%:  311.207 ms     97%:  323.509 ms     97%:  332.605 ms
  99%:  337.736 ms     99%:  350.449 ms     99%:  359.818 ms
  max:  526.276 ms     max:  541.583 ms     max:  562.844 ms
  std:   51.672 ms     std:   52.667 ms     std:   53.374 ms
                                                            
      1 billion           2.5 billion           5 billion   
  ----------------     ----------------     ----------------
  hop:    9.966        hop:   10.406        hop:   10.740   
  min:   35.537 ms     min:   40.018 ms     min:   42.853 ms
  05%:  149.582 ms     05%:  158.358 ms     05%:  164.969 ms
  10%:  166.688 ms     10%:  175.805 ms     10%:  182.750 ms
  med:  232.587 ms     med:  242.949 ms     med:  250.882 ms
  avg:  234.768 ms     avg:  245.134 ms     avg:  253.011 ms
  90%:  305.638 ms     90%:  317.235 ms     90%:  326.001 ms
  95%:  327.385 ms     95%:  339.334 ms     95%:  348.350 ms
  97%:  341.777 ms     97%:  353.901 ms     97%:  363.060 ms
  99%:  369.208 ms     99%:  381.737 ms     99%:  391.377 ms
  max:  581.850 ms     max:  564.087 ms     max:  595.455 ms
  std:   54.092 ms     std:   55.053 ms     std:   55.780 ms

happybeing · March 7, 2017, 8:55pm

Great work… Again!

And very encouraging. If these figures are representative they are much better than I was expecting.

Average latency of up to a quarter of a second means a round trip access time of around half a second, plus some device latency at the far end (eg disk access time), but with caching data these averages will reduce for popular content.

Is that a fair interpretation?

I’d be interested to know how this compares to the current web/internet too. If anyone can dig out stats, please post them.

Tim87 · March 7, 2017, 8:59pm

Things to keep in mind:

these are one-way latencies, so double them for round trip
wires don’t actually connect all possible locations along straight lines
data travels about 1.45x slower in fiber optics
routers add latencies, too
i may have made some ridiculous mistakes

In no way. We’ll be running this thing on top of the current internet, which is why we need to account for the additional delays compared to the idealistic scenario that these statistics are based on.

anon81773980 · March 7, 2017, 9:04pm

data travels about 1.45x slower in fiber optics

Odd. I wonder why it would be slower?

Tim87 · March 7, 2017, 9:07pm

Not really odd. Nothing can travel as fast as the speed of light in vacuum. As for how fast light can travel in glass, that depends on its refraction index or wutt.

mav · March 7, 2017, 11:21pm

This is really cool. 10 hops for 5B nodes with med 1way latency of 250ms is impressive. And that’s not considering improvements in latency due to opportunistic caching. I’m curious how the influence of high population low tech countries like Africa, India and China affect the model.

My initial feeling is datacenters will be the primary place for farming, so latency will probably be better than these results in real life due to high chance of close proximity.

Also group size of 8 is temporary (if memory serves me correctly) and in the future will be larger for security reasons. Any clarification on the eventual group size? Larger group sizes will reduce the number of hops and thus the latency.

neo · March 8, 2017, 12:37am

Good work Tim.

I think you find it worse than that because of repeaters that are required. So not only the slower light speed due to the medium, but also repeated induced delays. Australia to USA involves trips to different countries for both traffic to/from that country (NZ) and to provide the repeater. Then the trip to USA involves a number of repeaters

For AU to US it takes time in AU then from the AU fibre exit point to USA entry point is around an average of 144mS. A distance of around 12000Km to 13000Km for the fibre to follow That is around 8 x 10^7 meters/sec This means our fibre with all its repeaters (NO NZ hop for this test) is running over 3 times slower than c

And through raw fibre it depends on the fibre and the mode of operation. It does vary a lot from one mode to another. I had one technologist call me an idiot for saying light travels at different speeds depending on what its traveling through.

Tim87 · March 8, 2017, 1:14am

Yup. These numbers are the smallest physically possible ideal case, as already stated. We can safely assume at least 2.5x longer latencies: 1.45x simply because of the slower propagation in fiber optics, some more because we don’t take the great circle (shortest) path between the nodes, some more because of the delay at each router, and so on.

That sounds a lot worse than I expected…

Again, let me make sure it’s understood as the physically possible smallest latency. In our less idealistic (but delightfully real) world we’ll encounter 2-3 (or even much more; see above) than these values.

I’m not sure that will be a huge difference. We’d need several consecutive hops happen within the same datacenter, and regularly so, for that to matter.

Not as much as you’d think, actually. I ran the same simulation for 16 nodes out of curiosity, and the values were not much different:

         1,000                2,500                5,000      
    ----------------     ----------------     ----------------
    hop:    2.491        hop:    2.822        hop:    3.072   
    min:    0.000 ms     min:    0.000 ms     min:    0.000 ms
    05%:    8.017 ms     05%:   13.780 ms     05%:   18.106 ms
    10%:   16.115 ms     10%:   22.670 ms     10%:   27.520 ms
    med:   55.394 ms     med:   63.215 ms     med:   69.207 ms
    avg:   58.725 ms     avg:   66.461 ms     avg:   72.370 ms
    90%:  105.017 ms     90%:  114.086 ms     90%:  121.052 ms
    95%:  120.555 ms     95%:  129.938 ms     95%:  137.147 ms
    97%:  131.005 ms     97%:  140.552 ms     97%:  147.839 ms
    99%:  151.173 ms     99%:  161.038 ms     99%:  168.600 ms
    max:  334.499 ms     max:  300.443 ms     max:  314.432 ms
    std:   34.341 ms     std:   35.478 ms     std:   36.315 ms
                                                              
           10,000               25,000               50,000   
    ----------------     ----------------     ----------------
    hop:    3.322        hop:    3.652        hop:    3.902   
    min:    0.000 ms     min:    0.000 ms     min:    0.000 ms
    05%:   22.659 ms     05%:   28.596 ms     05%:   33.092 ms
    10%:   32.342 ms     10%:   38.833 ms     10%:   43.699 ms
    med:   75.199 ms     med:   83.058 ms     med:   89.006 ms
    avg:   78.252 ms     avg:   86.031 ms     avg:   91.917 ms
    90%:  127.854 ms     90%:  136.911 ms     90%:  143.712 ms
    95%:  144.185 ms     95%:  153.534 ms     95%:  160.555 ms
    97%:  155.040 ms     97%:  164.607 ms     97%:  171.765 ms
    99%:  176.156 ms     99%:  186.069 ms     99%:  193.563 ms
    max:  385.540 ms     max:  330.466 ms     max:  368.666 ms
    std:   37.105 ms     std:   38.123 ms     std:   38.892 ms
                                                              
          100,000              250,000              500,000   
    ----------------     ----------------     ----------------
    hop:    4.152        hop:    4.483        hop:    4.733   
    min:    0.000 ms     min:    0.000 ms     min:    0.000 ms
    05%:   37.682 ms     05%:   43.746 ms     05%:   48.358 ms
    10%:   48.632 ms     10%:   55.172 ms     10%:   60.130 ms
    med:   94.961 ms     med:  102.785 ms     med:  108.762 ms
    avg:   97.821 ms     avg:  105.594 ms     avg:  111.493 ms
    90%:  150.587 ms     90%:  159.529 ms     90%:  166.306 ms
    95%:  167.621 ms     95%:  176.914 ms     95%:  183.924 ms
    97%:  179.028 ms     97%:  188.488 ms     97%:  195.634 ms
    99%:  201.001 ms     99%:  210.772 ms     99%:  218.155 ms
    max:  330.752 ms     max:  378.611 ms     max:  378.589 ms
    std:   39.645 ms     std:   40.599 ms     std:   41.314 ms
                                         
        1 million           2.5 million  
    ----------------     ----------------
    hop:    4.983        hop:    5.313   
    min:    0.000 ms     min:    0.000 ms
    05%:   53.060 ms     05%:   59.233 ms
    10%:   65.099 ms     10%:   71.688 ms
    med:  114.715 ms     med:  122.583 ms
    avg:  117.392 ms     avg:  125.206 ms
    90%:  173.064 ms     90%:  182.016 ms
    95%:  190.846 ms     95%:  200.091 ms
    97%:  202.686 ms     97%:  212.106 ms
    99%:  225.538 ms     99%:  235.302 ms
    max:  369.810 ms     max:  396.017 ms
    std:   42.001 ms     std:   42.926 ms

Unfortunately for you, I got impatient and stopped it at this point. If you really want, I can run it again for 16 close nodes, or less, or more. But, you can also compute the average path length from the simple (but I hope: correct) formula in the OP, and then you can interpolate the numbers for any combination of network size and close node count.

Warren · March 8, 2017, 1:40am

@Tim87 Waiting for something like what you did for so long, really, really appreciate it because it give some sense of how much of a clearnet replacement SAFE is and even possibly insight into how quick the uptake might be. I appreciate the idealized estimates too, as those are most interesting to me. Now have to wait for David to weigh in, to be sure.

Would something tachionic reduce the latency to close to zero or would the node still introduce a latency that the formula or a varient could describe- sounds silly but would like to know for farther horizon or sense of ultimate limits.

Care to comment on use case implication?

tfa · March 8, 2017, 7:29am

Unfortunately, I don’t think that’s true because hops are defined in the xor space and not in the physical space. This means that on average data travels a quarter of earth circumference at each hop.

Tim87 · March 8, 2017, 8:03am

That’s exactly what I was trying to model.

However, instead of assuming 1/4 (actually, ~~that would be 1/2~~ EDIT I’m stupid) of the circumference of the globe, I distributed the nodes more realistically: based on population. Then, I connected them at random, because that’s what the XOR space thing will result in. I noted that it’s still not quite realistic: 1) high population places are not always wealthy enough to have access to technology, 2) networks are not straight lines between all possible locations.

Different network sizes result in different approximate distances between the nodes. Supposedly, graphs with n nodes, all of them with 8 connections (called random r-regular graphs or random k-regular graphs, depending on counting the edges or the spikes) give us a distance of log(n)/log(8), and the bigger n is, the more narrowly centered around that number this distribution is. So, I picked this many network hops at random for the given network size.

Tim87 · March 8, 2017, 8:13am

I’m not sure non-existing technology can help us reduce latency, if that’s what you mean

These calculations are just supposed to give us the lowest physically possible bound for what we can expect in terms of latency when the network gains certain levels of adoption. It’s a pretty decent number, but certain use cases, when real-time back-and-forth communication is required, we’ll may need workarounds (like how the video chat test app did, which I think switched over to a direct IP channel after the connection was agreed upon over the SAFE network.)

The most important thing comes from the idea that the average path length, thus the latency, grows proportional to the log of the network size. The actual numbers are useful guidelines which may be helpful until actual measurements are possible.

Warren · March 8, 2017, 4:36pm

Thank you. That does it. And the example of the work around is very helpful. I wonder if that workaround can become a SAFE provision/facility. So as I understand it, the latency as bottleneck might be more oriented to initial handshake where the work around would suffice. So it feels at this point like SAFE can provide a rapid or almost total replacement for clearnet.

Tim87 · March 8, 2017, 5:39pm

No, not generally. More like, almost never? Only in special cases, when the need for anonymity can be relaxed. Otherwise, it’s a big no no because 1) the participants can see each other’s IP addresses, 2) agencies such as the NSA can again collect useful metadata.

That example was cool for now, but I wouldn’t think of it as an example for our future “best practices.”

We may need to come up with workarounds for some situations that require real-time back-and-forth communication, but they should not compromise the way this example did.

neo · March 8, 2017, 11:29pm

We had discussions a while back on this “workaround” and David @dirvine mentioned that the use of relay nodes may help with the anonymity issue. Each end point talks to a relay node with the lowest latency and the relay nodes talk with each other. That way no one node/PC knows the IP addresses of both end points and the latency is kept reasonable.

Tim87 · March 8, 2017, 11:58pm

This discussion escaped my attention. Very cool!

Richard_Bradley · February 1, 2019, 4:41pm

I’m a bit late to the discussion about latency as I’m new to this forum - but this topic is fascinating.

Since you shared the minimum latency estimates, has there been any comparison with observations in a real-world implementation?

neo · February 2, 2019, 12:22am

When we have Alpha 3 up and running then there can be some measurements. What i am not sure of if there will be a specific measurement between nodes in alpha 3 or if it will be an overall time that will be measured.

Warren · February 2, 2019, 5:08am

Most annoying question ever asked on this forum but any very loose personal estimate on Alpha 3 release?

neo · February 2, 2019, 5:13am

You’ll know as much as I do, if you read the updates.

Taking all the tiding up being done on components then its a lot closer than when PARSEC was announced. That is it looks more promising now than it did back then which had so much work to be done elsewhere and much of that work is near finished.

Reading the updates we see that testing & optimisations are being done on PARSEC which is a major component of the alpha 3 release.

Topic		Replies	Views
Network efficiency and performance improvement algorithm proposal Beginners	25	765	June 3, 2021
Network size -- How will we know? Features	5	1053	October 18, 2014
Real-time, low-delay streaming between nodes? Beginners	10	1896	March 2, 2016
Internals of the Safe network Development	10	608	October 27, 2022
Speed of the safe network Features	47	1530	February 10, 2020

Statistics on the latency between nodes for different network sizes

Related Topics