MaidSafe Dev Update - 10th May 2016

Cool. The same issue, of optimizing the use of asymmetric bandwidth, has been well-studied in the bittorrent world. For example: http://phix.me/dm/

EDIT: A quick-and-dirty approach could be to test each new vault to discriminate between ones with symmetric bandwidth (i.e., droplets) and the ones on ADSL. Then throttle down the latter’s transfer of data chunks. This would allow continued testing while future versions of safe_vault were given more capability for traffic shaping on those home connections and mobiles.

15 Likes

I think this will help SAFE to gain more trust from outsiders :thumbsup:. Are there any drafts on this one? I can’t find anything in the RFC-channel on GitHub. Or is that RFC still work in progress?

5 Likes

Maybe looking into Project Maelstrom
http://blog.bittorrent.com/tag/maelstrom/
will give some insights to further progress…
I’m sure SAFE can benefit from inspirations :chipmunk:

Yes, it’s actually a couple of RFCs being worked on as we type :smiley:

2 Likes

There has been no mention of the MVP in the last 3 weekly updates. I guess it got pushed back much further.

What are the minimum features / specifications of what you would call an MVP?

1 Like
5 Likes

Thank You for the update!

4 Likes

And it’s only the beginning, the future of safe network looks bright.

But I have some questions: Is it an approximate guess or a precise measurement and what formula did you use?

1 Like

This is the total number of nodes that connected to the 200 nodes we were running. There may have been slightly more.

1 Like

Feel like doing some math here :smiley:

With approximately 590 outside nodes joining the 200 droplets, there are 790 nodes in total. With a group size of 32, the chance that an outside node does not connect to a single droplet is approximately (589/789)^32 = 0.00008654168 (every time the outside node joins another outside node, one node is removed from the set, so the exact chance is slightly lower, but we can take this as an upper bound). So there is a more than 0.9999 chance that an outside node connects to at least one droplet.

Considering 590 outside nodes, we have a (0.9999)^590 > 94% chance that all the outside nodes are connected to at least one droplet. Does an outside node have a bias towards joining one of the droplets when it first joins the network? Depending a bit on the churn, the chance that all nodes are counted could then be even a little higher than 94%.

So we have about a 5% chance that one or more nodes weren’t counted.
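For anyone who wants to check the arithmetic, here is a minimal Python sketch. It uses only the figures quoted above (590 outside nodes, 200 droplets, group size 32) and keeps the same simplification that group members are picked independently and uniformly. The variable names are made up for illustration, and the "exact" figure is simply the same calculation done without replacement.

```python
# Figures taken from the post above; everything else is illustrative.
OUTSIDE_NODES = 590
DROPLETS = 200
GROUP_SIZE = 32

total = OUTSIDE_NODES + DROPLETS  # 790 nodes in all

# Upper bound: each of the 32 group members is treated as an independent draw
# from the other 789 nodes, 589 of which are outside nodes.
p_miss = ((OUTSIDE_NODES - 1) / (total - 1)) ** GROUP_SIZE

# Same calculation without replacement: every outside node already picked
# shrinks both the numerator and the denominator by one.
p_miss_exact = 1.0
for i in range(GROUP_SIZE):
    p_miss_exact *= (OUTSIDE_NODES - 1 - i) / (total - 1 - i)

# Chance that every outside node sees at least one droplet, assuming the
# outside nodes behave independently of each other (a simplification).
p_all_counted = (1 - p_miss) ** OUTSIDE_NODES

print(f"miss chance (upper bound): {p_miss:.3e}")
print(f"miss chance (no replacement): {p_miss_exact:.3e}")
print(f"all outside nodes counted: {p_all_counted:.2%}")
```

Using the unrounded miss chance instead of 0.9999 gives a slightly higher figure than the 94% lower bound, which fits the "about 5%" conclusion.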

4 Likes

Ok, thanks.

I am trying to guess the total number of nodes in a safe network from the distance between a node name and its furthest node in the close group. Empirically I have determined the following approximate formula:

max_value / (distance / group_size)

where max_value is 2^512 - 1

If we average these values over several vaults we can get an estimated number of nodes. I found that the result can be improved by subtracting a fraction of the standard deviation of the measured values:

average - 0.3 * standard_deviation
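The two formulas above can be sketched in Python as follows. This is only an illustration, not the NodeCount.linq program: the function names are invented, the group size of 32 is assumed from the earlier post, and the use of the sample standard deviation (rather than the population one) is a guess, since the post does not say which was used.

```python
import statistics

MAX_VALUE = 2**512 - 1  # largest 512-bit name, as stated in the post


def estimate_from_vault(distance, group_size=32):
    """Per-vault estimate: max_value / (distance / group_size).

    `distance` is the XOR distance (as an integer) between a vault's name and
    the furthest member of its close group.
    """
    # Multiply first so the division is done once on exact big integers.
    return MAX_VALUE * group_size / distance


def estimate_network_size(distances, group_size=32, correction=0.3):
    """Average the per-vault estimates, then subtract 0.3 * standard deviation."""
    estimates = [estimate_from_vault(d, group_size) for d in distances]
    average = statistics.mean(estimates)
    # Assumption: sample standard deviation; needs at least two measurements.
    deviation = statistics.stdev(estimates) if len(estimates) > 1 else 0.0
    return average - correction * deviation
```

As a sanity check, if every measuring vault reported a distance of `MAX_VALUE * 32 // 1000`, each per-vault estimate would be about 1000, the standard deviation would be zero, and the corrected estimate would also be about 1000.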

Do you have a more precise formula than this one?

My simulations are open sourced on GitHub. The program is NodeCount.linq (sorry, not Rust but C#) and the results are in NodeCount.xslx.

Here are examples of error percentages I got for different network sizes (from 1000 to 100000 nodes) and different counts of measuring nodes (from 12 to 48):

| Network | 12 nodes | 24 nodes | 36 nodes | 48 nodes |
|---------|----------|----------|----------|----------|
|   1 000 |  30.28 % |  13.06 % |   7.14 % |   2.86 % |
|   2 154 |  22.00 % |   9.03 % |  11.51 % |   4.85 % |
|   4 642 |   9.84 % |  -0.77 % |   0.15 % |   1.15 % |
|  10 000 |  11.10 % |  -0.83 % |  -1.31 % |   1.37 % |
|  21 544 |  14.70 % |   2.06 % |  -0.23 % |   0.66 % |
|  46 416 |  21.71 % |   2.52 % |  -1.89 % |  -4.01 % |
| 100 000 |  18.82 % |  -2.05 % |   3.69 % |   1.32 % |

Clearly, the more measuring nodes are used, the better the result is.

4 Likes

Yes, the more nodes there are, the more balanced the binary tree is (consider the whole network as a single binary tree). Initially there is a huge imbalance, but after a few thousand nodes it gets much better. When the network is large enough, distance measurements add to security. So if you get a close_group whose members are all pretty close, then the network can be considered “big enough”, but we do not know that number just yet.

When we do then we can get some great statistics from it.

3 Likes

What are these exactly? You mean error to PUT, to GET? To reach an address with a message?

No, no network is running in my simulation.

My program only creates a set of 512-bit names, each representing a node. Then I compute the close group of a number of nodes and apply a formula that is supposed to estimate the total number of nodes. The error percentage indicates the difference between the computed number and the real number of nodes in my simulation.
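The procedure described above can be sketched roughly like this in Python. This is not the C# program itself, just a small reconstruction under stated assumptions: names are uniformly random 512-bit integers, a close group is the 32 nearest other nodes by XOR distance (the node itself is excluded), and the 0.3 correction and sample standard deviation are applied as in the earlier formula. All function names are hypothetical.

```python
import random
import statistics

NAME_BITS = 512
MAX_VALUE = 2**NAME_BITS - 1


def furthest_close_group_distance(name, all_names, group_size):
    """XOR distance from `name` to the furthest member of its close group."""
    # Assumption: the close group is the group_size nearest *other* nodes.
    distances = sorted(name ^ other for other in all_names if other != name)
    return distances[group_size - 1]


def simulate(network_size, measuring_nodes, group_size=32, seed=0):
    """Return (estimated node count, error percentage) for one simulated network."""
    rng = random.Random(seed)
    # No real network is run: a node is nothing more than a random name.
    names = [rng.getrandbits(NAME_BITS) for _ in range(network_size)]
    sample = rng.sample(names, measuring_nodes)
    estimates = [
        MAX_VALUE * group_size
        / furthest_close_group_distance(name, names, group_size)
        for name in sample
    ]
    estimate = statistics.mean(estimates) - 0.3 * statistics.stdev(estimates)
    error_pct = 100.0 * (estimate - network_size) / network_size
    return estimate, error_pct


if __name__ == "__main__":
    for measuring in (12, 24, 36, 48):
        estimate, error = simulate(1000, measuring, seed=measuring)
        print(f"{measuring} measuring nodes: estimate {estimate:.0f}, error {error:+.2f} %")
```

The numbers printed will differ from the table above, since they depend on the random seed and on details of the original program that are not given in this thread.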

2 Likes

I can’t open this link. Anyone else?

It’s working on my end.

1 Like

I’m wondering what the spread of network traffic is… Will it be evenly distributed, apart from some weighting relative to a node’s performance? Might that be a lot of traffic for small nodes to deal with, or does a weighting and balancing algorithm take bandwidth and response into account? That is, will a node offering a small resource see less traffic than a high-resource, droplet-type node? What parameters are considered?

I haven’t studied your simulation so please forgive me if you have already answered this somewhere:

Your simulation results would apply to the droplet network only if the conditions also apply. For example:

  1. There are nine droplets hard-coded into the launcher config and, presumably, your vaults are not. I know for a fact that hard-coded vaults get a fuller view of the network, as measured by routing table size. They will also receive more client connections. Did your simulation also have such an asymmetry?

  2. Your simulation was not based on running actual vaults but calculations based on the algorithms, correct? If that is the case, then you don’t know how well the actual vaults conform to their theoretical ideal as represented by the simulation.

  3. How “lumpy” (for want of a better term) is the actual network? You don’t know.

Yes, my simulations suppose that:

  • The vault names are randomly dispersed in the XorName space

  • The close group of a vault really contains its nearest vaults

This isn’t necessarily in contradiction with your observations about the routing table, because the close group is a subset of the routing table and I believe that a node becomes a vault only when the inner part of the routing table is the node’s close group.

But currently the main cause of imprecision is the low number of measuring vaults. I hope I will be able to do better later, when I succeed in doing measurements from outside of a vault.

You say that the size of the error is due to the low number of measuring vaults but intuitively I would have guessed that it would actually be due to the ratio of number of measuring vaults to total number of vaults. Why is my intuition incorrect?

Also, looking at your table, although there is a trend of lower error with greater number of measuring vaults, there is also an awful lot of bumpiness along the way. Why is that?