Profiling node performance

Vaults with Disjoint Sections

It’s been a long time between tests, so let’s see how vaults perform with the new Disjoint Sections feature.

Results first: I couldn’t get the network started. But the test was still very interesting for other reasons.

Versions

Same versions as Test 12.

Modifications

Methodology

Same as prior tests:

  • Start 28 vaults on 7 pine64s
  • Time the duration to upload a large file (ubuntu-16.04-server-amd64.iso 655 MiB)
  • Repeat nine times and take median upload time

Results

The network never got started so no uploading could be done.

Observations

Resource Proof

v0.13.0 of the vault introduces another new feature besides Disjoint Sections: resource proof (ie upload speed must be at least about 6 Mbps)

Resource proof finds its way into the vault via the routing source code.

Of most interest is the verify_candidate method.

The candidate must provide a certain size of data (RESOURCE_PROOF_TARGET_SIZE = 250 MiB) in a certain amount of time (RESOURCE_PROOF_DURATION_SECS = 300 s) which ends up being a little over 6 Mbps.

This is an overly simple calculation, since there are other factors of timing to consider such as the time taken to traverse routing paths. From the source code comment: “This covers the built-in delay of the process and also allows time for the message to accumulate and be sent via four different routes”. This added complexity is very interesting from a security perspective, as it potentially allows nodes to alter the perceived performance of other nodes on their routing path. Enabling the network to ‘monitor itself’ creates many interesting considerations. The links above are a good starting point for more detail.

Private safe networks would most likely want to modify these parameters to make it faster to join the network, although at this stage the bandwidth requirement should be trivial to complete for any local network.

The implementation is quite elegent, and is easy to see how it can be extended to a cpu proof of resource.

My main doubt of the current resource proof is it uses the SHA3-256 hash function as the basis of proof (with trivial difficulty), yet the majority of current operations on the network are signature validation operations. The real-world performance of a node (especially one with custom hardware) depends on signature verification, so proving they have fast hashing isn’t necessarily going to determine how useful their ‘real’ performance will be on the network. Again, this is a slightly-too-simplistic look at things, but is a starting point in the consideraiton of resource proof. Hashes are perfectly useful as a means to determine bandwidth, but I have doubts about how long it will remain that way due to their disconnect with actual resource demands.

Proof Of 6 Mbps

I first tried running the vaults with the original 6 Mbps proof setting. The gigabit network should trivially handle this proof, and the logs showed the expected message:

Candidate XXXXXX… passed our challenge in 0 seconds.

However shortly after, log messages began showing the challenge taking some nodes 10+ seconds.

The consecutive times to pass the challenge from the first vault log were 0 0 1 2 2 4 7 5 8 6 10 12 8 31 18

This is still way under the 300s threshold, but the degree of variation seems like it can get quite large. It begs the question ‘what exactly causes it’ and ‘how far can it go’ and finally ‘can it be exploited by other nodes to their advantage’?

The variation is concerning to me, but resource proof is a complex topic and one I’ve only just started exploring. I’m sure there will be many interesting developments in the near future as the topic is explored further.

As to a reason for this delay… subjectively, there were a lot of log messages ‘Rejected XXXXXX… as a new candidate: still handling previous one.’ Unfortunately I don’t have the time to investigate this more deeply just now.

It’s tempting to draw conclusions from observations, but I think it’s important to take observations as-is and not make incorrect assumptions about the potential causes. I’m not familiar enough with resource proof to draw much meaning from this test, but find the observations interesting in their own right.

Ultimately my network never got started with 6 Mbps resource proof. The largest I saw the routing table get to was 7, which was from a sample of five attempts each given about half an hour to start.

Proof Of 27 bps

I changed the proof from

TARGET_SIZE = 250 * 1024 * 1024 = 6 Mbps

to

TARGET_SIZE = 1024 = 27 bps

The reason for lowering target size and not increasing allowed time was because I wanted the vaults to acknowledge quickly and the network be operational sooner, not simply to accept lower bandwidth.

The network would still not start. The cause isn’t clear from the logs.

Proof Of 0 bps

My expectation for 0 bps proof is still failure, since it seems that messages are not getting through regardless of the proof requirement.

0 bps resource proof also failed to start. I’m not sure how the Test 12 network was started in the first place. If there’s anything I might have overlooked that could help get the network started, I’d be interested to know.

Conclusion

I couldn’t get a network to start, so the performance of vaults with Disjoint Sections could not be tested.

The implementation of resource proof is extremely interesting. I’m looking forward to seeing how it progresses in the future.

14 Likes