Profiling vault performance

The creative in me says yes, there are ways it may be used.

However the engineer in me says no. Randomness (when required) should be taken from a source with a known quality of randomness. Since this variability may (will?!) be reduced in the future, it should not be used as a source of randomness.

Sorry for late reply, I’ve been away :slight_smile: :palm_tree:

2 Likes

That makes a lot of sense to me. Along that line, we are aware that SAFE has a minimum node-count threshold for security. Along the same line, above that threshold, or above some measurable and consistently knowable quality of randomness, could this randomness be usable and potentially reliable? I understand we want to eliminate it and don’t want to rely on it, as that could be a point of breakage or vulnerability. Still, if I’ve understood correctly, new approaches use noise as a channel, and it makes me also wonder about noise as a clock or time signature. Say, an index of the network’s potentially unique noise, gathered sequentially, as a clock. All the nodes of another network might have a tough time being in the same place at the same time. The network’s biometric print across time? It could be hard for another net to spoof or estimate this to internal precision.

1 Like

This test focuses on the effect of group size and quorum size on upload time and network performance.

Summary

As usual, the results first:

Upload time increases dramatically as group size increases.

Upload time does not change significantly as quorum size increases.

Methodology

This test uses the same software versions as the last test, primarily centred around testing vault 0.11.0, only modified for additional logging.

The file uploaded for these tests is Hippocampus (downloadable from vimeo), and is 96.2 MiB.

  • Compile vault with custom group and quorum size
    • Deploy and start 28 vaults on 7 pine64s
    • Wait for network to be formed
    • Upload 96.2 MiB file using demo_app
    • Record the time to upload and the number of hop messages received by vaults
    • Stop vaults and remove old logs
    • Repeat 5 times with the same group/quorum size
  • Repeat with a new group/quorum size

The tests spanned group sizes from 4 to 14 and quorum sizes from 2 to 12.

The results below show the median from five repetitions.
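
For anyone wanting to reproduce the aggregation step, here is a minimal sketch of taking the median of five repetitions. The upload times in `main` are made-up placeholders, not measurements from these tests, and the function name `median` is just an illustration.

```rust
/// Median of a set of upload times in minutes; None when there are no samples.
fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    // f64 is not Ord, so sort with partial_cmp; this assumes no NaN values.
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("upload times must not be NaN"));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        Some(sorted[mid])
    } else {
        // Even number of samples: average the two middle values.
        Some((sorted[mid - 1] + sorted[mid]) / 2.0)
    }
}

fn main() {
    // Hypothetical upload times (minutes) for one group/quorum configuration.
    let runs = [19.0, 35.5, 20.5, 22.0, 18.5];
    println!("median upload time: {:?} minutes", median(&runs));
}
```

The median is less sensitive than the mean to a single slow run, which matters given how much the individual runs varied.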

Upload Times

How many minutes did it take to fully upload the 96.2 MiB file? (I’ll put a dynamic table here if the forum can allow <table> tags)

Hop Messages

How many million hop messages were received by all vaults? (I’ll put a dynamic table here if the forum can allow <table> tags)

Variation Due To Group Size

The effect of different group sizes can be observed by keeping the quorum size the same. There is a strong increase in upload time and the number of hop messages as the group size increases.

Variation Due To Quorum Size

There is relatively little increase in upload time as the quorum size increases. Hop count does seem to increase slightly, but much less than it does when changing group size.

Raw Data

The raw log data for all tests is available on the alpha network at http://www.mav.safenet/2016_10_10_group_size_test.7z. Download is 80 MiB, and when decompressed is about 4.8 GiB. For a more permanent link, it’s also available on mega.

Discussion

These results seem to match intuition, but the increase in upload time was more than I expected.

It’s reassuring to see quorum size only affects security / durability of data and is not a factor in network scaling (as expected).

I’m not sure how disjoint groups will perform, since there will be significant variation in group size. Not saying it’ll be bad, just saying I don’t know.

As @AndreasF pointed out in a previous post about expected number of hops, there is room for improvement on hops but security must come first. Will disjoint groups improve the hop messages situation?

Does the small size of my network (28 vaults) affect the results? I wouldn’t think so, since any node only sees part of the network anyhow, but I’m open to speculation about this.

Asides

  1. To compare with the alpha network (group 8 quorum 5), it took 8.5m to upload the same 96.2 MiB file to the alpha network. This is an average rate of 1.55 Mbps. speedtest.net shows my connection can reach 35.16 Mbps upload, so it wasn’t saturated. I reckon that’s pretty good performance, at least compared to the pine64 network which was about 20.5m for similar group and quorum sizes. I’d say this is due to two main differences: alpha has more nodes, and the nodes are (presumably) on more powerful machines than the pine64s I’m using.

    This result was a surprise to me. I expected alpha to be slower than my dedicated network. Shows that cpu performance matters when latency is low.

  2. Received hop message count includes those required for the network to start up, register an account, etc. (the point being this was consistent across all tests).

  3. There was a huge variation within the results of each configuration, which is still very surprising to me. Hover over any of the cells in the tables above to see what I mean (hovering shows the data for all five tests).

  4. The last 1% (going from 99% to 100%) always takes much longer than any other. If a normal percent passes in 25s the last percent will pass in about 400s. What is happening in that last 1% that takes so long? It’s very frustrating!

  5. The first 10% (going from 0% to 10%) is almost instant (300ms). I don’t know whether this is because the progress bar is faulty in the demo app or if the vaults are really saving 10% of the file very quickly.

  6. The amount of time for the network to start (ie populate the routing table) increased as group size increased. I didn’t measure it, but subjectively it was quite noticeable.

  7. There appears to be a loose correlation between upload time and hop message count. This wasn’t the point of the test, but anyway here’s the chart for anyone curious:

  8. The Disjoint Groups RFC has an interesting point in the drawbacks section related to these tests:

    The group message routing will involve more steps, either doubling the number of actual network hops, or making the number of direct messages per hop increase quadratically in the average group size!

    Although this seems to be addressed in the Test 11 update:

    we will implement the new group message routing mechanism. It will be slightly different from the one specified in the RFC, however, delivering the same level of security but without the huge increase in the number of hops or hop messages.

    It’s still not clear how this will perform relative to the existing hop mechanism but it is being considered.

  9. There were some tests that did not complete and were discarded. This was either due to very slow bootstrapping of the network (mostly with large group sizes), or the demo app showing a Request Timeout error.

  10. The safe network is amazing. Seeing it come together is a real privilege.

Main Point

My main takeaway from these tests is the underlying performance characteristics of the safe network arise as a result of being primarily a messaging platform, with data storage being ‘merely’ the end result of that. Messaging is the key. This conclusion is not surprising in hindsight, but the tests have shifted my balance of thought strongly toward messaging and away from data.

25 Likes

A fascinating read. We all owe you a big thanks for the time and work put in!

10 Likes

Awesome, thanks for the detailed analysis!
Most of the numbers are intuitive, although something weird seems to happen with quorum size 2, going from 8 to 10 nodes.

Disjoint groups are not about performance, and at first, they may well impact it negatively: the average group will probably be at least 150% of the minimum group size and we’re not working on optimisations yet. That RFC is about defining the web of trust of the node keys, and how to authenticate message senders.

But in many cases, those two things - the cryptographic web of trust and the actual web of TCP connections or who sends direct messages to whom - won’t need to coincide, and that’s where we’ll be able to significantly reduce the number of hop messages in the future: if A signs something for B it doesn’t matter what actual route that signature takes in the network. In theory, not even nodes in the same group have to be fully interconnected. The notion of “connected” that the RFC really refers to is just “having each other’s public key”.

12 Likes

This test is around performance of vaults with structured data. I admit to being slightly provocative because it may be compared to the bitcoin transaction rate (which is currently under some heavy contention).

The test is to modify structured data as rapidly as possible. This should hopefully find the maximum transaction rate for my personal safe network (which isn’t especially powerful).

Motivation

SafeCoin will be implemented as a structured data, and is intended to scale very efficiently, both in cost to the end user and load to the network.

Since the cost to update structured data is zero, it’s worth investigating the impact of high loads on the network, and what the transaction rate of SafeCoin may be.

Setup

Using Test 11 software versions, which is required for low level apis.

  • safe_launcher 0.9.2
  • safe_core 0.22.0
  • safe_vault 0.12.1 (assumed from release date of 2016-10-19)
  • routing 0.27.1
  • custom script to modify structured data

Modifications were only to remove check for peers on LAN. Account limits don’t matter for this test, since updates to SD don’t count toward usage. Group size is 3 and Quorum size is 2.

Hardware for the network is same as all prior tests:

  • Network of 7 pine64s running 4 vaults each on gigabit ethernet
  • Client is laptop with quadcore i7-4500U @ 1.80GHz, 8GB RAM uploading on wifi at 300 Mbps

Method

The general idea is to rapidly modify structured data until an issue comes up.

The test script does the following:

  • Create a structured data containing 30 random characters
  • Measure the time taken to update that data X times with 30 new random characters
  • Repeat several times to obtain an average update rate
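
The steps above can be sketched roughly as follows. This is not the script used for the test: the real update goes through the launcher API, which is replaced here by a placeholder closure, and the names `time_updates` and `updates_per_second` are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Calls `update` `count` times and returns how long the whole run took.
fn time_updates<F: FnMut(u32)>(count: u32, mut update: F) -> Duration {
    let start = Instant::now();
    for i in 0..count {
        update(i);
    }
    start.elapsed()
}

/// Converts a timed run into updates per second.
fn updates_per_second(count: u32, elapsed: Duration) -> f64 {
    count as f64 / elapsed.as_secs_f64()
}

fn main() {
    // Placeholder for the structured data value; the real test sent
    // 30 new random characters to the network on every update.
    let mut value = String::new();
    let updates_per_round = 100;
    let rounds = 5;
    let mut rates = Vec::new();
    for _ in 0..rounds {
        let elapsed = time_updates(updates_per_round, |i| {
            value = format!("{:030}", i); // stand-in payload of 30 characters
        });
        rates.push(updates_per_second(updates_per_round, elapsed));
    }
    let average = rates.iter().sum::<f64>() / rates.len() as f64;
    println!("average rate over {} rounds: {:.0} updates/s", rounds, average);
}
```

Because the placeholder does no network work, the rate it prints says nothing about vault performance; only the structure of the measurement carries over.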

Results

Update Rate

How long does it take to update SD, and how many updates / second can be achieved?

Whether updating 1 time or updating 100 times in rapid succession, the rate was usually between 120 - 130 updates per second.

However there was some variation where up to 300 u/s were seen.

The good news is that increasing the load did not cause any decrease in update rate (as expected).

CPU load across all vaults was negligible.

CPU load on the client was 100% of one core (presumably due to the single-threaded nature of the launcher and safe_core).

Maximum Updates and Failure Point

The maximum number of updates before an error showed was about 1000 updates in a row.

The error returned is simply ‘EOF’, so the launcher server seems to drop requests (ie not the vaults flaking out). This is good news in one way (the launcher is a relatively easy fix) but bad in others (the launcher should be more stable than this). The other aspect to this bottleneck is it makes it difficult to test the limits of vaults with respect to DOS using SD updates.

I had intended to test many concurrent requests to the launcher, but it threw errors with only a low synchronous load, so I didn’t see much point pushing it further.

Hopefully the revamped authenticator and safe_core will expose a more robust interface to the network.

Conclusions

The update rate for structured data is between 100 - 300 updates per second per user (on my private safe network).

The current bottleneck to increasing this is the launcher, which throws errors when the load becomes too high (around 1000 updates in a row).

It costs nothing to update structured data, and it can be updated at a fairly fast rate. This leads to the notion that some vault rate limiting may be required when dealing with structured data updates.

I’m not terribly satisfied with this test, as it doesn’t really profile the vaults at all. But it satisfied a curiosity about the approximate order of magnitude to expect for structured data performance.

Asides

  • The launcher UI froze after running these tests. The server was still running but the tabs wouldn’t change. I assume this is from the logging tab which has a lot of work to do in a very short time. Once the UI locked it never recovered, needing to restart the launcher to make the UI work again.
  • The value of the SD was not checked, so it’s assumed it was updated to contain the correct value and there were no race conditions. This is a pretty big assumption, but these details were outside the scope of this test.
  • I didn’t test this on the live test 11 network.
  • The rates in this test are for a single user, and the overall network transaction rate would presumably be much higher with many concurrent users. I didn’t get into modifying the launcher to test this, mainly due to the imminent overhaul to safe_launcher > authenticator and structured_data > mutable_data.
  • Other simultaneous upload activity such as immutable data may affect this rate in the real world.

26 Likes

You are doing absolute great work for SAFE. Thank you very much :+1:.

Is it safe to say that each Disjoint Group (say between 8 and 20 nodes) can handle at least 100 Ts/sec? And does this mean that we could scale to 10,000 Ts/sec with 100 groups? As you say, it all depends on how many other structured data objects need to be handled by that group as well, but this looks promising.

7 Likes

In this test the group size is 3 and quorum size is 2, and with a bigger group size we can expect the number of SD (MD) updates to be lower, because more messages are needed between nodes to reach consensus.

But even if we divide the final number of transactions by 10 or 20, a single disjoint sector (group) is capable of more transactions than the whole bitcoin network. Multiply by hundreds, thousands or millions of sectors, which the network is able to support, and the network capacity is simply amazing.

10 Likes

As @digipl says, there are a few factors that affect the tx rate…

  • Group size and quorum size will be larger on the real network and thus expected to be slower than my test network. However since the vaults in this test never came anywhere close to breaking a sweat the result of 100 tx/s is extremely conservative (from a vault perspective). The actual tx/s that vaults can handle would be much higher but couldn’t be tested because the launcher couldn’t handle that much data. To put out a haphazard guess: since cpu load didn’t change on the vaults during testing and is measured in 0.25% increments, there’s potential for at least a 400-fold improvement in tx rate from the vault side taking the estimate to at least 40K tx/s/group.

  • Global transaction rate increases as the network size increases (ie number of groups), so there is no upper bound on global transaction rate. This is an amazing property of the safe network and is so different to bitcoin that comparing tx rates between the networks is basically an instant red-flag for trolling (guilty!).

  • The rate will be affected by other work groups have to do such as storing immutable data, churn etc. However prioritizing certain data may help retain the high overall tx rate (if that’s considered a priority in the first place).
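
The back-of-envelope numbers in this discussion can be written out as a small sketch. It assumes every group sustains the same rate independently, and that a CPU reading which never moved by one 0.25% increment bounds the load at that increment; both are simplifications, and the function names are illustrative only.

```rust
/// Network-wide rate if every group sustains the same rate independently.
fn global_tx_rate(per_group_tx_per_sec: f64, groups: u64) -> f64 {
    per_group_tx_per_sec * groups as f64
}

/// Headroom implied when the load stayed below one measurement increment,
/// e.g. a 0.25% CPU resolution leaves room for a 400-fold increase.
fn headroom_factor(cpu_resolution_percent: f64) -> f64 {
    100.0 / cpu_resolution_percent
}

fn main() {
    let measured = 100.0; // tx/s per group, the conservative figure from the test
    println!("100 groups: {} tx/s", global_tx_rate(measured, 100));
    let factor = headroom_factor(0.25);
    println!("vault-side estimate: {} tx/s/group", measured * factor);
}
```

Group size, quorum size, churn and other traffic would all pull the real figure away from this estimate.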

12 Likes

Vaults with Disjoint Sections

It’s been a long time between tests, so let’s see how vaults perform with the new Disjoint Sections feature.

Results first: I couldn’t get the network started. But the test was still very interesting for other reasons.

Versions

Same versions as Test 12.

Modifications

Methodology

Same as prior tests:

  • Start 28 vaults on 7 pine64s
  • Time the duration to upload a large file (ubuntu-16.04-server-amd64.iso 655 MiB)
  • Repeat nine times and take median upload time

Results

The network never got started so no uploading could be done.

Observations

Resource Proof

v0.13.0 of the vault introduces another new feature besides Disjoint Sections: resource proof (ie upload speed must be at least about 6 Mbps)

Resource proof finds its way into the vault via the routing source code.

Of most interest is the verify_candidate method.

The candidate must provide a certain size of data (RESOURCE_PROOF_TARGET_SIZE = 250 MiB) in a certain amount of time (RESOURCE_PROOF_DURATION_SECS = 300 s) which ends up being a little over 6 Mbps.

This is an overly simple calculation, since there are other factors of timing to consider such as the time taken to traverse routing paths. From the source code comment: “This covers the built-in delay of the process and also allows time for the message to accumulate and be sent via four different routes”. This added complexity is very interesting from a security perspective, as it potentially allows nodes to alter the perceived performance of other nodes on their routing path. Enabling the network to ‘monitor itself’ creates many interesting considerations. The links above are a good starting point for more detail.
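
The simple version of the calculation can be reproduced as below. It only divides the target size by the allowed time and ignores routing delays; `required_bps` is a name made up for this sketch, and the two constants are the values quoted above rather than imports from the routing crate.

```rust
/// Bits per second needed to deliver `target_size_bytes` within `duration_secs`.
fn required_bps(target_size_bytes: u64, duration_secs: u64) -> f64 {
    (target_size_bytes * 8) as f64 / duration_secs as f64
}

fn main() {
    // Values as quoted from the routing source (size in bytes, time in seconds).
    const RESOURCE_PROOF_TARGET_SIZE: u64 = 250 * 1024 * 1024;
    const RESOURCE_PROOF_DURATION_SECS: u64 = 300;

    let bps = required_bps(RESOURCE_PROOF_TARGET_SIZE, RESOURCE_PROOF_DURATION_SECS);
    println!("required upload speed: {:.2} Mbps", bps / 1_000_000.0);
}
```

Changing either constant and rerunning gives the bandwidth implied by a custom setting on a private network.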

Private safe networks would most likely want to modify these parameters to make it faster to join the network, although at this stage the bandwidth requirement should be trivial to complete for any local network.

The implementation is quite elegant, and it is easy to see how it can be extended to a cpu proof of resource.

My main doubt about the current resource proof is that it uses the SHA3-256 hash function as the basis of proof (with trivial difficulty), yet the majority of current operations on the network are signature validation operations. The real-world performance of a node (especially one with custom hardware) depends on signature verification, so proving they have fast hashing isn’t necessarily going to determine how useful their ‘real’ performance will be on the network. Again, this is a slightly-too-simplistic look at things, but is a starting point in the consideration of resource proof. Hashes are perfectly useful as a means to determine bandwidth, but I have doubts about how long it will remain that way due to their disconnect with actual resource demands.

Proof Of 6 Mbps

I first tried running the vaults with the original 6 Mbps proof setting. The gigabit network should trivially handle this proof, and the logs showed the expected message:

Candidate XXXXXX… passed our challenge in 0 seconds.

However shortly after, log messages began showing the challenge taking some nodes 10+ seconds.

The consecutive times to pass the challenge from the first vault log were 0 0 1 2 2 4 7 5 8 6 10 12 8 31 18

This is still way under the 300s threshold, but the degree of variation seems like it can get quite large. It begs the question ‘what exactly causes it’ and ‘how far can it go’ and finally ‘can it be exploited by other nodes to their advantage’?
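
To put numbers on that variation, here is a small sketch that summarises the challenge times listed above and compares the worst case with the 300 s limit. The `Summary` type and `summarize` function are made up for this illustration; the data is the sequence from the first vault log.

```rust
/// Summary of challenge times in whole seconds.
#[derive(Debug, PartialEq)]
struct Summary {
    min: u32,
    max: u32,
    mean: f64,
    median: f64,
}

/// Returns None for an empty slice.
fn summarize(times: &[u32]) -> Option<Summary> {
    if times.is_empty() {
        return None;
    }
    let mut sorted = times.to_vec();
    sorted.sort_unstable();
    let n = sorted.len();
    let sum: u32 = sorted.iter().sum();
    let median = if n % 2 == 1 {
        sorted[n / 2] as f64
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) as f64 / 2.0
    };
    Some(Summary {
        min: sorted[0],
        max: sorted[n - 1],
        mean: sum as f64 / n as f64,
        median,
    })
}

fn main() {
    // Consecutive challenge times (seconds) from the first vault log.
    let times: [u32; 15] = [0, 0, 1, 2, 2, 4, 7, 5, 8, 6, 10, 12, 8, 31, 18];
    // Allowed time for the challenge, as quoted from the routing source.
    let limit_secs = 300.0;
    let s = summarize(&times).expect("at least one sample");
    println!("{:?}", s);
    println!("worst case used {:.1}% of the allowed time", s.max as f64 / limit_secs * 100.0);
}
```

For this sequence the median is 6 s, the mean is 7.6 s and the maximum is 31 s, so the slowest challenge used roughly a tenth of the allowed time.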

The variation is concerning to me, but resource proof is a complex topic and one I’ve only just started exploring. I’m sure there will be many interesting developments in the near future as the topic is explored further.

As to a reason for this delay… subjectively, there were a lot of log messages ‘Rejected XXXXXX… as a new candidate: still handling previous one.’ Unfortunately I don’t have the time to investigate this more deeply just now.

It’s tempting to draw conclusions from observations, but I think it’s important to take observations as-is and not make incorrect assumptions about the potential causes. I’m not familiar enough with resource proof to draw much meaning from this test, but find the observations interesting in their own right.

Ultimately my network never got started with 6 Mbps resource proof. The largest I saw the routing table get to was 7, which was from a sample of five attempts each given about half an hour to start.

Proof Of 27 bps

I changed the proof from

TARGET_SIZE = 250 * 1024 * 1024 = 6 Mbps

to

TARGET_SIZE = 1024 = 27 bps

The reason for lowering target size and not increasing allowed time was because I wanted the vaults to acknowledge quickly and the network be operational sooner, not simply to accept lower bandwidth.

The network would still not start. The cause isn’t clear from the logs.

Proof Of 0 bps

My expectation for 0 bps proof is still failure, since it seems that messages are not getting through regardless of the proof requirement.

0 bps resource proof also failed to start. I’m not sure how the Test 12 network was started in the first place. If there’s anything I might have overlooked that could help get the network started, I’d be interested to know.

Conclusion

I couldn’t get a network to start, so the performance of vaults with Disjoint Sections could not be tested.

The implementation of resource proof is extremely interesting. I’m looking forward to seeing how it progresses in the future.

14 Likes

Always love to see your results and really respect your very unbiased and logical approach. I’ve been seeing commits on github improving log messages so perhaps those changes will help and stick around! There is talk about test 12b coming soon!

7 Likes

I have a very old Intel Q6600 and I have succeeded in running a local network of about 50 nodes on it. Besides removal of check for peers on same LAN like you have done, I have modified handle_candidate_identify() method of routing/states/node.rs module:

        let (difficulty, target_size) = if self.crust_service.is_peer_hard_coded(&peer_id) ||
                                           // A way to disable resource proof in local tests but not in global network
                                           self.crust_service.has_peers_on_lan() ||
                                           self.peer_mgr.get_joining_node(&peer_id).is_some() {
            (0, 1)
        } else {
            (RESOURCE_PROOF_DIFFICULTY,
             RESOURCE_PROOF_TARGET_SIZE / (self.peer_mgr.routing_table().our_section().len() + 1))

This is my way to implement the 0 bps proof of resource. The advantage is that the same binary can be used for both local network (with 0 bps PoR) and global TEST network (with PoR as programmed by Maidsafe).

I also do not start the vaults all at once. I start the first one with RUST_LOG=info ./target/debug/safe_vault -f and then I launch a group of 10 vaults with a 6 second delay between each vault with the following script:

for i in {1..10} 
do
	echo "Launching vault $i"
	./target/debug/safe_vault&
	sleep 6
done

I launch it 5 times, but each time I wait for the routing table size to reach the next multiple of ten before launching the next group. This can take several minutes, and CPU usage can be high.

Lastly, I have modified vault_with_config() method of safe_vault/vault.rs module:

        chunk_store_root.push(CHUNK_STORE_DIR);

        // To allow several vaults on same station
        use rand::{self, Rng};
        use rustc_serialize::hex::ToHex;
        chunk_store_root.push(rand::thread_rng().gen_iter().take(8).collect::<Vec<u8>>().to_hex());

The aim is that the vaults on the same station do not share the same chunk store directory, because I am afraid that there is lock contention on the chunk file created by vaults handling the same chunk. It doesn’t explain why your network doesn’t start up, but I wonder if it could explain the observed slowdown during uploads and its variability.

In conclusion, I would say that Maidsafe made it very difficult to run a local test (I mean a real one, not a mock one), with the following obstacles to overcome:

  • check that vaults are not on same LAN
  • sharing of a common chunk store directory
  • costly Proof of Resource

Ideally, they could add a simple -l flag in safe_vault program to allow such a use case. I would thank them a lot if they implement it.

10 Likes

Great test and write up @mav, thanks. Also @tfa thanks for sharing your experiments. I love reading these :slight_smile:

@mav:

As to a reason for this delay… subjectively, there were a lot of log messages ‘Rejected XXXXXX… as a new candidate: still handling previous one.’ Unfortunately I don’t have the time to investigate this more deeply just now.

You may know this, but if not it could help… Only one node is tested per section at any time, so any others trying to join that fall into a given section get rejected and will keep trying until they get to be the node under test.

The smaller the network, the fewer sections, so for these tests it’s not surprising you see a lot of those messages as your nodes start up and begin clamouring “me next, me next”! :slight_smile:

5 Likes

This is very promising. I’ll try adding your changes to my vaults.

I was initially using a delay of 3s between starting vaults, which worked well on past networks. I changed it to 60s just to be sure (26m startup time!!) but it made no difference.

Could you use chunk_store_root in safe_vault.vault.config? That’s how I’ve been managing multiple chunk stores on the same machine. Are you aware of any benefit to your technique instead of the configuration option?

I agree but I think maidsafe are focused on the right stuff at the moment. I’m in no hurry to get a ‘local network’ feature. It gives me a genuine motivator to dig into the code. Good list of obstacles though, it’s handy for future reference.


Having implemented the change to handle_candidate_identify the network still isn’t starting.

The first vault to be tested (ie the second to start) passes the challenge and is accepted into the routing table.

The second vault to be tested passes the challenge but is never accepted into the routing table. The log message that seems relevant to this is ‘Timed out waiting for ack’. The full log message is:

TRACE 22:12:58.311444296 [routing::states::common::bootstrapped bootstrapped.rs:103] Node(8f734b..) - Timed out waiting for ack(hash : b994..) UnacknowledgedMessage { routing_msg: RoutingMessage { src: Section(name: 319d4c..), dst: Section(name: 319d4c..), content: CandidateApproval { candidate_id: PublicId(name: 319d4c..), client_auth: Client { client_name: 1452df.., proxy_node_name: 8f734b.., peer_id: PeerId(9e17..) }, sections: {Prefix(): {PublicId(name: 8f734b..), PublicId(name: d2bcb9..)}} } }, route: 2, timer_token: 16 }

Tempting as it is to chase this rabbit, I’m going to wait for a new release. The sort of testing I’m aiming for in this thread is meant to be simple to report and simple to follow (eg comparison of upload timing), and since this test is getting beyond that scope I’m leaving it there :slight_frown:

4 Likes

I suppose that I would have either to duplicate the binary file together with its config file (50 times!) or to modify the config file before each invocation which is more complex to program than what I did (and which is risky because I don’t know if the config file is only read at start up).

1 Like

FYI, one more obstacle added recently for running a local network:

ERROR 15:45:33.575315300 [crust::main::bootstrap mod.rs:184] Failed to Bootstrap: (FailedExternalReachability) Bootstrapee node could not establish connection to us.
INFO 15:45:33.575814800 [routing::states::bootstrapping bootstrapping.rs:210] Bootstrapping(0ebae3..) Failed to bootstrap. Terminating.

The workaround I found was to pass the constant CrustUser::Client instead of the crust_user variable in the 2 calls to start_bootstrap in src/states/bootstrapping.rs, eg:

                let _ = if self.client_restriction {
                    CrustUser::Client
                } else {
                    self.crust_service.set_service_discovery_listen(true);
                    CrustUser::Node
                };
                let _ = self.crust_service.start_bootstrap(HashSet::new(), CrustUser::Client);

With all the mentioned modifications in this topic I still can run a local network on my station with current master branch, but I am afraid that one day this won’t be possible anymore.

3 Likes

Disjoint Sections (part 2)

I managed to start a network using Test 12c versions of software. The performance of disjoint sections appears to be about 2.5 times slower than pre-disjoint-sections.

Parameters

vault 0.13.1
routing 0.28.2
launcher 0.10.1
demo_app 0.6.2
group size 8 (default)
quorum 60% (default)
removed limits such as upload volume and peers on lan
upload file is hippocampus.mp4 (96.2 MiB)

Results: Local Network

The upload on local test12c network took 50m.

The comparison with pre-disjoint-section would be with a group size of 8 and quorum of 5. Using the table from a prior post this upload would take about 21m (group 8 quorum 6). So disjoint sections appears about 2.4 times slower in this test.

The upload profile has a similar look to pre-disjoint-sections, where it starts out looking fast but ends up taking quite a bit longer, and has a sharp increase at the last one percent.

Results: Alpha1 v Test 12c

Alpha took 8.5m to upload 96.2 MiB
Test 12c took 24.1m to upload the same file.

This is about 2.8 times slower, so again not a great result for disjoint sections.
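
The slowdown figures in this post are plain ratios of upload times, sketched below for anyone who wants to plug in their own timings. The function name `slowdown` is made up for this example; the minutes are the values reported above.

```rust
/// How many times slower `new_minutes` is than `baseline_minutes`.
fn slowdown(new_minutes: f64, baseline_minutes: f64) -> f64 {
    new_minutes / baseline_minutes
}

fn main() {
    // Local network: 50m with disjoint sections against about 21m before.
    println!("local: {:.1}x", slowdown(50.0, 21.0));
    // Public networks: Test 12c at 24.1m against Alpha at 8.5m.
    println!("public: {:.1}x", slowdown(24.1, 8.5));
}
```

Both comparisons use the same 96.2 MiB file, so the ratio of times is also the inverse ratio of average throughput.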

The ETA profile is familiar.

General Observations

  • The LAN routing table never got above 12 nodes. There should have been 27, so this is something that warrants further investigation.
  • The network starts very easily and much more quickly. The delay between starting each node can be drastically reduced. Before disjoint sections, I was delaying 20s between each node starting, but with disjoint sections I leave only 2s.
  • The main errors I’m seeing in the vault logs are
crust::main::active_connection active_connection.rs:171
{ code: 104, message: "Connection reset by peer" }

crust::main::connection_listener mod.rs:137
{ code: 11, message: "Resource temporarily unavailable" }

12 Likes

Strange. Your results show an upload speed of 66.5 KB/s, but I ran my own upload test with 12c, using directories of about 100 MB, and the speed of the real network is near 300 KB/s.

Seems logical to think that the local test should have been faster than the real network.

2 Likes

I like your tests, but I read some time ago that no optimization for network performance has been done yet.

Maybe a team member could give us some info on it, because in the last few days some topics have involved network speed and I’m starting to get curious too. (Maybe they have no idea until optimization is done.)

1 Like

Agreed. I am dubious of the quality of the pine64s, but they have a gigabit ethernet port and quadcore cpu so they should be great. Testing with a normal network file copy shows they do function as expected. So… I can’t really explain it.

I’m very much considering moving away from the pine64s for testing.

2 Likes