Ed25519 vs BLS performance

The SAFE Network uses two types of asymmetric keys: Ed25519 keys and BLS keys.

Sometimes the choice is driven by functionality, e.g. multisig is built into BLS whereas Ed25519 requires custom code for multisig.

Sometimes the choice is driven by performance (otherwise, why not just use BLS everywhere?).

Out of curiosity I had a look at how much difference there is between the two for signing and verifying.

All values are in microseconds per signature/verification and are the average of 1000 samples. BLS uses a single keypair, i.e. no combining of signature shares is required, as a standard Safecoin wallet would be doing.

| | BLS | ED25519 | BLS/ED25519 |
| --- | --- | --- | --- |
| Sign 32 bytes | 2220.78 | 19.36 | 114.72 |
| Sign 1 MiB | 5822.96 | 5826.58 | 1.00 |
| Verify 32 bytes sig | 5983.90 | 50.52 | 118.45 |
| Verify 1 MiB sig | 9478.69 | 2624.92 | 3.61 |

Signing small messages, BLS is about 100 times slower.

Signing large messages, they are identical, which was a big surprise to me.

Verifying signatures of small messages, BLS is about 100 times slower.

Verifying signatures of large messages, BLS is about 3 times slower.

This wasn’t produced with any particular goal in mind, but I thought the difference was interesting so I’ll put the figures here.
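For reference, the measurement loop is simple. Here is a minimal sketch of the methodology (mean of many samples, reported in µs per operation), with a placeholder operation standing in for the real ed25519-dalek / threshold_crypto sign and verify calls:

```rust
use std::time::Instant;

/// Time `op` over `iters` runs and return the mean cost in microseconds.
fn mean_micros<F: FnMut()>(iters: u32, mut op: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        op();
    }
    start.elapsed().as_micros() as f64 / iters as f64
}

fn main() {
    let msg = vec![0u8; 32];
    let mut sink = 0u8;
    // Placeholder work: XOR-fold the message. To reproduce the table,
    // call e.g. keypair.sign(&msg) or pk.verify(&msg, &sig) here instead.
    let micros = mean_micros(1000, || {
        sink ^= msg.iter().fold(0u8, |acc, b| acc ^ b);
    });
    println!("{:.2} µs per op (sink = {})", micros, sink);
}
```

The `sink` variable only exists to stop the compiler optimising the placeholder work away; the actual benchmark code is in the repo linked below.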


Me too, I had thought BLS was always slower. BTW @mav, were these done using the libs we are using? I know some BLS impls are quite a bit slower again.



Code is here: https://github.com/iancoleman/ed25519_vs_bls_key_benchmarks

```toml
ed25519-dalek = "1.0.0"
threshold_crypto = "0.4.0"
```

Not sure if there’s any significant difference between ed25519-dalek 1.0.0 vs 1.0.0-pre.3 as used in maidsafe/routing, but I assume not.

Would be interested in others results.


Here are my results (similar to yours):

| | BLS | ED25519 | BLS/ED25519 |
| --- | --- | --- | --- |
| Sign 32B (µs) | 3428.03 | 27.91 | 122.84 |
| Sign 1MiB (µs) | 8354.38 | 7677.05 | 1.09 |
| Verify 32B (µs) | 8492.67 | 73.78 | 115.11 |
| Verify 1MiB (µs) | 13616.37 | 3924.91 | 3.47 |

As a random question, what does this mean for the ultimate performance of SAFE? How many verify operations are actually carried out? With a naive reading, I see 8 ms/verify32 = ~120 verifies/second per core? How many is a vault actually performing?
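That back-of-envelope figure checks out against the tables (a hypothetical helper, using the µs-per-verify numbers for 32-byte messages from the second set of results; it comes out at roughly 118 BLS verifies per second per core):

```rust
/// Convert a per-operation cost in microseconds to operations per second.
fn verifies_per_second(micros_per_verify: f64) -> f64 {
    1_000_000.0 / micros_per_verify
}

fn main() {
    // µs-per-verify figures for 32-byte messages from the tables above.
    println!("BLS:     {:.0} verifies/s per core", verifies_per_second(8492.67));
    println!("Ed25519: {:.0} verifies/s per core", verifies_per_second(73.78));
}
```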


Would be good to know. For any individual vault it depends on

  • is elder or not
  • how many other nodes it’s connected to, i.e. network size and section size
  • the current rate of GET / PUT from clients
  • the amount of churn happening

So it’s a complex equation.
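As a very rough illustration of why it matters (a toy model only, with a made-up load figure; it ignores batching, aggregation, signing, and everything else a vault does):

```rust
/// Fraction of one CPU core consumed by verification alone, given the
/// per-verify cost in microseconds and a rate of verifies per second.
/// Toy model: assumes each verify runs to completion on one core.
fn core_fraction(micros_per_verify: f64, verifies_per_sec: f64) -> f64 {
    micros_per_verify * verifies_per_sec / 1_000_000.0
}

fn main() {
    // Hypothetical load of 50 verifies/s, costs from the tables above.
    println!("BLS 32B:     {:.2} of a core", core_fraction(8492.67, 50.0));
    println!("Ed25519 32B: {:.4} of a core", core_fraction(73.78, 50.0));
}
```

At that hypothetical rate, BLS verification alone would eat over 40% of a core, while Ed25519 would be well under 1%.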

I’d be interested to see this info logged at some point so we can get some idea of it in testnets.


Hmm… If this is going to be some kind of a bottleneck, is it possible to make ASICs for these signature processes?


IMO this shows we should consider BLS a lot more. For multisig it’s great; however, when not doing multisig we should be using ed keys. Right now clients are all BLS, but that may need to change. It’s not a huge deal (it’s just another algo), so Signature for us may need to become

```rust
// Sketch only; exact variants and wrapped types to be decided.
pub enum Signature {
    Ed25519(ed25519_dalek::Signature),
    Bls(threshold_crypto::Signature),
    BlsShare(threshold_crypto::SignatureShare),
}
```
So not a showstopper in any sense. It’s things like these checks @mav and others do that allow us to be significantly more efficient. We knew BLS is much slower, but seeing it laid out is a compelling picture and stops us from ignoring that fact. Many client actions will be small messages, so it makes sense: if a data item is individually owned then we use ed keys; otherwise we use BLS keys. So all BLS use will be multisig and never individual key use, which is actually nice.


I thought it’s typically the hash of a message you’re signing, not the message itself, because of the slowness of pubkey crypto? But BLS seems not to be that bad for large msgs.


Yes, good point.

BLS (threshold_crypto) uses SHA3-256 for signatures.

Ed25519 (ed25519-dalek) uses SHA2-512 for signatures.

I added these two hash functions to the test; these are the results:

| | SHA2-512 | SHA3-256 | SHA2 / ED25519 Sign | SHA3 / BLS Sign |
| --- | --- | --- | --- | --- |
| 32B (µs) | 0.358 | 0.494 | 0.0194 | 0.0002 |
| 1 MiB (µs) | 2607.408 | 3477.901 | 0.5081 | 0.6141 |

For small messages signature hashing is a tiny portion of the total time, less than 2%.

For large messages signature hashing is around 50-60% of the total time.

Added to the benchmarking tool in commit 4d64af4.


Should you be comparing sha2-512 to sha3-512?

The idea is to see what portion of the signing time is spent doing the hash, so the test must use the underlying hash of each implementation.

For BLS (github.com/poanetwork/threshold_crypto) this is SHA3-256.

For Ed25519 (github.com/dalek-cryptography/ed25519-dalek) this is SHA2-512.

So the comparison, despite the different output lengths, is valid since it lets us see the portion of each signature’s time spent on hashing.

This made me wonder though if we could remove the dependency on SHA2-512 and use only SHA3-256, since ed25519-dalek has a function sign_prehashed. But then I saw that within that function SHA2-512 is called anyhow, so there’s no avoiding it. Nice little comment to go with it:


There is this FPGA implementation for BLS12-381 by Ben Devlin

In the architecture docs they describe performance characteristics:

performance was benchmarked vs the Rust bls12_381 crate on a 32GB, 3.7GHz i5-9600K CPU. FPGA is running at 200MHz. … see a 2.9x speedup in the final ate pairing.

I would have expected more than a 2.9x speedup. They list four points under Future Optimizations, but anyway, this was an interesting result to me.

Video presentation on the library on youtube: Ben Devlin | Blockchain Acceleration Using FPGAs.


Supranational have a BLS implementation with Rust wrappers; it’s “a BLS12-381 signature library focused on performance and security. It is written in C and assembly.”


I added this to the benchmarks (commit d71b596), it’s fast.

(TC is threshold_crypto currently used by sn_node etc, BLST is supranational blst crate)

| | TC | BLST | TC/BLST |
| --- | --- | --- | --- |
| Sign 32 bytes (µs) | 2379.36 | 192.81 | 12.34 |
| Sign 1 MiB (µs) | 5934.10 | 2736.71 | 2.17 |
| Verify 32 bytes sig (µs) | 6235.49 | 846.80 | 7.36 |
| Verify 1 MiB sig (µs) | 9937.81 | 3471.24 | 2.86 |

Signing small messages, blst was about 12x faster than threshold_crypto.

Verifying small messages, blst was about 7x faster than threshold_crypto.

Signing and verifying large messages, blst was about 2-3x faster than threshold_crypto.

There are two modes for blst, one for minimal-pubkey-size operations and one for minimal-signature-size operations. This test was done with minimal-signature-size operations, although I did test with minimal-pubkey-size operations and it was a little slower but still much faster than threshold_crypto.

Be aware this is a very basic test and group multisig operations haven’t been tested, but on the first impression it seems pretty fast.


ed25519-dalek has support for an avx2 backend which speeds up signature verification by about 1.5x.

See https://github.com/dalek-cryptography/ed25519-dalek#benchmarks

Results from my own tests:

Verify 32B with / without avx2: 36,452 ns / 57,232 ns

Verify 1 MB with / without avx2: 2,359,297 ns / 2,818,746 ns

Using avx2 gives up to 1.57x faster signature verification on my laptop.

It was a bit messy for me to get working; I couldn’t get the `--features=avx2_backend` flag to work since it’s a curve25519-dalek feature, not an ed25519-dalek feature.

Potentially an easy win to implement at some stage since the default is not to use avx2.


I have not looked, but is this a simd type thing where we just need architecture feature gates?

Seriously @mav thanks for all you do here. It’s like having our own research department. I wish we could pay you somehow!


Yes it’s simd, via packed_simd. However:

WARNING: this crate [packed_simd_2] only supports the most recent nightly Rust toolchain and will be superceded by stdsimd.

There are some crates.io quirks meaning packed_simd has become packed_simd_2; see packed_simd/issues/303:

We cannot [push to crates.io packed_simd]. We lack the crates.io permissions. The original maintainer is out of contact.

If you adjust your dependency entry like this it should pick up the crate we published under an alternative name.

```toml
packed_simd = { version = "0.3.4", package = "packed_simd_2" }
```

Also see curve25519-dalek/3.1.0/Cargo.toml L49-51:

```toml
# The original packed_simd package was orphaned, see
# https://github.com/rust-lang/packed_simd/issues/303#issuecomment-701361161
packed_simd = { version = "0.3.4", package = "packed_simd_2", features = ["into_bits"], optional = true }
```

To get avx2 working on my machine I changed my local copy of ed25519-dalek Cargo.toml like this:

```diff
-curve25519-dalek = { version = "3", default-features = false }
+curve25519-dalek = { version = "3.1.0", default-features = false, features = [ "avx2_backend" ] }
```

Also, to build with avx2 I had to use Rust nightly, which is not really ideal. Stable gives this error:

```text
error[E0554]: `#![feature]` may not be used on the stable release channel
  --> /home/ian/.cargo/registry/src/github.com-1ecc6299db9ec823/subtle-2.2.3/src/lib.rs:12:34
   |
12 | #![cfg_attr(feature = "nightly", feature(external_doc))]
   |                                  ^^^^^^^^^^^^^^^^^^^^^
```

Good to keep a record of this messing about while it’s fresh.


Great to see your expertise in this area being shared, @mav! I always enjoy seeing your posts and the positive impact they have on the project.