Observation about packet serialization efficiency

While trying out the recently released demos (0.3.0), I took a look under the hood at the wire format.

What surprises me is that a large part of every network packet consists of 0x18 bytes, sometimes in long spans (one is marked in green). The “inner packet” seems to be interleaved with them.

Is this intentional? I haven’t investigated far, only far enough to see that the packet serialization is based on CBOR, but it seems inefficient to me. At least it would compress very well :slightly_smiling:


We are looking at CBOR quite closely at the moment. It seems to be more future-proof than bincode. Bincode is small and much faster, so it is under investigation too. Adam in the office wrote a report on this; we will make it public and post a link here.


Thanks for the information. I’m not sure CBOR itself is at fault here; it could be that the specific serialization implementation has a bug in choosing the size of consecutive spans.

100%, although we see the same with two implementations of CBOR right now.


This is the report @dirvine is talking about: https://docs.google.com/document/d/1bjnqhbAFbgzK3RCPfWrTlwmYNWo8yH_i9i7f-pMAoGA/edit?usp=sharing


Thanks @adam, that helps. Hope this helps @wumpus (note that for speed etc. only release-mode tests are valid; Rust can be quite mad in debug mode in terms of profiling).

Right, maybe it’s not the CBOR implementation, but the use of CBOR.
I’ve noticed that byte arrays don’t use the CBOR “byte string” major type (major type 2), but are represented as arrays of small integers, one array element per byte. As there are many byte arrays due to nested serialization, this seems to be the main overhead. In a rough re-encoding test I’ve noticed that using the proper type for byte arrays saves more than half the packet size, sometimes two thirds.
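
To illustrate the difference, here is a minimal hand-rolled sketch of the two encodings. It follows the CBOR head rules (an argument below 24 fits in the initial byte, 24 to 255 takes a 0x18-style "one byte follows" marker plus the value), but the function names and the sample payload are made up for this example and are not taken from the project's code.

```python
def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR head: major type (0-7) plus an unsigned argument n."""
    if n < 24:
        return bytes([(major << 5) | n])          # argument fits in the initial byte
    if n < 1 << 8:
        return bytes([(major << 5) | 24, n])      # additional info 24: one byte follows
    if n < 1 << 16:
        return bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    if n < 1 << 32:
        return bytes([(major << 5) | 26]) + n.to_bytes(4, "big")
    return bytes([(major << 5) | 27]) + n.to_bytes(8, "big")


def as_int_array(data: bytes) -> bytes:
    """Major type 4 (array) holding one major type 0 (unsigned int) per byte."""
    return cbor_head(4, len(data)) + b"".join(cbor_head(0, b) for b in data)


def as_byte_string(data: bytes) -> bytes:
    """Major type 2 (byte string): a head with the length, then the raw bytes."""
    return cbor_head(2, len(data)) + data


# Hypothetical 32-byte payload, e.g. a hash; every value here is >= 24.
payload = bytes(range(200, 232))

array_form = as_int_array(payload)
string_form = as_byte_string(payload)
print(len(array_form), len(string_form), array_form.count(0x18))
```

For an unsigned integer the major type is 0, so the "one byte follows" initial byte is exactly 0x18. Every payload byte with a value of 24 or more therefore costs two bytes in the array form, the first of which is 0x18, while the byte string form stores it as a single byte.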

Thanks @adam.


Yes, we have seen something similar: encode the same thing twice and it’s almost twice the size. We have not had time to dig into it, as we really wanted to use the more future-proof-looking version. We do have a mechanism to upgrade and change serialization though, similar to multiformats/multihash (self-describing hashes, for future proofing), which will make sense.
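
The doubling can be reproduced with a small sketch, assuming each layer simply re-serializes the already serialized bytes as a CBOR array of unsigned integers. The helper names and the payload are hypothetical, chosen only for this illustration.

```python
def encode_as_int_array(data: bytes) -> bytes:
    """Encode bytes as a CBOR array (major type 4) of unsigned ints (major type 0)."""
    n = len(data)
    if n < 24:
        out = bytearray([0x80 | n])
    elif n < 1 << 8:
        out = bytearray([0x98, n])
    elif n < 1 << 16:
        out = bytearray([0x99]) + n.to_bytes(2, "big")
    else:
        out = bytearray([0x9A]) + n.to_bytes(4, "big")
    for b in data:
        if b < 24:
            out.append(b)            # small values fit in a single byte
        else:
            out += bytes([0x18, b])  # 0x18 = "uint, one byte follows"
    return bytes(out)


def longest_run(data: bytes, value: int) -> int:
    """Length of the longest run of consecutive bytes equal to value."""
    best = current = 0
    for b in data:
        current = current + 1 if b == value else 0
        best = max(best, current)
    return best


# Hypothetical 32-byte inner payload; every value is >= 24.
inner = bytes(range(100, 132))

once = encode_as_int_array(inner)
twice = encode_as_int_array(once)
thrice = encode_as_int_array(twice)

for layer in (once, twice, thrice):
    print(len(layer), layer.count(0x18), longest_run(layer, 0x18))
```

Because 0x18 is itself 24, each 0x18 marker from one layer is re-encoded as 0x18 0x18 in the next. Under this assumption the size roughly doubles per layer, and the runs of 0x18 grow longer with each level of nesting.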


Hi Wladimir, great to see more eyes scrutinizing the project…we welcome your involvement.


You could thank @ioptio for that, she brought this project to my attention :slightly_smiling:


We will :smiley: We are testing bincode and msgpack as well during this phase, so we will see how it all goes. I like CBOR for being (perhaps) more future-proof. Thanks again, and welcome to this place. Hope you stick around, as you will be a great source of valuable feedback for sure.
