RFC 58 - Reliable Message Delivery

Discussion topic for RFC 58.

20 Likes

Wow. Keep them coming. And I thought my reading was finished for the night.

10 Likes

Final Comment Period

The Reliable Message Delivery RFC will remain open for the next 10 days to allow any final comments to be made.

Thank you to all who have contributed! :slightly_smiling_face:

3 Likes

if the current section contains the destination, broadcasts the message to the nodes belonging to the destination

Does this mean that a single message can be sent to a section, targeting multiple nodes in it?

The Delivery Group is a subset of the Elders of the section that contains at least a third of the total number of Elders in the section. This can be chosen eg. as the third of the Elders that are closest to the message hash, or closest to the message destination, but other ways are also possible.

Choosing by closest to message destination seems less secure IMO, since the path will be much less dynamic. I can't say if it's at all significant, or how it could be exploited, but it seems to me that part of the defence is the pseudorandomness of the message path, and the longer the path stays the same, the wider the window of attack.
Message content, on the other hand, is decided by the sender... I can't say if this concern is even remotely close to any feasible attack.
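To make the two selection options concrete, here is a minimal sketch of picking a Delivery Group by XOR distance. It is not taken from the RFC or the routing code: the function names, the use of SHA-256 for names and message hashes, and the seven-Elder section are all assumptions for illustration. "At least a third" is read here as the ceiling of n/3.

```python
import hashlib
from math import ceil


def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two equal-length names, as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def delivery_group(elders: list[bytes], target: bytes) -> list[bytes]:
    """Return the ceil(n/3) Elders closest to `target` in XOR space.

    `target` can be the message hash or the destination name; the RFC
    text quoted above allows either (and other schemes too).
    """
    size = ceil(len(elders) / 3)
    return sorted(elders, key=lambda e: xor_distance(e, target))[:size]


# Hypothetical 32-byte names for a section with 7 Elders.
elders = [hashlib.sha256(f"elder-{i}".encode()).digest() for i in range(7)]
destination = hashlib.sha256(b"destination").digest()
msg_a = hashlib.sha256(b"message A").digest()
msg_b = hashlib.sha256(b"message B").digest()

# Keyed on destination: the same group for every message to that destination.
by_destination = delivery_group(elders, destination)

# Keyed on message hash: the group can differ from message to message.
by_hash_a = delivery_group(elders, msg_a)
by_hash_b = delivery_group(elders, msg_b)
```

With the destination as the key, every message to the same address goes through the same subset of Elders, which is the "less dynamic" path discussed above. With the message hash as the key, the subset depends on content chosen by the sender.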

In an edge case scenario, all Elders from Y will end up in Y1. Y0 will then have new Elders

Just thinking, isn't it desirable to keep the Elders at least approximately evenly distributed across a split? Creating a new section with an entirely new set of Elders, drawn from Adults on a fast lane, seems to bypass, at least to some degree, the selection process for becoming an Elder. (But IIRC the current section split rules don't allow any influence on Elder destinations, so that practice would be incompatible with the logic.)
Also, wouldn't it be desirable in order to avoid latency of this form:

but it will take some time for them to connect to us

Edit: The probability of all Elders ending up in one section is very small. But with a normal distribution I'm sure the aggregated latency introduced in the network as a whole is, at least potentially, more than insignificant?


What size of Elder group is secure? This is critical to prevent an exponential increase in per hop messages.

Do you mean that by knowing this size, we can put a cap, and thereby prevent the exponential increase (by preventing Elder group increase over this cap)?

2 Likes

What are the use-cases for inter-section messaging? Will messages often need to be passed to non-neighboring sections? For instance, is OwnerGET for UnpublishedImmutableData one of them? If so, it seems like this routing is inefficient.

E.g. let's say that the SAFE network is small and has 10 sections. I've got a graphic below depicting 10 numbered sections in a ring shape to illustrate the neighboring relationships. (I'm guessing that after the SAFE network is released, there should be hundreds, thousands, or more sections.)

If I'm understanding the RFC correctly, when Section 1 (s1) needs to send a message to Section 3 (s3), it would need to go through Section 2 (s1 → s2 → s3). If s1 needs to send a message to s9, it would go s1 → s0 → s9. Not too bad. But if s1 needs to send a message to s6, it would need to go s1 → s2 → s3 → s4 → s5 → s6 (or take the opposite path, which is equally lengthy).

This example is with the SAFE network having 10 sections (i.e. not many users). When the SAFE network is live, won't there be hundreds or thousands of sections? Sequential traversal seems to be a potentially major source of congestion if there are use cases for sections sending messages to other sections more than a couple of hops away.

2 Likes

A neighbour is a section that's one bit away in xor space, not a sequential distance.

So maximum hops is len(section_prefix) since each hop gets 1 bit closer to the destination section.
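A small sketch of that hop bound, under the simplifying assumption that every section prefix has the same length (a real network can have prefixes of differing lengths). The `next_hop` and `route` names are hypothetical and are not from the routing library. Each hop fixes the first bit that still differs from the destination prefix.

```python
def next_hop(current: str, destination: str) -> str:
    """Flip the first bit in which `current` differs from `destination`."""
    for i, (c, d) in enumerate(zip(current, destination)):
        if c != d:
            return current[:i] + d + current[i + 1:]
    return current  # already at the destination


def route(source: str, destination: str) -> list[str]:
    """List every section prefix visited, source and destination included."""
    if len(source) != len(destination):
        raise ValueError("this sketch assumes equal-length prefixes")
    path = [source]
    while path[-1] != destination:
        path.append(next_hop(path[-1], destination))
    return path


# 16 sections means 4-bit prefixes, so at most 4 hops.
path = route("0001", "0110")
worst_case = route("0000", "1111")
```

Here `path` visits 0001, 0101, 0111, 0110 (3 hops), and `worst_case` takes 4 hops because all four bits differ.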

9 Likes

Which is also the binary logarithm of the total number of sections (log2 (n)). For example with 16 sections the maximum number of hops is 4.

2 Likes

The section is the Elders in the section in this case, so if the destination is that section, all Elders get the message.

I feel the same. Initially we used the message signature (less deterministic) for this reason, but we cannot find an attack using the destination address yet. It lets us exploit this fixed route to help secure delivery, as we know the nodes we spoke to in order to update the section knowledge are the same nodes that will receive the next message. So it helps with lag, but at the moment it seems 50/50. So we do this for now and, through tests, measure any issues.

The chance is small but covered in any case. The pre-split Elders create the two new Elder groups and their keys before the split. I think the BLS RFC will show that.

6 Likes

Reliable Message Delivery is now moving into the implementation phase.

Progress can be tracked via this project board: Reliable Message Delivery (N/3) · GitHub

8 Likes

From assumptions

Elders: their responsibilities are network control, messaging and consensus layer.

Do Elders store chunks, or do they only do "network control, messaging and consensus layer"?


From General mechanism of delivery

A subset of nodes in section X decide to send a message. (This can be, for example, a single Elder or Adult, or a whole section. It can also be a client connected to section X.)

Is RMD used when clients do GET requests for chunks (it seems so: "can be a client connected to section X")?

In that case, how does caching work? Since no Adults will see the request (RMD hops only happen between Elders), there's no way for cache to be used.


From Complexity and reliability analysis

A hop X with delivery group DGX passing the message to the next hop Y with delivery group DGY will generate up to |DGX|*|DGY| messages

If a node is able to respond from cache, how does it put a 'stop' to the chain of hops continuing towards the source? Since there are several routes happening simultaneously in RMD, some of them may have already been dispatched.
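For a feel of the numbers behind the quoted |DGX|*|DGY| bound, here is a rough sketch. The section size of 7 Elders and the helper names are assumptions for illustration only, and the Delivery Group size is taken as the ceiling of a third of the Elders. The figures are upper bounds, matching the "up to" in the quoted text.

```python
from math import ceil


def delivery_group_size(elder_count: int) -> int:
    """Smallest group holding at least a third of the Elders."""
    return ceil(elder_count / 3)


def messages_per_hop(dg_x: int, dg_y: int) -> int:
    """Upper bound: every member of DGX sends to every member of DGY."""
    return dg_x * dg_y


def messages_on_path(group_sizes: list[int]) -> int:
    """Sum the per-hop upper bounds along a path of delivery groups."""
    return sum(
        messages_per_hop(a, b) for a, b in zip(group_sizes, group_sizes[1:])
    )


dg = delivery_group_size(7)              # 3 Elders per delivery group
one_hop = messages_per_hop(dg, dg)       # up to 9 messages for one hop
four_hops = messages_on_path([dg] * 5)   # 5 sections on the path, 4 hops
```

Under these assumptions a single hop costs up to 9 messages and a 4-hop path up to 36, so a cached response at one node removes only that node's share of the next hop's messages.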

1 Like

Only when they have to.

If we had only direct connect/Get, then probabilistic caching would not be used. That's caching as we know it; however, deterministic caching (i.e. increasing the holders of data) is probably still viable.
Right now we don't have direct connect/Get clients. It is something always worth looking at though.

They would stop only their own forwarding of the request. The rest continue, but again these will all probably get it from cache, or perhaps from the next hop if the data is popular. This needs measurement, and it is something we should set up in testnets to confirm the behavior.

5 Likes