Eventually consistent global counter ("SafeTime")

EDIT/NOTE Can we avoid mixing this idea with NTP and Real Time and other ways that require to connect the Safe Network to external sources? The very essence of this idea is to avoid that.


UPDATE I ran a simulation with over a million nodes for this method. The summary is in the 18th post.


UPDATE #2 Code for the simulation can be found here: http://pasted.co/6acd2429


I searched the forum about this topic and it seems Safe doesn’t know about time. However, it is trivial to add an eventually consistent global counter (“SafeTime”), which is almost as good.

This would enable something very important: signing contracts natively on the Safe Network. If everyone knows the current SafeTime, they can accept or deny a request to countersign a contract without checking the time from outside the Safe Network.

How else could it be used? That’s for future app developers to decide. Maybe to record the order of some events for the app. Maybe to set deadlines relative to network latency. There are many possibilities.

Finally, the algorithm: SafeTime is sent in every message. The current SafeTime is calculated as the median of the latest values of SafeTime received from the node’s peers plus one (but never less than the previous value).

Nodes are connected randomly across the world, so the counter would be kept in sync globally (with a small but expected error) while also growing at a steady but throttled rate.

11 Likes

Any ideas on implementation?

Would be useful.

It’s there. Maybe it got lost on you because of how simple it is:

2 Likes

I think maybe it was because you used the quote format for that paragraph. :wink: Anyway, it’s an attractively simple idea with plenty of use cases on the face of it anyway. I wonder how it would interact with PARSEC which will impose its own idea of the ‘correct’ order of events?

5 Likes

It doesn’t need to, it can stand on its own. Maybe it can use information about the trustworthiness of peers from other layers to ignore bad peers but it’s not absolutely necessary because the median is already robust against a small number of rogue agents.

It doesn’t have to be used for ordering, that’s just a use case. PARSEC does that in a different context anyway. SafeTime would be useless to impose strict ordering for events following each other too quickly.

I apologize. I tried to emphasize it only to achieve the opposite.

1 Like

The biggest problem with this is that there is no global consensus happening. Each section would have its own event counter, but each section has events happening at different times and more importantly different rates.

There simply is no one event timer for the network. That is a point of decentralisation, that there is no central authority or event timer/keeper.

Contracts are still signed by one or more of the parties involved and as been discussed before in a few topics, the best way to have a timestamp is to have the parties agree on the timestamp themselves. These could be people or machines or even IoT devices.

I can see a use for the highly accurate NTP servers moving over to SAFE and providing signing services where people prefer them to their own PC’s time. The signature can incorporate the NTP server’s signature to authenticate the time stamp. Where as with any theoretical global event timestamp only gives a impossible to verify timestamp and the reason is that its impossible to know the time from an event number. Events occur at different times and the variance can be many times the difference between events.

So not only is it impossible (without centralising the network) to have a global event timer, it would be impossible to connect an actual time to an event number.

So the best so far is what I mentioned above which is what others have come up with in the past.

1 Like

Maybe we should still try to exploit latency events :loop:
abusing them for a chronological clockwork service :curly_loop:

We have been exploring in 2016 this Timekeeping idea

1 Like

There is no need for connecting actual time to event number. BTC blocks does not do that either. Counter is enough for smart contract logic, humans writing the contracts will do approximate mapping from global time to global counter. But I don’t know whether it is possible to create global counter on SafeNet. Can’t local groups vote for some global counter? Time delay between increasing counter events can be longer to prevent overload. If global counter is increased once every minute it is still 10x better than BTC has.

6 Likes

Building a SafeClock to keep SafeTime is an inherently global process. Clock syncronization is an old problem Poincare and Einstein and telegraph operators were infatuated with over a hundred years ago. In the end it requires a lot of message passing back and forth, a lot of mapping between values, and a significant amount of trust. There are a lot of very clever ways to go about it, but the point of SAFE is not to be a big clock syncronization machine, so I would say this one is better left to the app layer.

As in the case of neo’s suggestion above, you could have an NTP App. Users could pay puts to periodically broadcast “time chunks” that would be broadcast to all sections in the network, be incorporated via PARSEC into data chains, and serve as marker events baked into the network with a global “real-world” time period / frequency.

This would/could be a rather expensive process, and maybe it should be. People who really care about imprinting time on a timeless network will need to donate or find ways to pay for the feature. The makers of the NTP app could perhaps charge a subscription, but then again, who pays for NTP services now?

6 Likes

NOTE Can we avoid mixing this idea with NTP and Real Time and other ways that require to connect the Safe Network to external sources? The very essence of this idea is to avoid that.

I’m not 100% familiar with the design, but I assumed nodes are connected to nodes from other sections too. How else would traffic flow between the sections otherwise?

If you look, this method works with messages regardless of section membership and it is naturally robust against rogue nodes by its use of the median.

I understand. A decentralized eventually consistent global counter would be useful. That’s why I suggested this method.

As long as there is no better way. This suggestion is for such a “better way.” It would be better because the parties wouldn’t need to look outside the Safe Network for a time reference.

That is outside of the scope of this proposal. It’s a counter with no regard to time, as @Antifragile rightly noted.

Again, this idea is not about actual time, and definitely not about a service that would link the Safe Network to external references.

On the contrary, it is about avoiding external references.

I argue that my method adds a timestamp-like functionality that is enough for signing contracts natively, within the Safe Network.

Which of these is better for the network?

  • Access to a timestamps-like global counter with a value that all nodes agree on (plus/minus a few steps) without referencing anything outside the network.
  • Not having such functionality.

Dear @Neo, before I would go on and accept that my idea is impossible, may I ask you to please demonstrate why it is impossible or why it would require centralization? Please note and answer my first comment, the one about sections and nodes, while at it.

Yes, I saw that one. It’s similar to my idea but it’s much more complicated.

1 Like

Let me answer this one separately, because it demonstrates a misunderstanding about my proposal.

Clock synchronization is complex, it does require many loops back and forth, but my proposal does none of that because it is not about a synchronized clock.

My proposal is a simple one-way counter. There is no synchronization: no feedback loops, not even additional messages: all messages contain SafeTime, but no message is about SafeTime.

Let me explain the algorithm again. It won’t waste much space because it’s one sentence and a single expression:

A node will keep in mind the last value of SafeTime it received from connected nodes (regardless of section boundaries) and it will calculate SafeTime as:

my_safe_time = max(my_safe_time, median(safe_time_from_peers) + 1)

(The calculation uses the median because it is naturally robust against outliers, may they be caused by rogue nodes or mistakes.)

Subsequent messages sent to any node will contain this value and those nodes will use it to update their own idea of SafeTime. This will result in a number that grows one step at a time locally and that is synchronized globally by virtue of nodes being connected to other nodes across the globe.

Some areas will run one or two steps ahead the average and others will fall one or two steps behind. Nevertheless, the randomly connected nature of the Safe Network will ensure that these differences can’t grow large.

4 Likes

This thread is about a method I propose for that :slight_smile:

The rough idea is that if half the nodes agree on the current value of the counter, it’s time to increase it by one. We achieve it not by explicit agreement between the nodes but by passively listening to what other nodes think the current value is. We use the median so outliers (from rogue or mistaken nodes) are ignored.

4 Likes

Each section is processing its own events at the rate required for the events it has to do. Section #1234 might be completing 100 events per second but section #1235 might be completing 20 events per second and section #1236 might be completing 5 events per second at a particular point in time.

So there is no correlation between sections and their internal events.

While nodes in a section can be talking to other nodes (hopping chunks etc) there is no way to correlate event “counters” nor is it desirable as once you have 100 sections then there is 100*99/2 (4950) communications between sections to try and correlate event “counts” As one pair correlate they change between their neighbours. You get the idea, the network would be weighed down trying to have a global event “count” because it would be out of sync before you get a few neighbours away.

The solution to that would be a centralisation of event “counters”, but then that centralisation of a function, then there would be calls for centralisation of other functions. Even one function centralised means you do not truly have a autonomous decentralised network. You would have a autonomous partly centralised/decentralised network, with a significant amount of traffic to attempt to have a central event “counter”

The reason for mentioning NTP servers is not to bridge between SAFE and them, but rather to have dedicated time authorities on the network. This is how the current internet solved the time sync issue and is a reasonable way, since it is how all time pieces are synced in the non-internet world. There are time “authorities” that keep super accurate time and sync with each other.

This syncing of the time “authorities” is a complex task and requires there to be a minimal number of them, and the sync is not a internet dependent operation (even though they may use it as a transport layer now-a-days). Similar to the reason why a “centralised” event “counter” on SAFE would fail since there are too many sections to sync.

Thus my suggestion of the time “authorities” having time authoritative machines on the SAFE network is to provide for those people who cannot for whatever reason decide on a timestamp for themselves. The idea is that the piece of data that needs securing is hashed and that hash is sent by secure messaging to the time machine to be timestamped with the ID of the authoritative time machine and returned by secure messaging. Obviously there has to be a public declaration of these authoritative time machines so that any time stamp can be reliably authenticated.

This removes the need for a global event counter and provides for time to be included in the temporal stamping of data. Linux event counters are often the time (secs since epoc)

How that is paid for is to be determined but I can already think of a few and that would allow the authorities to recoup some of the costs where it current is not being done.

Well I am saying that I cannot see how it can be done in a economical (network traffic/coding) way. And why I offered an alternative that actually provides the required functionality and will actually be implemented anyhow if SAFE takes off. Like GPS helps anyone even though its a defence paid for system, NTP servers are paid for by governments, yet everyone can use them. And if SAFE takes off then by necessity the governments will implement a SAFE network version of time authorities.

You seemed to have grossly misread what I wrote, or I am really bad at explaining things.

Time is a very good counter. Linux uses it everywhere a “event counter” is needed. It has served the “event counter” requirements of the computing world since it because available in UNIX many decades ago.

Use it as just an event counter or as a actual time stamp, the choice is up to the developer. It is a multi-purpose solution

Here we go, the word “contracts”. Wow contracts has multiple meanings.

I can buy/sell whatever on the SAFE network, including coins, and that may need an absolute time or it could only require knowing which came first.

So when you say contracts then you are in fact including timestamping. But you then later clarify it as only know which came first.

I hope the above summary analysis helped for you to understand why a centralised event “counter” is not practical in the SAFE network, especially when sections may number in their 1000’s or 100,000’s and still have the “counter” decentralised.

As soon as there is no “feedback” you then are describing a system that progresses from a point to all other points in the system. By definition a centralised system, and the central point is where the progression starts from. Basic control systems theory.

You want a “counter” but call it time. I understand this, but does not help. BTW the linux time is also a event “counter”

The problem with this equation is that for an Application running on a PC does not have any event counter. That only resides for each section. The PC is not connected to “a section”. It depends on what data the PC is trying to transfer to or from the network that determines the section contacted.

Now lets assume that the Application can access a section by using a XOR address it determines from a standard piece of data.

  • That section might have an event time value of 1000000
  • the neighbouring section might have 1234567890123456 as its event count
  • another neighbouring section might have 999999999000 as its event count
  • and so on

So you get a value derived from 3 or 100 peers and its a huge number

But then as the network merges sections and splits sections the other Application that you are trying to compare event counts with might (statistically possible) get mostly new sections split off from others

So then XOR address could yield a newly split off section now looking after that piece of data and the counter calc could be using

  • first section has event counter of 1000
  • the peers could have their event counters as
    • 123456
    • 99221
    • 987654
    • and may more

So then the other Application comes up with a really small event count. But this was done maybe days or months or years later.

Which was first.

How can you correlate these?

Its not a question of being a great idea, but it is fundamentally flawed in a completely decentralised network made up of a really huge number of sections.

7 Likes

This requires a consensus of those nodes. PARSEC is working with less than 30 or 50 nodes, you are talking of millions of nodes when the network is out of early stages.

The communications for PARSEC with 30 nodes is on the order of 435 (30*29/2) (actually there is a couple of communications per possible communication paths).

But for even 10000 nodes (tiny network) the communications is on the order of 50 million.

OK if you do that then you get what I mention in the last part of the previous post. Some will have small event counter value and other parts of the network will have massively large event counter values. There is no correlation unfortunately.

The issue that I think is being missed is that Nodes do not see much of the network traffic. They see only a really really tiny amount of it.

2 Likes

Here is a potential solution and like my other one, does not require core code to be modified at all. Changing core code to implement things is not the best way to go in nearly all cases.

This one is simply have a MD.

The MDs version number is a forever increasing number, cannot be modified by users and could thus be used as a event counter.

The applications that want to have an event counter then simply read the MD and use the version number.

The application then writes new data to the MD and this causes the version number to increase.

Problem solved.

2 Likes

A few questions comes to my mind:

You mean 1 MD per application instance, or per app type, or one global for the network?

Write contention can be quite severe. The core developers would have to enlighten us on the limitations there, but having everything write to one MD I think can be problematic.

And if not using a common MD, how is this useful?

1 Like

My idea has nothing to do with sections. As long as any kind of data is passed around (which is the reason for the network to exist) SafeTime can be attached to the messages, and then it spreads across the world very quickly.

Nodes don’t need to see much of the network traffic. This method requires local action but results in global synchronicity. There’s no “correlating events” and “weighing down the network” because each node has to **keep a counter for the dozen or so nodes it sends any kind of messages (for example, when passing on a chunk as a step between two other nodes) and that’s all.

A “contracts” is anything with signatures that need to be timestamped. We’re good as long as we have a one-directional counter that the participants can agree the value upon so absolute time is unnecessary and the required precision is flexible.

Please stop moving the goalpost. Noise unrelated to the topic:

  • “Time is a very good counter” indeed but it requires an external reference and I already highlighted multiple times that this thread is explicitly not about that.

  • Yes, there are times when absolute time is necessary, but they are out of scope for this proposal so let me not address your comments about that.

  • Why are you bringing up the counters for mutable blocks? That solves a different problem, it doesn’t solve this problem (it’s centralized), so it isn’t related.

  • When I say “the rough idea is that if half the nodes agree on the current value of the counter” you respond with “this requires a consensus of those nodes” which amounts to ignoring everything I previously wrote about how the method works though it’s just a single sentence that I already had to repeat a dozen times.

While sections, just as absolute time, are irrelevant to my idea, let me note that I’m afraid your are mistaken to think some sections will handle significantly more events than others. There will be differences but it’s impossible for events, which are uniformly distributed by XOR space, to exhibit the variance you suggest on the timescales longer than a few seconds. But this thread is unrelated to sections so let me not worry about that more.

Feedback or not. The counter can’t move forward until about half the other nodes caught up. Arguably, it is a kind of feedback because my peer’s decision affects me which, in turn, will affect my peer. Excuse me for the confusion.

Some statistics will follow in a separate post.

1 Like

Let me show it then. I simulated my method for a little over a million nodes, with a median of 16 neighbors (originally 8, but I pruned nodes with fewer than 5 peers and it pushed up the median), initialized with SafeTime between 0 and 99 inclusive, and then updated to be consistent with my algorithm (to contain their peers’ median + 1). After this, I simulated a large number of random messages.

On average, each node sent or received one message every second from all of its neighbors. (If messages would be more or less frequent, scale the results accordingly.) These are not messages for SafeTime but messages that are part of the regular traffic on the network. SafeTime just piggybacks on them as 8 additional bytes somewhere in the packet.

The results: After just one second of simulation time, half the nodes were within 2 steps of the maximum SafeTime and 95% within 6 steps, and this is where they stayed for the rest of the simulation. It took about 2 more seconds for the minimum (of over 1 million nodes) to settle between 30-40 steps within the maximum, where it stayed. I found the counter is increasing 21 steps each second (occasionally 22), and that means 50% of the nodes are within 0.1s of the maximum, 95% are within 0.3s, and even the worst 5% were just between 0.3s and 2.0s behind. Network latency was not simulated so the real values are expected to be somewhat higher, but this does not alter the conclusion:

Contrary to your worries, my method moves towards synchronicity even on a large and extremely sparsely connected network, though each node sees only a tiny part of the whole (no, I did not misunderstood that about the network).

As for whether it’s “economical”:

  • SafeTime needs a few (8?) bytes in each message
  • nodes need to store the latest value of SafeTime received from their peers: 8 bytes for a dozen or so peers
  • nodes need to calculate the new value after receiving or before sending a message: calculating the median of about a dozen numbers, then adding 1

Let me ask you, does this sound like a tough coding task or something that would waste a lot of bandwidth?

8 Likes

Thanks for the clarification. Yes, I see what you were getting at now. I don’t think you would need to worry about peers for your counter though. Any given node might receive a message from any other node with a value of the global eventually consistent counter. Per your original idea, the receiving node would then update their own counter according to :
LocalCounter = max(RemoteCounter, LocalCounter +1)

I think the only way to keep this from getting out of hand with regard to gossip/communication loads and unstable feedback loops is to rely on parsec and tie the counter to the length of the section datachain. (Forgive me Pierre if I’m not recalling the parsec paper correctly) Each time a section reaches consensus and adds a block of data to the local datachain, then the section counter gets incremented by one. The updated counter then gets attached to messages during gossip with neighbors and propagates from section to section through the network as you have discussed above. If relying on parsec in this way, the addition of a block would control counter incrementation + 1. Therefore updating the a local section counter with regard to a remote neighbor’s counter that was received in a message would be a simple max operation.
LocalCounter = max(RemoteCounter, LocalCounter)

Edit: yes, I glossed over a lot of consensus and trust details in the above comments.

4 Likes

I do apologize for getting impatient.

I thought while nodes can get messages from any nodes, it always happens through a small subset that the node is directly connected to at that moment.

If I am wrong that would be better because nodes could rely on a more diverse set of inputs. My original idea would be changed to use a small LRU cache instead of a known list of its peers. Which gets us to your next point, which is a misunderstanding about what I proposed:

You are right, this would quickly get out of hand, but it’s not the formula I proposed.

I use the median of the SafeTime received from all neighbors, so each update works with the whole set, not just the latest piece of information. This stabilizes the process without any of the complicated additions you mention.

The important part, which your simplified update mechanism removed, is that we use the wisdom of a crowd which, in turn, draws from the wisdom of an even larger crowd, and so on, from an exponentially growing number of sources. However, we don’t have an infinite number of nodes, so this process quickly saturates the entire network as the steps going outwards loop around.

3 Likes