Can you elaborate briefly on the qualities of the two schemes age/list?
Yea, no worries. Sorry Mark, this became way too long. I hope it is understandable (post midnight brain dump)
Currently, node age is measured in churn events: a node of a given AGE is relocated roughly once every 2^AGE churn events. We don’t keep lists or accurate measurements. What we do is check each churn event’s signature modulo 2^AGE. So for age 5 we would do
if CHURN_SIG % 2^AGE == 0; relocate and AGE++
So not totally accurate, but good enough, and it means we don’t keep lists of who has seen how many churn events etc., as that is unwieldy. For old nodes the list is massive.
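A minimal sketch of that check, assuming the churn signature can be read as a big-endian integer (the function names and the byte encoding are illustrative assumptions, not the actual implementation):

```python
def should_relocate(churn_sig: bytes, age: int) -> bool:
    """True when the churn signature, read as an integer, is divisible by 2**age."""
    value = int.from_bytes(churn_sig, "big")
    return value % (2 ** age) == 0


def age_after_churn(churn_sig: bytes, age: int) -> int:
    """Return the node's age after one churn event.

    Only the age bump is modelled here; the relocation itself
    (moving the node to another section) is out of scope.
    """
    return age + 1 if should_relocate(churn_sig, age) else age
```

If the signature is effectively random, this fires about once in every 2^AGE churn events, which is why older nodes move so rarely.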
So the thing I have been looking at is
Hold members in an ordered set (so mergeable) where order is by FIRST_SEEN. So we simply order the nodes by when we first saw them. So old nodes are well established in the set and agreed, but brand new nodes may be slightly out of sync as they churn fast. That out of sync for new nodes is not an issue for me, as they are so young they have time to settle in the set.
Then age is simply a node’s position in the set (oldest first), without an integer representation.
There is another reason for me here. I hate relocations to age a node. They cause traffic and data movements etc. That is a waste.
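To make the ordered set idea concrete, here is a rough sketch. It assumes FIRST_SEEN is some comparable logical counter and that two views are merged by keeping the earliest value seen for each node; both are assumptions for illustration only.

```python
def merge_first_seen(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Merge two views of the membership, keeping the earliest FIRST_SEEN per node."""
    merged = dict(a)
    for node, seen in b.items():
        merged[node] = min(seen, merged.get(node, seen))
    return merged


def ordered_members(view: dict[str, int]) -> list[str]:
    """Oldest first; ties are broken by node name so every peer sorts identically."""
    return sorted(view, key=lambda node: (view[node], node))
```

Because taking the minimum is commutative and idempotent, peers converge on the same order no matter which way round they merge.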
Node age bad bit (and fix)
However node age is very powerful for sybil defence and this is due to some pretty clever side effects. I will go over them here to remind us all
- As a node ages it is forced into a new section
- This causes attackers’ nodes to be diluted in a random way that is incalculable up front (the CHURN_SIG is an aggregate signature we all agree on but cannot guess up front).
- In addition, as a node has a certain name (key) we can use that as the wallet address to pay him. As we use one time keys, then he cannot cash in quickly and really has to keep behaving, or he risks losing revenue. Kinda POS like, but more like staking properly. (no rich dudes get more stake, only well behaved nodes get that).
A small issue we have is that we allow nodes to create keys for new sections. That allows wiggle room for attackers to bunch into that section. So on a split they all go the same way, unless they are relocated, which they will be before they are Elders. So yin and yang there.
So the investigation I am doing is
- Force a mechanism where a node gives us his pub key.
- We give him random data we all agree on (a section sig)
- He creates a new keypair based on the old key and random data.
- We can confirm it is correct and therefore not calculable up front.
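The steps above can be sketched roughly as follows. This is a simplification: it derives only a verifiable ID by hashing the old public key together with the section signature, not a full keypair. SHA-256 and the function names are assumptions for illustration, not the scheme actually being tested.

```python
import hashlib


def derive_id(old_pub_key: bytes, section_sig: bytes) -> bytes:
    """Derive a new ID from the node's old public key and the agreed random data."""
    h = hashlib.sha256()
    # Length-prefix the key so (key, sig) pairs cannot be confused with each other.
    h.update(len(old_pub_key).to_bytes(4, "big"))
    h.update(old_pub_key)
    h.update(section_sig)
    return h.digest()


def verify_id(claimed_id: bytes, old_pub_key: bytes, section_sig: bytes) -> bool:
    """Anyone holding the same inputs can recompute and confirm the ID."""
    return claimed_id == derive_id(old_pub_key, section_sig)
```

The node cannot choose where it lands, because the section signature is not known until the section produces it.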
Let’s assume we have that (I have some tests etc. for this one and some code)
That seems like the Sybil defence we are looking for. Untargeted random IDs. But no, it’s never so simple (bugger).
A bad node can just keep trying to join and get randomised until he gets to the section he wants. There is little cost in creating keys. So alone this is not good enough for Sybil defence, but it does help us prevent the bunching attack described above. So this helps node age, as is.
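As a back-of-envelope illustration of how cheap the retry is, assume each join lands in a uniformly random section out of `sections` equally likely ones (an assumption; the real placement is not specified here). Then the number of attempts follows a geometric distribution:

```python
def expected_attempts(sections: int) -> float:
    """Mean number of joins needed to land in one specific section."""
    return float(sections)


def p_hit_within(sections: int, attempts: int) -> float:
    """Chance of landing in the target section at least once in `attempts` joins."""
    return 1.0 - (1.0 - 1.0 / sections) ** attempts
```

So with 100 sections an attacker needs about 100 joins on average, which costs very little if a join is just a fresh key.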
Back to investigation
So this is where I am.
- We can force random IDs, but it’s useless without relocation (so far)
- We can order nodes by FIRST_SEEN but relocation breaks it, or at least causes lots of complexity
- Using FIRST_SEEN is much more granular and removes the 5-255 index which feels like a wrong thing to have anyway.
- Relocation gets buy-in from nodes, and each relocation towards elder forces much more good work. So it’s hard for them to keep trying for new IDs in sections they want to target.
i.e. it feels like a huge win if somehow we could have nodes hold a random ID they keep forever. BTW, we can make IDs that are not public keys, but force a particular ID and tie it to a public key with a network packet, kinda like NRS, so we can make this network-allocated ID really quite targeted for best network balance.
So there’s the rub. A list order by first seen where nodes are allocated addresses is brilliant. No relocations and much less network hassle/traffic/code/synchronisation. However, nodes will just keep retrying as infants until they get into the section they want. This is where node age is really really good.
Bottom line: I am not certain we can get a good resolution here, but it would be marvelous if we could. So help here is very welcome.
Tried but don’t like so far
- Relocate once but randomly (too hard to figure how random this can be)
- Force up front payment (excludes poor people)
So some mechanism that did relocate at least once after a period feels like it’s what we want. Every step I try node age is just better, but it has relocations and I hate them as well as that integer age thing.
All comments are welcomed.
Maybe a compromise.
Young nodes relocate much more often than older nodes. Basically, when a node has proven itself and aged there is less need for it to relocate, but we still need it to relocate sometimes. Otherwise bad nodes can collect: they relocate until they happen to land in the target section at an age where relocations never happen again.
While this statistically will happen a lot less, it will happen given enough time. For example, if the bad node is not in one of the sections they are targeting then it shuts down and starts over (or gets its age reduced), so it then has another chance to collude with other bad nodes in a section the attacker is targeting. The targeted sections could simply be ones where other bad nodes are already there with enough age.
tl;dr It would seem to my simple thinking that relocating is still needed no matter how much a node is trusted. But we do trust them more so only need to relocate much less often.
Young nodes probably don’t hold as much data, so they are quicker to relocate. Especially if younger nodes have their max storage at some fraction of more mature nodes.
If a simple fee is required to join that doesn’t seem too onerous - even poor people can afford a small fee if they can afford the hardware to be a node. This then becomes another upfront fixed cost of running a node.
I strongly favor this as it becomes another solid layer of protection for the network.
I’m curious how new nodes “find” a section? Is it random in terms of time when applying to join? If I were an attacker, I’d probably want to put a lot of nodes up all at once in hopes that they’d get into the same section … but would that work? Or is there an early phase relocation of pre-nodes to random sections?
Me too. My guess is an attacker fires up (say) 20 nodes. He checks the sections of each accepted node (safe networks sections) and looks for matches. Repeat until he gets two in one section, then hammer on until he can get a third into that section. I dunno…
I presume when we launch we will have many hundreds of trusted nodes ready to form many sections where we know we have no bad actors. If we see a flurry of aborted new node joins - because the attacker has noticed his new node is not in one of his target sections - then we should presume malice and react accordingly - discuss
This is a whole new thread of course - what will the launch parameters be?
But that’s for later
EDIT: Could we build a functional client that does NOT accept RUST_LOG=trace and has safe networks sections functionality disabled? So the attacker has no way of knowing which section(s) his nodes are in?
Probably many many reasons why this is unworkable but tell me anyway. Security by obfuscation but anything that makes it harder…
I totally agree with this.
The relocation mechanism makes it take a very long time for a node to target a specific section. And when it reaches this section it will stay there only temporarily.
Also an added bonus of current relocation formula is that the quicker it gets to this section the briefer its visit to this section will last.
Just to be sure that everyone is aware: this is already the current implementation as the relocation probability is equal to 1 / 2^age.
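To put numbers on that formula: if the relocation probability per churn event is 1 / 2^age, a node waits on average 2^age churn events at each age. A small sketch (function names are mine):

```python
def expected_events_at_age(age: int) -> int:
    """Mean churn events a node sits through before relocating at this age."""
    return 2 ** age


def expected_events_between(start_age: int, end_age: int) -> int:
    """Mean churn events needed to climb from start_age up to end_age."""
    return sum(expected_events_at_age(a) for a in range(start_age, end_age))
```

For example, going from age 5 to age 6 takes about 32 churn events, while going from age 5 all the way to age 20 takes about 1,048,544.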
It would be trivial to fork the code and reinstate tracing of current section.
This is part of the issue. Relocation for a young node is cheap for the node, too, with not too much to lose. So it’s still easy for the attacker to add young nodes until they get to the section they want.
This is also the problem. For the fee to be enough to put an attacker off then it’s almost certain to be enough to put off the poorest from joining. Plus the fee has a circle jerk type approach. Poor may farm to get initial cash, but they cannot if they need initial cash to join.
They just join and see where they land. If it’s not where they want and if joining is cheap for them then they restart and do it again.
is answered perfectly well here
I think you can see now why node age and relocations are a powerful mechanism. Good to see it for what it is, but yes, relocations are a PITA for sure and having an integer value for age is also a pain.
I think this is a worthwhile investigation, though, for sure. If nothing else, it will cement the power of node age and its loose relationship with a staking-type protection. Still, it feels too complex and I feel there is a clever simpler move here. We will see, though, but it never costs us to think about it deeply. There is a lot at stake here.
Both literally and in the formula (two at the power of node age)
Trusting less the new but never entirely trusting any node, might be a useful option too?.. some attackers play a long game.
How about a standing secret shopper challenge… data that is solely for checking that a node is doing storage correctly… data that has its public key included, but in a way that the node does not know which data that is. Those could be thrown from random directions rather than be public for all other nodes to know.
New nodes can jump other hoops that would have less impact if a flood of bad nodes occurred.
An idea is we could do both.
- Relocate up to age XX (say 20)
- Use the membership ordered list from there for age related rewards ratio and elder selection
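A rough sketch of how the two parts could fit together. It assumes elders are drawn only from nodes at or above the cap, oldest FIRST_SEEN first; the `Node` type, the elder count of 7 and the tie-break by name are all hypothetical.

```python
from dataclasses import dataclass

RELOCATION_CAP = 20  # the "XX (say 20)" above
ELDER_COUNT = 7      # hypothetical elder set size


@dataclass(frozen=True)
class Node:
    name: str
    age: int
    first_seen: int  # hypothetical logical counter; lower means seen earlier


def may_relocate(node: Node) -> bool:
    """Nodes below the cap keep ageing through relocation."""
    return node.age < RELOCATION_CAP


def select_elders(nodes: list[Node], count: int = ELDER_COUNT) -> list[Node]:
    """Pick elders from the settled nodes, ordered by FIRST_SEEN then name."""
    settled = [n for n in nodes if not may_relocate(n)]
    settled.sort(key=lambda n: (n.first_seen, n.name))
    return settled[:count]
```

Note that integer age only matters up to the cap here; above it, the ordering by FIRST_SEEN does all the work.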
The work to get to age 20 is huge and the stake you have then should be pretty large. But what this achieves is not relocating really old nodes, i.e. not causing havoc by relocating elders (dkg and demotions etc., which are horrible processes).
Maybe, just maybe this is a good move. In a stable network up to age 20 from 5 is a load of work and should be a load of cash, it will cost a lot in time for sure.
This gives us more stability in the very oldest of nodes and we could even do something like
- Have nodes <20 be part of a replica set, but not counted as a replicant in the required count (4).
- They must still give data etc. and they will be monitored and killed for bad behaviour.
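In code terms, the counting rule might look something like this (a sketch only; the constants mirror the numbers in the post and the function names are mine):

```python
REQUIRED_REPLICAS = 4  # the required count mentioned above
RELOCATION_CAP = 20    # nodes below this age hold data but are not counted


def counted_replicas(holder_ages: list[int]) -> int:
    """Count only holders that have reached the cap."""
    return sum(1 for age in holder_ages if age >= RELOCATION_CAP)


def needs_more_replicas(holder_ages: list[int]) -> bool:
    """True when the counted holders fall short of the required replica count."""
    return counted_replicas(holder_ages) < REQUIRED_REPLICAS
```

So a chunk held by nodes aged 5, 12, 20 and 30 would have four copies on disk but only two counted replicas.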
That way these nodes for a very long time are like a network tax, but one that keeps us secure. Even after 20 it’s unlikely a node will get to be an elder for many months or longer.
Interesting. So tax the network with new nodes, but give great stability for the older, more trusted nodes and leave them where they are. They don’t relocate again, the network is stable up there and the churn levels per section for the >20 nodes are very low.
Putting attackers off is a combined approach. They can’t be stopped with only one method. Death by a thousand needles is the way to go I think. The more methods employed to make life difficult for them, the better for the health of the network.
I think that is part of the failure-type testing that happens as opposed to the node age (building trust) algorithms. To be clear though, we never trust any single node at all. Even elders are monitored by the elder set who should have a majority of good nodes.
Maybe add in a pre-node proof of work and proof of resource test as well … Something that would further slow down attackers - further along the lines of the death by 1000 needles.
This would also cause legit nodes to work a bit harder to stay online and so reduce churn.
We have this and can do that easily. However, it can cause havoc with small nodes/mobiles etc. that could have provided good behaviour. It’s also quite easy for an attacker to use a huge POW machine to work this out for his attack bot army.
Could that trust building not just be duplication of what is normal, wrapped in a warning to check it… then new node work is potentially adding value. The novice’s work is checked by an experienced supervisor but the load is on the new worker. The best test is can a new node do what is required, rather than something different. Trust is the reward for doing it right over time.
This is exactly what the Failure model we have does. It continually checks for good behaviour of all nodes using the trust of the supermajority of the eldest to prove bad behaviour and punish it.
This is an important part of the protection. So node age is a mechanism for getting nodes “bought-in”. That means showing they have invested a load of time and have also a load of money at risk (staked) to keep behaving well. The combination of money and, more importantly, time is what gives us protection.
Now think of node age as the algorithm that measures a node’s “well behaved work” over a time period long enough that an attack is super super expensive.
The failure tests are critical to making sure the nodes are actually behaving.
I can’t imagine too many mobiles acting as nodes for a few years with data being a bit expensive still … and if they have unlimited data (network plan), then a proof of storage should be possible and this is a proof of work all on its own.
Anyway, I know you are looking for the more elegant approach, so I won’t bother you further on this point - but if the more elegant approach doesn’t manifest, remember the 1000 needle approach
… on another rather different note, maybe the mobile army could be good for archiving - lots of space on a global scale but not the speed and higher cost of data transfer. If being an archive node was an option, then that’d be something I’d be more keen to use a mobile for personally.
I do not see this as an advantage for security but a flaw. Now attackers know when they can safely say they will not be relocated anymore. And as has been stated, some attackers are in for the long haul.
In my thinking relocations at older ages is where the real protection of relocation is most evident. An attacker that can get 2 or 3 adults in the one section at age (say) 20 now knows that they have secured 3 spots and can then target that section when another of their nodes get there at age 20.
Yes, it’s a very long game, but some people are sick like that, and remember they are receiving rewards to help finance this attempt and can increase their number of nodes as time goes on. The statistics show that it’s not as unlikely as one would think that you could get 3 bad nodes into the one section at age (say) 20. Then it’s just time before another, then another and …
But relocations that still occur at the older ages will ensure this will be many (10s of?) times more difficult. And hopefully the collusion could not occur in our life times with a growing network.
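For a feel of those statistics, here is a simple binomial estimate. It assumes each attacker node that reaches the cap ends up in a uniformly random section, independently of the others, and it looks at one fixed target section only. Both are simplifying assumptions, so treat the numbers as rough.

```python
from math import comb


def p_at_least(k: int, attacker_nodes: int, sections: int) -> float:
    """Chance that at least k of the attacker's nodes sit in one given section."""
    p = 1.0 / sections
    n = attacker_nodes
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))
```

Calling `p_at_least(3, attacker_nodes, sections)` for a growing `attacker_nodes` shows how the odds creep up as the attacker adds nodes over time. Continued relocation at older ages matters because it keeps breaking up any such cluster.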
Let becoming the elder be the time when the node is not relocated according to age.
EDIT: Remember that some of these btc million/billionaires are also people who love to upset things like markets etc. Any costs to entry will not affect any serious attacker, only honest/genuine people who cannot afford and the casual attacker
Very interesting. I’ve quietly thought this as well. Didn’t want to bring it up for various reasons. But since it’s being considered at this stage… it would be a good upgrade to make after release to minimize bandwidth waste (because bandwidth will very likely be the limiting resource in the network). One of the potential solutions I considered is to actually have elders be the ones to relocate and have adults stay put. So relocation becomes a property of, burden for, and defense against the decision makers, who can actually benefit from sybil attacks. Adults can’t harm the network with a sybil attack since they don’t have decision making powers. They can only sabotage.
At the time I weighed this, elders didn’t store, so their relocating wouldn’t have led to data reshuffling and bandwidth waste. With elders now storing, the primary benefit of which is for them to make decisions on network storage levels without consensus, there are a few new considerations, including:
- Elders’ storage bandwidth waste with relocation could be further minimized by having them calculate upon relocation how much storage they would require to store the data they are assigned, but they don’t need to actually download that data. This way, they maintain their local awareness of the section’s storage level without consensus, yet they don’t use much bandwidth.
- Elders’ storage bandwidth use compared to adults is a lot less (and can be even lower if larger section sizes are used). So if they must, elders can continue to fully store (still, their replica should probably be considered separately from the (at least) 4 replicas at adults to minimize data churn/retention edge cases due to bandwidth).