How short is a "short period of downtime"?

Southside · September 6, 2020, 3:05pm

Vaults should operate 24/7.
In an ideal world.
This is not an ideal world, and as @dirvine has said before, a short period of downtime should not affect a vault’s reputation.

This thread is to explore what should constitute a “short period of downtime”.

I’ll kick off by suggesting that 20 minutes of downtime every 5 days is “acceptable downtime”.
If your Windows box needs more than this then you need a different OS or hardware.

Dimitar · September 6, 2020, 3:21pm

I didn’t have internet for 5-6 hours today and Storj didn’t throw me out. I hope Safe is as generous.

dirvine · September 6, 2020, 4:17pm

The way we are working, if you’re an Elder at least, then the rest of the Elders will decide when yer gone. Otherwise it’s a significant attack surface. I will explain.

You try to get a sybil attack going, so you try to get nodes close to an address. A well-designed network should not allow that.

Now if you can get a vault on the net and switch it off but reserve your address, then it gets dangerous.

So an attacker does exactly that: he/she starts vaults repeatedly and stores all the addresses. When you have been able to surround an address, you then just start all the vaults you have near that address, then profit.

Southside · September 6, 2020, 4:41pm

Could some rule like this work?
If there are more than, say, 2 or 3 vaults offline in any section, then these vaults DO NOT resume, but can only be readmitted as infants.

Which is tough on the poor vault operator whose Win10 machine went down, but life is tough and the network needs to protect itself.
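
A minimal sketch of the rule proposed above, for illustration only. The names `Vault`, `rejoin`, `MAX_OFFLINE_FOR_RESUME` and `INFANT_AGE` are hypothetical and not taken from the Safe Network code; the threshold of 2 is just one reading of "say 2 or 3".

```python
from dataclasses import dataclass

# Hypothetical values: the post only says "say 2 or 3" offline vaults.
MAX_OFFLINE_FOR_RESUME = 2
INFANT_AGE = 0  # assumed age given to a freshly admitted vault


@dataclass
class Vault:
    name: str
    age: int
    online: bool = True


def rejoin(vault: Vault, section: list[Vault]) -> Vault:
    """Return the state a returning vault is readmitted with."""
    # Count every vault in the section that is currently offline,
    # including the one that is trying to come back.
    offline = sum(1 for v in section if not v.online)
    if offline > MAX_OFFLINE_FOR_RESUME:
        # Too many vaults went dark together: no resume, start as an infant.
        return Vault(vault.name, age=INFANT_AGE, online=True)
    # Otherwise the vault resumes with the age it had before.
    return Vault(vault.name, age=vault.age, online=True)
```

With one vault offline the returning vault keeps its age; with three offline at once it comes back as an infant, which is exactly the "tough on the operator" case.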

dirvine · September 6, 2020, 5:23pm

It possibly can. What we find, though, is that before launch any complexity at all can have profound effects. So for now it’s based on how quickly the network needs you back. So if no action is required from you for 6 hours then cool, but if you miss a few actions you should have done, then the section will consider you non-responsive.

So if the network is quiet then you can be offline a wee while, but if it’s busy then the actual time you could be allowed off will be much smaller. In all cases though restart/upgrades should be fast enough.

The problem here is: if the vault was an Elder, it would have been replaced as an Elder quickly. If it was an Adult, its data would be relocated pretty quickly. You see, the complexity quickly builds. I would really like the network to be the referee of how long here. However, what you suggest is an algorithm tweak; it’s just that right now it would be a fair bit of work to calculate all the edge cases and possible attacks.
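
The "missed actions" idea in this post can be sketched as a counter rather than a clock. This is only an illustration under stated assumptions: `SectionMonitor` and `MISSED_LIMIT` are hypothetical names, the limit of 3 stands in for "a few actions", and resetting the count after a successful response is my assumption, not something the post states.

```python
from collections import defaultdict

MISSED_LIMIT = 3  # hypothetical stand-in for "a few actions"


class SectionMonitor:
    """Tracks how many expected actions each node has missed in a row."""

    def __init__(self, missed_limit: int = MISSED_LIMIT):
        self.missed_limit = missed_limit
        self.missed = defaultdict(int)

    def record_action(self, node: str, responded: bool) -> None:
        if responded:
            # Assumption: a successful action clears the node's record.
            self.missed[node] = 0
        else:
            self.missed[node] += 1

    def is_non_responsive(self, node: str) -> bool:
        # No timer involved: a node nobody asked anything of is never flagged.
        return self.missed.get(node, 0) >= self.missed_limit
```

On a quiet network `record_action` is rarely called, so a node can be away for hours without being flagged; on a busy network three missed requests may arrive within seconds.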

dask · September 6, 2020, 5:53pm

Could there be some mechanism whereby you notify/ask the others that you will be offline for, say, 5 min for an upgrade/restart, they give you the go-ahead (possibly the software triggers it at the right time), and then it’s all good as long as you check in again by the scheduled time? It is a small bit of complexity, but relocating an Adult’s data could be a ton of network traffic etc.
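
A rough sketch of what such a request/go-ahead/check-in exchange might look like from the section's side. Everything here is hypothetical: `LeaveBook`, the 300-second cap (taken from the "5 min" above) and the rule that only one node may be on leave at a time are assumptions for illustration, not a description of any existing protocol.

```python
from dataclasses import dataclass


@dataclass
class Leave:
    start: float     # time the leave was granted (seconds)
    deadline: float  # latest time the node may check back in


class LeaveBook:
    """Section-side record of granted, not yet closed, downtime windows."""

    def __init__(self, max_leave_seconds: float = 300.0, max_concurrent: int = 1):
        self.max_leave_seconds = max_leave_seconds
        self.max_concurrent = max_concurrent
        self.leaves: dict[str, Leave] = {}

    def request(self, node: str, now: float, duration: float) -> bool:
        """Return True if the peers give the go-ahead."""
        if duration > self.max_leave_seconds:
            return False  # asking for longer than the section tolerates
        if len(self.leaves) >= self.max_concurrent:
            return False  # someone else is already away
        self.leaves[node] = Leave(start=now, deadline=now + duration)
        return True

    def check_in(self, node: str, now: float) -> bool:
        """Close the leave; True only if the node is back by the deadline."""
        leave = self.leaves.pop(node, None)
        return leave is not None and now <= leave.deadline
```

A late `check_in` returns `False`, at which point the section could fall back to whatever it already does for unannounced downtime.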

MaxSan · September 6, 2020, 5:58pm

The basic concept of this works well, asking your peers for a fixed amount of time is reasonable… scoring as you go along.

Actually this entire concept is very similar to what the guys are doing with Lightning. Obviously the metrics are a bit different and this is core for reflecting the data stored rather than the money routes.

1ml are doing their own ranking, and I believe ThunderHub have a ranking too… See the node bfx-lnd1 on 1ML (Lightning Network Search and Analysis Engine, Bitcoin mainnet).

that is Bitfinex, see their Node Rank methods there.

drehb · September 6, 2020, 8:14pm

I haven’t used livepatch before, but there shouldn’t be a need to restart for OS upgrades.

Southside · September 6, 2020, 9:16pm

Not on a grown-up operating system, there shouldn’t be.

Unfortunately…

davidpbrown · September 6, 2020, 4:53am

Does a Vault become an Elder simply by length of uptime? Nice and simple if that is the case. I’d had in mind that only a certain number are gifted that status, perhaps because they are doing different tasks. If everyone becomes an Elder, does all the work get done?

Sascha · September 6, 2020, 5:31am

What counts as “going off line”? Maybe a communication break of, say, 5 minutes should not count as “being off line”. The vast majority of people need to reboot every once in a while, to update the kernel or something. And we do want “the vast majority” to be able to participate in a way that is meaningful to them.

neo · September 6, 2020, 5:36am

At this time David said short time. This might be something they will work on when we have multiple sections with vaults at home.

I do think that a vault should be given the opportunity to rejoin after a small power outage or reboot, especially since Windows 10 can reboot you at the worst of times if you haven’t gone in and changed the settings to not do that.

There would have to be some validation process, and maybe the vault would lose some age.

davidpbrown · September 6, 2020, 5:46am

and if the thought is that many people are contributing, then it is fair to expect that Safe Network is not their sole, only, or most important activity.

So there needs to be a limit on Safe Network’s use of CPU/RAM,
but it also needs to allow other reasonable actions to occur… pausing and rebooting… perhaps even with an option for pausing manually in the settings, which comes with a liability of losing status but is a choice for the user, and is better than killing the process outright.

jlpell · September 6, 2020, 9:56am

That’s up to the vault operator to decide. Chances are that no more than 1 or 2 vaults per ip address is optimal unless you have big upload bandwidth. Think about it, a 40 Mbit upload bandwidth can only supply 5 chunks per second.
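
The arithmetic behind that figure, as a small sketch. The chunk size of 1 MB is an assumption that the "5 chunks per second" number implies; `chunks_per_second` is just an illustrative helper, and protocol overhead is ignored.

```python
def chunks_per_second(upload_mbit: float, chunk_mb: float = 1.0) -> float:
    """Upper bound on chunks served per second for a given upload link.

    upload_mbit: upload bandwidth in megabits per second.
    chunk_mb:    assumed chunk size in megabytes (1 MB by default).
    """
    upload_mbyte = upload_mbit / 8  # 8 bits per byte
    return upload_mbyte / chunk_mb


print(chunks_per_second(40))  # 40 Mbit/s -> 5 MB/s -> 5.0 chunks per second
```

Every extra vault behind the same connection shares that same ceiling, which is why more than 1 or 2 per IP address rarely helps.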

davidpbrown · September 6, 2020, 10:18am

A friendly process doesn’t hog resources. It’s not just about bandwidth, although limiting bandwidth might be wanted too. At times perhaps it will need to draw on whatever resources are available, but unbounded use is greedy and would clash with other interests of the host.

On Linux there are for sure controls for processes, but I don’t know of M$ options.

It may be that the vault is low impact, and that would be ideal of course.

davidpbrown · September 6, 2020, 10:24am

and my instinct would be to limit CPU/RAM/bandwidth use to 80% unless allowed directly by the user… which, to close the circle around the prompt, would be another indicator for the vdash dashboard.
