Vault backup provisions

The SAFENetwork protocol keeps data safe from the data owner’s point of view through redundant, distributed backups. It doesn’t, however, offer the farmer anything to help secure their vault’s data. Perhaps it doesn’t have to be that way…

Vaults establish trust status (node-age) over time through ongoing ‘Proof of Resource’ tests. The higher this trust goes, the greater the earning potential of the vault. If a requested file is not served, that Proof of Resource test fails and the trust level is reduced.

It is in the interest then of a vault to have some form of backup process in place.

A remote NAS taking daily and incremental snapshots of the vault would allow a vault operator to recover a damaged vault and retain most of their hard-earned reputation (node-age).

I first imagined setting this up using traditional servers with rsync over TCP/IP. It then occurred to me that the ‘Massive Array of Internet Disks’ SAFENet is of course designed to solve the problem of data storage fallibility. My mind then segfaulted and rebooted.

Take the example of a farmer (vault operator) who has two servers (A and B) in two separate locations. Which of the following is the right strategy for the farmer to employ?
1: Set up A as a vault and B as an rsync backup unit
Pros: A will become a high-earning node over time, as it can recover from failures thanks to the backup.
Cons: B doesn’t earn SAFECoin or contribute to the network. Rsync over TCP/IP is old school and vulnerable; we should be using the maidsafe stack.
2: Set up A and B as independent vaults
Pros: Both earn safecoin and contribute to the network.
Cons: Both are unreliable and could lose all data and node-age.
3: Extend the SAFENet protocol to allow a single farmer to designate two or more independent vaults as mirrors of each other, thus becoming a ‘high-availability cluster’.
Pros: The protocol directly recognises and assists farmers who build high quality ‘cluster-vaults’. All of the benefits of communicating over the SAFENet instead of the clear net can now apply to the private backup vaults.
Cons: Still only the primary vault earns safecoin.

What are the economics here? Over enough time, should one node plus a private backup earn the same as two unbacked-up and occasionally failing vaults? Because failure rates are low, it seems that running two nodes would earn more than one robust node. Maybe that’s the way it should be: losing data is discouraged and punished, but not to the degree that someone should have to dedicate 2GB of drive space for every 1GB offered to the network.
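
One rough way to explore this question is a toy model. The sketch below is only an illustration under loud assumptions that are not part of the SAFENet design: the reward per period is taken to be proportional to node-age, age grows by one per surviving period, an unbacked-up vault loses all of its age on a failure, and a backed-up vault never loses age. The function names and the failure probabilities are hypothetical.

```python
def expected_earnings(periods, fail_prob, rate=1.0):
    """Expected total earnings of one vault over `periods` periods.

    ASSUMPTIONS (illustrative only, not the real network rules):
    - reward per period = rate * age
    - age grows by 1 for each period survived
    - a failure (probability `fail_prob` per period) resets age to 0
    """
    expected_age = 0.0
    total = 0.0
    for _ in range(periods):
        # With probability (1 - fail_prob) the vault survives and ages by one;
        # otherwise its age drops to zero, contributing nothing to the mean.
        expected_age = (1.0 - fail_prob) * (expected_age + 1.0)
        total += rate * expected_age
    return total


def compare(periods, fail_prob):
    """Return (one backed-up vault, two unbacked-up vaults) expected earnings."""
    # Assumption: a private backup means a failure never costs any age.
    single_with_backup = expected_earnings(periods, 0.0)
    two_without_backup = 2 * expected_earnings(periods, fail_prob)
    return single_with_backup, two_without_backup


if __name__ == "__main__":
    for p in (0.001, 0.01):
        single, two = compare(365, p)
        print(f"fail_prob={p}: single+backup={single:.0f}, two vaults={two:.0f}")
```

Under these assumptions the answer flips with the failure rate: with a low per-period failure probability the two unbacked-up vaults come out ahead, while with a high one the single backed-up vault does. The real outcome would depend on the actual reward curve and penalty rules.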

@mav pointed out to me that part of the ageing process is that vaults are relocated in the network from time to time, and as a consequence the vault’s data store starts afresh and fills with a different set of chunks. Thus a backup would only be useful while the vault remains in a particular section.

My thoughts are that most consumer drives have a good reliability rating and are unlikely to fail during any stint in a section. Also, if your drive failed, restoring from backup means the PC has been turned off to replace the drive, and this causes a relocation of the vault anyhow and invalidates the backup you have.

Probably the best “backup” idea would be a RAID so that there is no downtime and no delays in the fault tolerance. The cost is an extra drive and, if needed, a RAID controller (many motherboards have RAID ability). Then you don’t have downtime (no loss of age and no relocation). But is that minimal extra expense needed, considering the reliability of drives now?

That’s my understanding anyhow.

Thanks @neo.

Yes, when nodes are relocated the backup would have to update too. I imagine relocating will not be a very frequent process, otherwise the SAFENet will spend most of its energy continually ‘rearranging the library’.

Restoring from backup doesn’t usually mean powering down a device. Often it is just individual files which have become corrupted and need to be restored. Occasionally a drive fails, but again it can be completely restored while the machine remains online, so long as that drive doesn’t also contain the OS.

RAID is good of course, but I think it brings us back to my economic question: the decision all farmers have to make as to how best to spend their money… on private backup solutions like RAID or external mirroring, or on more vaults.

That’s what I’m trying to get at really… the SAFENet designers should take into account that equilibrium between the users wanting good data security and the vault operators wanting good investment security. SAFECoin is the general answer, but the specifics aren’t yet clear I think.

Interesting topic but overall it comes down to ‘the network handles it’.

The network offers the farmer nothing to help secure their vault’s data, nor should it; that’s up to the operator. Only they control it.

No, it’s not about having backups, it’s about conforming to the performance requirements of the network (one part of which is data availability, i.e. backups). If local backups make it easier for the vault to achieve the required performance then fine, do that, but the vault operator must manage the additional cost vs benefit.

2 is the right strategy: Set up A and B as independent vaults.
This is because the vault should not require complexity to operate. If operators are finding efficiencies and improvements that are ‘off network’ such as rsync then the network is not designed correctly imo.

The network should manage all redundancy. If the rewards are better by doing ‘smart stuff locally’ then the network design hasn’t worked as intended.

Two vaults should earn more than one vault with backup. Random vault failure is a healthy and ‘necessary’ part of the network.

But it depends a bit on what is categorized as a vault failure; one chunk corrupted, two, ten…? Over the space of one second, one day, one year…? Punishment design will be really important in deciding how robust vaults need to be.

Definitely agree with you on this. It’s hard to monitor non-network optimizations so hopefully the network and incentive design is robust from the beginning and trade secrets don’t end up dominating the safecoin economy.

I would propose a 4th option where A and B are each:

  • a vault
  • the backup of the other.

But, as indicated by @mav, this is to be managed by the operator, not by the network.

Thanks for the responses @mav & @tfa, really interesting stuff. I’m glad you have said option 2 was the right line of thinking (2x vaults earn more than 1x + private mirror), it makes most sense to me but I was worried the economics might fail to support that option. Sounds like that is unlikely.

Failure is expected and the SAFENet is designed to strongly mitigate that problem. It seems fair therefore that the downgrading of otherwise honest vaults due to inability to respond to requests should be quite moderate but progressively severe.
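
To make “moderate but progressively severe” concrete, here is a minimal sketch of one possible penalty curve. It is purely hypothetical and not taken from the SAFENet design: the first failed request in a run costs one unit of node-age and each further consecutive failure costs double the previous one, with age never dropping below zero. The function name and the doubling rule are assumptions made for illustration.

```python
def age_after_failures(age, consecutive_failures):
    """Return node-age after a run of consecutive failed requests.

    HYPOTHETICAL penalty rule: the k-th consecutive failure (counting
    from 0) costs 2**k units of age; age is floored at 0.
    """
    for k in range(consecutive_failures):
        age = max(0, age - 2 ** k)
    return age


if __name__ == "__main__":
    for failures in (1, 3, 7):
        print(failures, "failures ->", age_after_failures(100, failures))
```

With a rule of this shape, a single missed request barely dents an honest vault, while a long run of misses quickly removes all of its age. Any real curve would need to be tuned against what counts as a failure and over what time window.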

I suppose also the partially failed vault will have some protocol calls which will help it rebuild the missing data from the close group (like a bitcoin node sending a getblocks() request).

@opacey I think you raise an important question that remains open. I agree with @mav that we want the design to make two vaults the preferred option, but your point seems to me valid - that the cost of losing data from a vault that is not backed up might exceed the earnings from a second vault. I think this is unlikely, but we do need to do this calculation when the time comes and factor it into the design.
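
As a rough sketch of what that calculation could look like, assume (hypothetically) that a healthy vault earns a fixed amount per period, that an unbacked-up vault suffers a data-loss event with some probability per period, and that each such event costs a fixed amount in forgone future earnings. None of these parameters or function names come from the network design; they are placeholders for whatever the real numbers turn out to be.

```python
def two_vaults_preferred(earn_per_period, loss_per_failure, fail_prob):
    """True if two unbacked-up vaults beat one vault plus a private backup.

    ASSUMED model (illustrative only):
    - one backed-up vault earns `earn_per_period` and never pays a loss
    - each unbacked-up vault earns the same, minus an expected loss of
      fail_prob * loss_per_failure per period
    """
    two_unbacked = 2 * (earn_per_period - fail_prob * loss_per_failure)
    one_backed_up = earn_per_period
    return two_unbacked > one_backed_up


def break_even_failure_prob(earn_per_period, loss_per_failure):
    """Failure probability at which both strategies earn the same.

    Solving 2 * (e - p * L) = e for p gives p = e / (2 * L).
    """
    if earn_per_period < 0 or loss_per_failure <= 0:
        raise ValueError("earnings must be >= 0 and loss must be > 0")
    return min(1.0, earn_per_period / (2.0 * loss_per_failure))


if __name__ == "__main__":
    print(break_even_failure_prob(1.0, 50.0))
    print(two_vaults_preferred(1.0, 50.0, 0.005))
```

In this simplified model, two vaults stay the preferred option as long as the per-period failure probability is below the break-even value; the design question is whether the penalty for data loss keeps real-world failure rates on that side of the line.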
