At a very basic level, I mentioned that data will be encrypted, sliced into a million pieces, and
backed up across the planet with a RAID-type algorithm.
Technically, calling it RAID is incorrect, isn't it? But it's the only way, TO AN ENGINEER!!!, I could explain it in this case. Oh man.
RAID-0 : consists of striping, but no mirroring or parity.
RAID-1 : consists of data mirroring, without parity or striping.
RAID-2 : consists of bit-level striping with dedicated Hamming-code parity.
RAID-3 : consists of byte-level striping with dedicated parity.
RAID-4 : consists of block-level striping with dedicated parity.
RAID-5 : consists of block-level striping with distributed parity.
RAID-6 : consists of block-level striping with double distributed parity.
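To make the parity idea in the list above concrete, here is a toy sketch (purely illustrative, not how the SAFE network actually stores data) of RAID-5-style striping: data blocks plus one XOR parity block, so any single lost member of the stripe can be rebuilt from the survivors.

```python
def make_stripe(blocks):
    """Given equal-length data blocks, return blocks plus one XOR parity block."""
    parity = bytes(len(blocks[0]))
    for b in blocks:
        parity = bytes(x ^ y for x, y in zip(parity, b))
    return blocks + [parity]

def rebuild(stripe, lost_index):
    """Recover the block at lost_index by XOR-ing all the surviving blocks."""
    size = len(next(b for i, b in enumerate(stripe) if i != lost_index))
    out = bytes(size)
    for i, b in enumerate(stripe):
        if i != lost_index:
            out = bytes(x ^ y for x, y in zip(out, b))
    return out

stripe = make_stripe([b"aaaa", b"bbbb", b"cccc"])
assert rebuild(stripe, 1) == b"bbbb"  # lost block recovered from the others
```

The same XOR trick underlies RAID-3 through RAID-5; the levels differ mainly in striping granularity and where the parity lives.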
The network has a parameter that controls how many times a chunk is duplicated. Say, for example, that its value is 8: this means each chunk is stored on 8 machines. As soon as one of those machines goes down, another copy of the chunk is immediately created on a different machine to maintain the permanent 8 copies.
So a simultaneous drop of all 8 machines would be needed to lose the chunk. That is very unlikely, because the 8 machines are geographically dispersed at random all over the world.
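The repair behavior described above can be sketched as a small loop (all names here are invented for illustration, not SAFE's actual mechanism): drop dead holders from a chunk's holder set, then top it back up to the replication factor from the pool of live nodes.

```python
import random

REPLICATION_FACTOR = 8  # assumed value from the example above

def repair(chunk_holders, live_nodes):
    """Return a new holder set for one chunk: remove dead holders,
    then copy the chunk to fresh live nodes until we are back at
    REPLICATION_FACTOR (or run out of candidates)."""
    holders = {n for n in chunk_holders if n in live_nodes}
    candidates = [n for n in live_nodes if n not in holders]
    while len(holders) < REPLICATION_FACTOR and candidates:
        holders.add(candidates.pop(random.randrange(len(candidates))))
    return holders

live = set(range(100))
holders = set(range(8))
live.discard(3)                 # node 3 goes down
holders = repair(holders, live)
assert len(holders) == 8 and 3 not in holders
```

The key property is that the chunk is only lost if all 8 holders vanish within one repair cycle, before the loop can run.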
Maybe we could name this system RAID-Infinite (Redundant Array of Independent Disks - Infinite).
My only outstanding question here is how quickly the overarching network recognizes the drop of a host that held a specific chunk, so it can determine that the chunk needs to be copied elsewhere to maintain the replication factor of 8 (thinking in terms of Apache Cassandra, lol). If it can detect the loss within, say, 1-5 minutes, that should be fast enough to prevent chunk loss, but if it takes 30 minutes to an hour to realize a node went down with that chunk, I think we will be in for some problems.
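One common way to bound that detection window is a heartbeat timeout. Here is a toy failure detector to make the timing question concrete (timeout value and names are invented, not SAFE's actual protocol): a node is declared dead once no heartbeat has been seen for a fixed interval, which caps how long a chunk can sit under-replicated before repair kicks in.

```python
DETECTION_TIMEOUT = 60.0  # assumed: seconds of silence before declaring a node dead

def dead_nodes(last_heartbeat, now):
    """last_heartbeat maps node id -> timestamp of its most recent heartbeat.
    Returns the set of nodes whose silence exceeds DETECTION_TIMEOUT."""
    return {n for n, t in last_heartbeat.items() if now - t > DETECTION_TIMEOUT}

beats = {"a": 100.0, "b": 130.0, "c": 155.0}
assert dead_nodes(beats, 165.0) == {"a"}  # only "a" has been silent > 60s
```

With a 60-second timeout, worst-case exposure is roughly the timeout plus the time to re-copy the chunk, comfortably inside the 1-5 minute window suggested above.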
We can’t assume that 8 random personal computers around the globe storing data are going to be crazy stable. Certainly some network die-hards will be, but I think 50% or more of the hosting power will come from people interested in testing it but not fully invested in helping the ecosystem (meaning they will cut off after some time, once they get bored or aren’t getting rich off the few maidsafe coins they’ve collected, heh).
Replying to your post takes milliseconds from the point of hitting reply to it appearing on your computer.
My reply is channeled to a URL.
I assume over the safe network, checking the addressed bit of data also takes milliseconds.
Edit: What I am looking forward to finding out, once vaults run at home, is the amount of CPU time
a vault uses to do the above.
Gotcha, and that section talks to other nodes that are not geographically within the same location, so we are pretty safe there. At 8-22 nodes, I would still expect consensus to take 5 seconds or so, not sub-1-second.
I suppose the technique of how it knows comes into question. Does it check all hosts every time a GET is made by a client? That amounts to client polling, which imo isn’t good enough, because vaults could go down long before a client ever re-requests the data. Or does the network stay chatty, where in the background a section validates that all chunks are still at a replication factor (RF) of 8 for availability on the network? If this has all been answered elsewhere, feel free to link me over to it.

Then even more questions come into play when you have vaults with 100,000+ chunks of data: validating that it’s all present in the background seems like scale could become an issue, unless there is some “hash” check that vaults share to confirm all the expected data is present, compared against the neighbor section rather than each individual chunk (maybe in this case an elder holds the “trusted” hash that other vaults have to match).
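The “one hash instead of per-chunk checks” idea above can be sketched like this (hypothetical, not SAFE’s real protocol): each vault computes a single digest over the sorted list of chunk hashes it holds, so two vaults can compare one value to confirm they store the same chunk set, instead of exchanging 100,000+ individual checks.

```python
import hashlib

def chunk_set_digest(chunk_hashes):
    """Digest the whole set of chunk hashes a vault holds.
    Sorting first makes the result order-independent."""
    h = hashlib.sha256()
    for ch in sorted(chunk_hashes):
        h.update(ch)
    return h.hexdigest()

vault_a = [b"chunk1", b"chunk2", b"chunk3"]
vault_b = [b"chunk3", b"chunk1", b"chunk2"]  # same chunks, different order
assert chunk_set_digest(vault_a) == chunk_set_digest(vault_b)
assert chunk_set_digest(vault_a) != chunk_set_digest([b"chunk1", b"chunk2"])
```

A real system would likely use a tree of such digests (a Merkle tree) so a mismatch can be narrowed down to the offending chunks without re-hashing everything, but the single-digest version shows why the background check can be cheap.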
I wish all the low-level, interesting tech details of the SAFE network were well documented in a glossary of sorts, with hyperlinks to how every little piece works, not scattered across forum posts.