Are Erasure Codes (Storj) better than Replication for the SAFE network?


#82

And I was not disagreeing. Just because I restate what someone else says does not mean I am disagreeing. Any disagreement I have is stated wherever it applies, but only on specifics. For example, I suggested that reading accounts for roughly 90% of bandwidth usage, whereas Storj write as if churn is almost everything and erasure codes reduce it so much. But if reading is 75, 80 or 90% of traffic, then churn is minuscule compared to the read bandwidth.


#83

It is really because you do not know which blocks still exist. You may need a parity block to make up the file, so you either request 4 blocks and wait, and if any of those 4 are missing (which is why you have erasure codes in the first place) fetch one or more parity blocks, or you just ask for more than 4 to start with. If you asked for only 4 and none were missing, both schemes would be equal; but you use erasure coding precisely because you expect some blocks to be missing, so you need to get more pieces.
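To make the "ask for more to start with" trade-off concrete, a tiny illustrative calculation (the availability figures below are assumptions, not measured values): the chance that a strict 4-block request needs at least one extra round trip is 1 - a^4 for per-block availability a.

```python
# Illustrative only: probability that a 4-block fetch needs at least one
# extra (parity) round trip, for an assumed per-block availability `a`.
for a in (0.99, 0.95, 0.90):
    p_retry = 1 - a ** 4          # at least one of the 4 primary blocks is missing
    print(f"availability {a:.2f} -> retry probability {p_retry:.3f}")
```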

Yes, I see what you are saying here (which is beyond the typical EC pattern, which is great to see), but I think we still have issues.

  • It will expose the whole file (the minimap is the whole file I think?)
  • The single vault holding a block will go offline, so another vault will need to fetch the other parts and recreate that one. This is doubly bad, as we (a) never trust a single vault and (b) cannot trust that a single vault will rebuild it correctly. We cannot even trust that a single vault has recreated the block and handed it to the new holder, so that needs a few RPCs to ack back to the group, or similar.

You could have vaults recreate the missing chunk, agree on who should keep it, and have the selected vault ack back to the whole section that it received it from the one we decided should supply it. It might be a bit more complex, and while all this is happening some client could have asked for the chunk and will need to go elsewhere, or we tell it to wait, etc. (more bandwidth, slower responses and so on).

I see what you are looking at here, but I am not sure it buys anything more than potentially saving space (which is good, btw), and I think (I need to write the process out in detail) it comes at the cost of bandwidth and/or possible time lags. The vaults will also have to hold maps of minimaps -> holders of the data. That means when a node goes offline and a lot of blocks go with it, the section needs a pattern to fetch the other blocks, recreate the lost ones, store the new blocks on new vaults and update those maps, possibly while a client is asking for the blocks. So we might need a wait RPC, or reply with a failure and try another block, etc. I suspect there are a few more things to consider here, but I do need to think more.


#84

Ah. I think this is the cause of the confusion. I love the chunks, I truly do; I would not want to do away with them, not ever.

So, no: the minimap corresponds to the chunk, not the file. More precisely, it’s the list of the 16 blocks: the 4 primary blocks (the chunk simply sliced into 4 pieces) and the 12 parity blocks that provide redundancy. Clients always request the 4 primary blocks: they concatenate them to get the full chunk and then decrypt it. It would be useful to store the references to these 4 blocks in the file datamap alongside the respective chunks, because it would be silly and wasteful to have to download the minimap for each chunk every time.

The 16 blocks that make up a chunk are stored in different sections based on the XOR distance of each block’s hash, each with a copy of the minimap. The minimap lists the 16 corresponding blocks, which is all the information required to restore a block when it is lost (there is no need to even know which chunk the block belongs to).
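A minimal sketch of that bookkeeping, with SHA3-256 standing in for the 256-bit hash; the 4-of-16 encoder itself is assumed to exist elsewhere and its 16 output blocks are simply taken as input here, so treat this as an illustration of the minimap rather than of the coding:

```python
import hashlib

K, M = 4, 16  # 4 primary blocks + 12 parity blocks per chunk

def block_address(block: bytes) -> bytes:
    # Blocks are content-addressed: the 256-bit hash doubles as the XOR address
    # that decides which section is responsible for the block.
    return hashlib.sha3_256(block).digest()

def build_minimap(blocks: list[bytes]) -> bytes:
    # `blocks` is the output of a 4-of-16 erasure encoder (not shown here):
    # the 4 primary slices of the chunk followed by the 12 parity blocks.
    assert len(blocks) == M
    addresses = [block_address(b) for b in blocks]
    # 16 x 32-byte addresses = 512 bytes of minimap per chunk.
    return b"".join(addresses)

def primary_addresses(minimap: bytes) -> list[bytes]:
    # The first 4 entries are the primary blocks a client fetches and
    # concatenates to recover the chunk.
    return [minimap[i * 32:(i + 1) * 32] for i in range(K)]
```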

The overhead:

  • the minimap (16 x 256 bits = 512 bytes, I think?), stored for the chunk and then a copy for each block
  • potentially more requests / bytes, but that depends on block and chunk size

It’s also not an absolute requirement that there be only one copy of each block (trusting one vault is indeed unwise). Even with a couple of copies per block, non-identical blocks scattered over the network would still give much higher redundancy for the same storage space than simple replication.


#85

Ah thanks for that.

Makes much more sense now. I still see a big overhead in keeping these mapped for the chunks (not files :slight_smile:). The minimap per section is fine (all nodes will have this, plus the map). The 16 sections where these parts are stored, though, do need to be monitored somehow; this is the elephant-in-the-room part I was talking about earlier.

If each of these 16 is a single copy in a remote section then how will we know when one vanishes?

Then how do we monitor 16 nodes for every chunk on the network?

I hope you see what I mean a bit more now. (nice ideas here though).


#86

I think you’ve misunderstood … my full statement again:

I wasn’t disagreeing with the first part at all … I was agreeing actually … What I am implying (apparently not very aptly) is that there may be other considerations beyond corruption/loss that may negatively impact … but we cannot know as it’s not tested …

I then went on to say that replication has the same general problem.

One is simpler than the other, however, and when testing out potential solutions, IMO it’s best to start with the simpler one before going to the more complex one.


#87

I think it would technically be called MAID-41 :smile: . I was thinking more of how to mix just a little bit of parity with replication in order to get a lot of ROI with regard to fault tolerance. A RAID-41 scheme, i.e. the network replicates (8x) copies of a RAID-4 chunk set, makes the truly paranoid feel safe. Following the self-encryption process, for every N normal self-encrypted chunks, a single parity chunk P could be computed using simple XOR and appended to the data map. AFAIK this simple 4N+1P RAID-4 scheme at the chunk level can effectively double the network replication effect for a 25% increase in PUT cost and storage space. When the data is read, there is no need to send the parity chunks to the client nor do any extra computation, unless all 8 copies of one of the chunks in the 4N+1P chunk set were somehow destroyed. I suppose the datamap itself doesn’t benefit, though.
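A minimal sketch of the XOR parity itself (equal-size chunks assumed; padding and the 8x replication layer are left out):

```python
from functools import reduce

def xor_parity(chunks: list[bytes]) -> bytes:
    """RAID-4 style parity: the byte-wise XOR of the data chunks."""
    return bytes(reduce(lambda a, b: a ^ b, byte_group) for byte_group in zip(*chunks))

def recover(surviving_chunks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing chunk by XOR-ing the surviving chunks with the parity."""
    return xor_parity(surviving_chunks + [parity])

# Example with a 4N+1P set of equal-size chunks:
chunks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
p = xor_parity(chunks)
assert recover([chunks[0], chunks[1], chunks[3]], p) == chunks[2]
```

Recovery only ever needs XOR, so the normal read path stays untouched unless a whole chunk (all of its replicas) is gone.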

The fanboy marketing war. :grin:

I may have veered off-topic a bit from the Storj comparison. My basic opinion with regard to the OP question is “NO, Storj erasure codes are not better than SAFE replication.” I think it’s more a question of Pareto optimality with regard to minimizing computation, bandwidth, and storage complexity/cost while maximizing data security, privacy, and freedom. The Storj folks raise some interesting points about bandwidth being the limiting factor, which I think have some merit. A simple MAID-41 scheme seems like a low-cost way to boost effective replication levels for a negligible bandwidth increase. A MAID-61 would be efficient on SSE/AVX processors (it only uses XOR and bit shifting, analogous to Linux RAID-6) and would seem to offer a 3x multiplier to whatever the base network replication factor is. Since the erasure code in this case doesn’t form the backbone of the network, it doesn’t suffer from all the issues Storj will need to deal with. Instead it’s just a worst-case-scenario insurance policy to help keep things safe.
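To put rough numbers on the insurance-policy framing (the per-chunk loss probability q below is an arbitrary illustration assuming independent failures, not a measured figure):

```python
from math import comb

q = 1e-6  # assumed probability that a chunk loses ALL of its replicas (illustrative only)

def loss_probability(n_data: int, n_parity: int, q: float) -> float:
    """Probability that more than n_parity chunks of the set are lost,
    assuming independent chunk failures."""
    n = n_data + n_parity
    return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(n_parity + 1, n + 1))

print("plain 4 chunks :", loss_probability(4, 0, q))   # ~4e-6
print("4N+1P (MAID-41):", loss_probability(4, 1, q))   # ~1e-11
print("4N+2P (MAID-61):", loss_probability(4, 2, q))   # ~2e-17
```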


#88

Yes, I would also want that, but we need to monitor every part of a file/chunk and I cannot see a way to do that efficiently.

I think people are getting me wrong: EC is more space efficient, and I completely agree; but in a hostile network that needs to know when a bit goes missing, the maintenance is huge and I cannot see a fix.

I don’t care who has the fix or best pattern and am glad many folk are looking at it in different ways. I think EC is clearly space efficient for storage but costly or impossible to maintain efficiently :slight_smile:


#89

It seems we kept talking past each other :sweat_smile:

If the addresses are 256-bit hashes, it’s 512 bytes/chunk, stored at 17 separate sections: the chunk’s own section (though that one is not even strictly necessary!) and the sections that store each of the 16 blocks.
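Tallying that up (1 MB being the chunk size used elsewhere in this thread):

```python
MINIMAP = 16 * 32          # 16 block addresses x 32 bytes (256-bit hashes) = 512 bytes
SECTIONS = 17              # the chunk's own section + the 16 block sections
CHUNK = 1 * 1024 * 1024    # 1 MB chunk, the size used elsewhere in the thread

total = MINIMAP * SECTIONS
print(f"{total} bytes of minimap copies per chunk "
      f"({100 * total / CHUNK:.2f}% of a 1 MB chunk), before per-node replication within sections")
```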

The chunk’s own entry in its own section is solely for retrieving the minimap. If the references to the 4 primary blocks are always (i.e. not just optionally) stored in the client data maps, then the chunk itself doesn’t even need to be stored on the network. I still favor storing it, though.

This is where I’m confused. Why would you think so? To refer to a similar situation, do sections monitor chunks they are not responsible for? Once the blocks are stored by their respective sections, they are handled independently from both each other and the chunk that they encode. When a block is lost, the section responsible for it can just send 4 GET requests for a random selection of blocks from the minimap and the sections responsible for those will send them; there’s no coordination or monitoring in place.

To expand on what I just wrote, this situation is in all important aspects identical to how chunks relate to files. There’s no place in the network that monitors if the chunks that belong to a file are still there, especially because that isn’t a matter of public record to start with. We entrusted the chunks to their respective sections and now it’s their job to store them. It’s the same with the blocks in the scenario I outlined with the exception that the minimaps are public (they don’t disclose anything secret about the chunk.)

Let me note that, as you previously wrote, it’s not okay to trust a single vault within a section to store a block. So let’s either use two vaults, or check up on vaults to confirm they actually have the block they are responsible for (I’m sure this is done for chunks as well). As the blocks are content-addressed, it’s really easy to verify that a returned block is really the block asked for.
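A minimal sketch of that repair path, assuming a 4-of-16 code; `fetch_block` and `erasure_decode` are hypothetical placeholders for the network GET and for whichever 4-of-16 decoder is used, and SHA3-256 stands in for the 256-bit content address:

```python
import hashlib
import random

def repair_block(minimap: list[bytes], lost_index: int,
                 fetch_block, erasure_decode) -> bytes:
    """Rebuild the lost block from any 4 surviving blocks listed in the minimap.

    minimap        : 16 block addresses (256-bit hashes); indices 0-3 are the primary blocks
    fetch_block    : hypothetical GET by address from the section responsible for it
    erasure_decode : hypothetical 4-of-16 decoder taking {index: block} and returning all 16 blocks
    """
    candidates = [i for i in range(len(minimap)) if i != lost_index]
    fetched = {}
    for i in random.sample(candidates, k=len(candidates)):
        block = fetch_block(minimap[i])
        # Content-addressing makes verification trivial: the hash must match the minimap entry.
        if block is not None and hashlib.sha3_256(block).digest() == minimap[i]:
            fetched[i] = block
        if len(fetched) == 4:          # any 4 valid blocks are enough
            break
    if len(fetched) < 4:
        raise RuntimeError("fewer than 4 valid blocks reachable; block cannot be repaired")
    return erasure_decode(fetched)[lost_index]
```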


There would be an important positive consequence of distributing each chunk across 16 sections. To perform a directed attack against a single chunk, the attacker would need to take control of not just one section but 13 (destroying 13 blocks leaves only 3, one fewer than the 4 needed to rebuild the chunk). It would provide unprecedented robustness against such directed attacks, enough that maybe even the section size could be lowered and still give higher security than without it. Fewer nodes in a section would also mean less intra-section communication.

I stand corrected, my apologies.


#90

Yes, that is automatic really. The group all hold the same chunks; when one member vanishes, another is given all the chunks it had from the other members (in SAFE this adult would already have the chunks, as it needs them to join the section). It is a very important aspect; otherwise you need huge meshes of communication across sections, and it goes very far down a rabbit hole.

This is the point: how will it know the chunk is lost, and then who GETs the parts (we do not trust individual nodes), and then who decides who will store it?

This is the start of the rabbit hole; 2 is not enough for a secure quorum.

Who checks? For security, the whole section must, and it must come to consensus if there is a fault. Otherwise it is not secure (again).

As I say, though, this is automatic: we know they had the chunk and we know they must deliver it. If they do not, another does immediately and that vault is punished. Either way the client gets the data; there is no waiting while we reconstitute, etc.

That is nice for sure, but maybe beyond what is required if you see what I mean. If a section can be DOS’ed then there is more at stake than losing a chunk.

The security goes way beyond data protection numbers (i.e. the replication value), though; it covers data manipulation (amending existing data, safecoin transactions, etc.), so we must secure sections against such attacks. It’s done by having a load of nodes ready to take the place of Elders, merge, and more.

It relates to your above point as well really when you look at it.

In the case where the section got killed, there would be no section responsible for the block anymore, if that makes sense.

I suspect that when you push this to the level of security we need, it will be more data, more management, more bandwidth and actually slower. The complexity is much more than it seems at first.

So you take a chunk (in MaidSafe terms), then we split it into 16 bits and store each in 16 sections (more management, as we need to know when these vanish and replace them). However, we have 4 blocks whose addresses the client knows (these would be group addresses, to allow the client to get to the nodes closest to the address; they can then tell which node holds the data and confirm that node sends the data). Otherwise, they need to reply nack and the client tries the next group, while this one must reconstitute the chunk (so they all need the minimap or similar, which means a minimap per group that holds any part). I think it goes much deeper much faster, and the space savings diminish or actually vanish once the 1 MB chunk carries its 512-byte minimap on every node of every group that holds a part, plus all the management above, if you see what I mean.
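For reference, a quick tally of the raw storage each approach starts from, before any of the coordination traffic discussed above (the 1 MB chunk, 512-byte minimap and 17 involved sections are the figures from earlier posts; the 100 nodes per section is an assumed value for illustration only):

```python
CHUNK = 1 * 1024 * 1024      # 1 MB chunk
MINIMAP = 16 * 32            # 512-byte minimap (16 x 256-bit block addresses)

# Plain replication: 8 full copies of the chunk.
replication = 8 * CHUNK

# 4-of-16 erasure coding: 16 blocks of CHUNK/4 each, plus minimap copies.
NODES_PER_SECTION = 100      # assumption for illustration only, not a figure from this thread
blocks = 16 * (CHUNK // 4)
minimaps = 17 * NODES_PER_SECTION * MINIMAP   # a copy on every node of every involved section

print("8x replication :", replication, "bytes")
print("4-of-16 blocks :", blocks, "bytes")
print("minimap copies :", minimaps, "bytes (grows with section size)")
```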

I may be missing something though; perhaps we can somehow draw out the whole process, start to finish, of 1 chunk replicated 8 times versus the scheme you are talking about (or even a slightly different version if that is better).


#91

But hey. What if a network does BOTH like RAID 10?!


#92

You don’t understand the drawback of this and you seriously want to create your own network? God help us… :sweat: