Are Erasure Codes (Storj) better than Replication for the SAFE network?


#42

I should add though that this is also why data chains are vital. They allow data to be republished to a network with cryptographic certainty that they were originally created securely on the network. So even a lost chunk should be re-publishable, again without knowing what file the chunk was from etc.


#43

Fyi…post I just noticed on the Storj blog referencing discussions here


#44

That, unfortunately, is not framing the issues effectively, which is a shame. Hopefully the points I made above can be considered, as they are pretty vital to understand and rectify where possible.


#45

Storj blog post about replication being bad and erasure codes being good:


#46

There is a convo about that over here: Storj Launches Version 3 of Its Decentralized Cloud Storage Platform. I think there are some missing issues that would need to be resolved before erasure codes could be used in SAFE. Overall, I do not think they have any benefit we can make use of.


#47

Where was that thread hiding? I wish stuff like this didn’t disappear from the forum home page, as many of us don’t check all the other topics.

I will have a read and thanks for the link!


#48

And they are targeting “always on” servers as the nodes of their network, expecting only (very) small downtimes for any node. Basically months of uptime per node, which negates some of the drawbacks that come from trying to be very inclusive about nodes (i.e. they are benefiting those with the money to buy quality server-style computers at home or in a data centre). Erasure codes will also limit the smallest server or client device to something more significant than SAFE will need.

Clients will need the capacity to reconstruct each portion, and each client will need (slightly) more bandwidth to download a given amount of data. Considering that it’s likely the majority of bandwidth goes to clients accessing data, the claimed benefit of lower bandwidth between nodes is nowhere near as significant as they make out in their papers. Churn might be 10% of the data movement bandwidth required, and clients accessing data might be 90%.

So the (slightly) more data that clients have to download with erasure codes could actually be more than the “extra” that replication requires during churn. It’s no good just focusing on inter-node data movement caused by churn or errors; you must also include the elephant in the room, the data being sent to the clients.
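To put rough numbers on that, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the 90/10 split is the guess from above, and `client_overhead` and `churn_factor` are made-up parameters, not measurements from Storj or SAFE. Replication is taken as the baseline (total traffic = 1.0).

```python
def relative_traffic(client_share: float, client_overhead: float, churn_factor: float) -> float:
    """Total traffic relative to a replication baseline of 1.0.

    client_share    -- fraction of baseline traffic that is clients fetching data
    client_overhead -- extra fraction clients must download with erasure codes
    churn_factor    -- churn traffic with erasure codes relative to replication
    (all three are hypothetical inputs, not measured values)
    """
    churn_share = 1.0 - client_share
    return client_share * (1.0 + client_overhead) + churn_share * churn_factor


def break_even_overhead(client_share: float, churn_factor: float) -> float:
    """Client overhead at which erasure codes stop saving any bandwidth."""
    churn_share = 1.0 - client_share
    return churn_share * (1.0 - churn_factor) / client_share


if __name__ == "__main__":
    # Assume erasure codes halve churn traffic (churn_factor = 0.5).
    limit = break_even_overhead(client_share=0.9, churn_factor=0.5)
    print(f"break-even client overhead: {limit:.1%}")
    print(relative_traffic(client_share=0.9, client_overhead=0.10, churn_factor=0.5))
```

Under those assumed numbers, even if erasure codes cut churn traffic in half, a client-side overhead of a bit under 6% already wipes out the saving.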


#49

Yea, I kept losing that one myself as well.


#50

@moderators, can the Related Projects category be not hidden from the front page? I keep seeing people asking for this to be the case (myself included) since there’s some really great info in that category. There’s only 8 topics active in the last 2 weeks so it’s not as if it’s gonna flood the front page.


#51

This is in part because it isn’t on the front page. Let’s try it, but we need to be aware that it’s an invitation for people to post about other projects in a bid to promote them to this community. That’s why the category was created and made less visible, so people could post about other projects without it being too prominent.

I have no problem finding any topic. So another way to avoid missing things is to look at your Discourse settings. I suspect having the setting for tracking topics to ‘start tracking as soon as I enter a topic’ helps a lot. This is not the default.

Another tip: always scroll to the bottom of every topic. This is probably why I don’t miss things unless I mark a topic ‘ignore’, which is good :slight_smile: obviously. It shows me everything I’ve not read.


#52

I think any topic where there is good SAFENetwork related content is good to have on the home page. Other stuff, not so much. So, I suppose it depends on the content, and we certainly don’t want to be a billboard for advertising other apps/platforms.

I just remember seeing this topic and reading the start of it with interest and then it disappeared. I’m guessing it was categorised and then filtered from the default display? Perhaps stuff should only be categorised when it has no reason to be on the home screen?


#53

It could be useful to have in the forum an algo that shows users topics matched with their preferences. So that we have a list based on the most popular and the most “interesting” topics for you (the user).
Btw, don’t know if it’s a good option, just thinking out loud:)


#54

Just go to your preferences and under “categories” place the categories you want to follow in “watching” or “tracking” fields.


#55

Originally the posts of this topic were in the ‘What’s up’-topic. It deserved its own topic. By choosing the ‘Related Projects’-category it disappeared from the main page.
I’ve just renamed and moved the topic to the ‘Features’-category, which is on the main page.
Can users with trust level 3 or more also recategorize and rename topics?


#56

lol ok that’s much easier. Thank you @neo


#57

I believe they should be able to. They can change the title and category if they really think it’s needed.


#58

Are you sure about that? We already need encryption and signatures for everything.


#59

He’s right. Erasure codes require reconstruction of the data, and there is built-in error correction (is it parity blocks for Storj?) that is used for that reconstruction. So if a bad or lost piece is asked for, it has to be reconstructed from the other blocks requested. And I gather normal blocks of data are still reconstructed from 2 “half” blocks using the same reconstruction process.

This process is similar to (or the same as) the par2 blocks used on NNTP, and it has long been a problem when applied to multi-GB files. Now, Storj is not doing erasure codes at that size, but the point still applies: reconstruction needs a certain amount of capability, and if it is to be done in a reasonable time then it needs more speed and memory.
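For anyone who hasn’t seen the idea, here is a minimal sketch of the simplest possible erasure code: two “half” blocks plus one XOR parity block, where any two of the three pieces are enough to rebuild the data. This is only an illustration of the reconstruction step. It is not Storj’s scheme or par2’s (those use more general codes), and the function names are made up for the example.

```python
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> Tuple[List[bytes], int]:
    """Split data into two halves and add one XOR parity piece."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\x00")  # pad so both halves match in length
    return [a, b, xor_bytes(a, b)], len(data)


def decode(pieces: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data from [a, b, parity]; at most one piece may be None."""
    a, b, parity = pieces
    if [a, b, parity].count(None) > 1:
        raise ValueError("need at least two of the three pieces")
    if a is None:
        a = xor_bytes(b, parity)  # a = b XOR (a XOR b)
    elif b is None:
        b = xor_bytes(a, parity)  # b = a XOR (a XOR b)
    return (a + b)[:length]       # strip the padding again


if __name__ == "__main__":
    pieces, length = encode(b"SAFE network chunk")
    pieces[0] = None  # pretend the first half was lost
    print(decode(pieces, length))
```

Even in this toy version the client has to hold the pieces in memory and do extra work per byte when a piece is missing, which is the cost being discussed; real codes with more pieces do more arithmetic than a single XOR.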

So the downside is that there is a limit to how small the device can be. Whereas for the SAFE network the device basically has to be able to browse the web, and it will then be able to surf the SAFE pages/network, since the crypto/capabilities needed are pretty much the same as what is required for https.


#60

I’m not sure how Storj does it, but that wouldn’t be a problem when done together with chunking. The network would still work with chunks (that is, not gigabytes of data) but the chunks themselves would be split up into numerous smaller blocks, only a few of which would be required to restore a chunk, be it either as a normal read or as part of recreating a lost block.

Data maps would refer to chunks as now, but chunks wouldn’t store the data itself but a public “secondary data map”, pointing to the blocks the chunk is made up of and those blocks would contain the actual data. It’s a trade-off between complexity and storage space, with some latency related considerations thrown in.
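A rough sketch of what such a “secondary data map” could look like, to make the idea concrete. All of it is hypothetical: the `store` dict stands in for the network, the block names are SHA-256 hashes, and a single XOR parity block stands in for a real k-of-n erasure code, so it can only recover one lost block per chunk.

```python
import hashlib
from functools import reduce
from typing import Dict


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def store_chunk(chunk: bytes, k: int, store: Dict[str, bytes]) -> dict:
    """Split a chunk into k data blocks plus one parity block.

    Returns the secondary data map: the chunk length and the block names.
    Assumes a non-empty chunk whose blocks are all distinct, since blocks
    are named by content hash here.
    """
    size = -(-len(chunk) // k)  # ceiling division
    blocks = [chunk[i * size:(i + 1) * size].ljust(size, b"\x00") for i in range(k)]
    parity = reduce(xor_bytes, blocks)
    names = []
    for block in blocks + [parity]:
        name = hashlib.sha256(block).hexdigest()
        store[name] = block
        names.append(name)
    return {"length": len(chunk), "blocks": names}


def read_chunk(secondary_map: dict, store: Dict[str, bytes]) -> bytes:
    """Fetch the blocks named in the map, repairing one missing block if needed."""
    found = [store.get(name) for name in secondary_map["blocks"]]
    missing = [i for i, block in enumerate(found) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost block")
    if missing:
        present = [block for block in found if block is not None]
        found[missing[0]] = reduce(xor_bytes, present)  # XOR of the rest
    return b"".join(found[:-1])[:secondary_map["length"]]


if __name__ == "__main__":
    store: Dict[str, bytes] = {}
    chunk = bytes(range(100))
    smap = store_chunk(chunk, k=4, store=store)
    del store[smap["blocks"][1]]  # simulate a lost block
    print(read_chunk(smap, store) == chunk)
```

The top-level data map would keep pointing at the chunk as it does now; only the chunk’s content changes from raw data to this small map.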


#61

Yep, like I said.

But by doing so they create some lower limit. SAFE’s lower limit is around that of being able to securely browse the internet, whereas Storj’s is significantly higher. How much higher I am not sure, and I am not trying to define it. What I am trying to show is that there is a difference, and that it shows Storj have a different use case in mind and consequently a different audience they are marketing to.

Their market for the nodes is much higher end than SAFE is trying for. And erasure codes require a (slightly??) higher-end user device.