Safenetwork sustainability concerns - Bandwidth has an ongoing cost however Safenetwork is a pay once, benefit forever model

I made similar calculations, with similar conclusion, more than two years ago.

About data recycling: I made a small Excel sheet to calculate the global percentage of garbage data over the years, based on the rate of network growth and the annual rate of garbage data. Depending on the two variables, the percentage tends to different limits. Some examples:

With a 50% network growth rate
10% annual garbage → 25% final garbage
20% annual garbage → 42% final garbage
30% annual garbage → 56% final garbage
50% annual garbage → 75% final garbage
80% annual garbage → 92% final garbage

With a 100% network growth rate
10% annual garbage → 18% final garbage
20% annual garbage → 33% final garbage
30% annual garbage → 46% final garbage
50% annual garbage → 66% final garbage
80% annual garbage → 88% final garbage
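The spreadsheet itself is not shown, but one simple recurrence reproduces the limits quoted above, so it may be close to what was calculated. The assumption here (not stated in the post) is that each year the network grows by a factor 1+g and then a fraction a of all live data, new uploads included, turns into garbage. The garbage share f then follows f_next = a + f(1-a)/(1+g) and tends to a(1+g)/(a+g). A minimal Python sketch, with hypothetical function names:

```python
def garbage_limit(growth, garbage_rate):
    """Closed-form limit of the garbage share under the assumed recurrence."""
    return garbage_rate * (1 + growth) / (garbage_rate + growth)


def simulate_garbage_share(growth, garbage_rate, years=100):
    """Iterate the assumed yearly update and return the final garbage share."""
    share = 0.0  # assumption: the network starts with no garbage
    for _ in range(years):
        # Surviving garbage is diluted by growth; a fraction of live data dies.
        share = garbage_rate + share * (1 - garbage_rate) / (1 + growth)
    return share


if __name__ == "__main__":
    for growth in (0.5, 1.0):
        print(f"With a {growth:.0%} network growth rate")
        for rate in (0.1, 0.2, 0.3, 0.5, 0.8):
            limit = garbage_limit(growth, rate)
            print(f"  {rate:.0%} annual garbage -> {limit:.1%} final garbage")
```

Truncated to whole percentages, the closed form gives 25, 42, 56, 75 and 92 for 50% growth and 18, 33, 46, 66 and 88 for 100% growth, which matches the figures in the post.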

But I still do not see any problem with the pay once, store forever principle. A very high percentage of old data will never be accessed, but that makes no difference to the farmers because their profit remains unchanged.

Well, the random distribution based on XOR addresses should equalize the benefits across all farmers. Some may have bad luck one day but, I think, there is a very strong tendency towards a fairly uniform distribution.
Of course Maidsafe can run all kinds of simulations, although I am somewhat reluctant to rely on them for two reasons. First, I often prefer logical reasoning to simulations that may contain structural flaws. Second, in decentralized computing, simulations have a bad tendency to match the real world very poorly.

Is this growth of data or of farmers?

Also, “the data is not accessed after 3 months” is too gross an overstatement, and I’d have a lot of problems suggesting it even as an extreme edge case. The studies say something more like this: in the first 6 months the data gets half of all the accesses it will ever get, then in the next 6 months another 30% or so.

The range in the first year is something like 40-80% of all accesses, depending on the data, and that is only valid for around 80-90% of all data. The other 10-20%, such as regularly accessed programs and operating system files that are currently in use, was not included for obvious reasons, and I would not include it in any SAFE storage usage either.

Also, you have not considered, even remotely, the different access pattern of database-style data in MDs. This will be huge. Things like search engines will have their own usage patterns and may even have large databases accessed during searches in a way that never follows the drop-off we see with static data.

So maybe you could use a function that shows data usage dropping off at a set rate for the first 6 months, then at another rate for the next 6 months, then at another for the following 4 years. Do that for 80% of the data, and let the other 20% drop off more linearly over 5 years to, say, about 75% of all accesses. Data is never considered to be never accessed again.
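A rough sketch of such a function, in Python. The breakpoints come from the figures in this post (half of all accesses in the first 6 months, about 30% more in the next 6, and a second class of data reaching about 75% linearly over 5 years). The 95% value at year 5 for the first class and the linear interpolation between breakpoints are placeholder assumptions, chosen only so that some accesses remain after year 5. All names are hypothetical.

```python
# Cumulative share of lifetime accesses reached by a given age, as
# (age_in_months, cumulative_share) breakpoints. Illustrative values only.
STATIC_DATA = [(6, 0.50), (12, 0.80), (60, 0.95)]  # about 80% of stored data
REGULAR_DATA = [(60, 0.75)]                         # the other 20% or so


def cumulative_share(age_months, breakpoints):
    """Linearly interpolate the cumulative access share at a given age."""
    prev_age, prev_share = 0.0, 0.0
    for end_age, end_share in breakpoints:
        if age_months <= end_age:
            span = end_age - prev_age
            return prev_share + (end_share - prev_share) * (age_months - prev_age) / span
        prev_age, prev_share = end_age, end_share
    return prev_share  # past the last breakpoint: the tail is not modelled


def blended_share(age_months, static_weight=0.8):
    """Mix the two data classes by their assumed share of stored data."""
    static = cumulative_share(age_months, STATIC_DATA)
    regular = cumulative_share(age_months, REGULAR_DATA)
    return static_weight * static + (1 - static_weight) * regular


if __name__ == "__main__":
    for months in (3, 6, 12, 24, 60):
        print(f"{months:>2} months: {blended_share(months):.0%} of lifetime accesses")
```

The blend treats both classes as receiving the same lifetime accesses per unit of data, which is a simplification. Per-type curves (movies, backups, documents and so on) could be added as further breakpoint lists.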

Also, the study said that if data is publicly available then the study is at best an approximation, as a lot of media gets a resurgence from time to time.

NOTE: also, this study did not even consider public data that is deduplicated. One movie might have constant usage over 2 years because the audience is varied and learns of its existence (or gets the desire to watch it) over time. An extreme example would be, say, a Disney-style kids’ movie that will be watched every year it exists by a gradually increasing number of kids, with the increase due to population growth.

NOTE2: There is also the effect of audience increase. As SAFE users increase over the years, a lot of public files will be accessed by these new people, and the “Disney” effect will occur to some extent for other media for a decade as people start to use SAFE. So unfortunately we cannot even fully follow that study’s pattern for all data.

NOTE and NOTE2 are two of the reasons that a 3 month period of usage followed by none is an extreme edge case that would not happen for a network in any sort of growth.

Also, I didn’t mention that in the study (and I wish I could find it again) different types of data (movies, video clips, podcasts, business-type documents, backup files, etc.) all have different and quite varied drop-offs. In the simulation I did, I had a calc sheet in which I entered all the various parameters over time, so I could more accurately simulate the effects of different data sets being added and then dying off.

Then it makes no sense for all the garbage to sit there occupying space FOREVER. Safenetwork is designed so that everyone’s spare storage space gets used in a meaningful way, not so that spare space, which could otherwise store something useful at least some of the time, gets filled with data that will never be accessed yet is maintained by the network for all eternity. That would make the network less efficient than just having spare hard drives lying around with free space.

Ummm, why don’t you talk about how data is valuable, even if you don’t find it valuable? You don’t need to read very far, or look far in a Google search, to see that data is the new currency. Data is valuable not only to advertisers but to everybody. Your accounts are valuable, your tax filings are valuable, your video of your parents will be valuable later in your life when they have passed on. But no, they aren’t valuable, are they? Remember that storage is growing by 10 times every 5 years, and the rate could be greater when SSDs take over next year, at up to 5 times a year (40TB SSDs are being worked on and expected for release in a year or a bit more).

Well, consider that you earn safecoin for devoting resources you aren’t currently using to the network. Then the network charges you in safecoin. And consider that once you upload that photo to the SAFE network it no longer has to take up space on your phone and can be accessed from anywhere. And consider you’ll ONLY be charged if that photo has never been uploaded by anyone else before. I download a lot of memes from the net. There’s a good chance a meme has been downloaded and uploaded by others. So odds are SOMEONE has a copy of half of my meme collection somewhere. If the SAFE network can successfully identify which ones have already been uploaded, that saves me safecoin on uploads. Same with music or random music videos or something. You only get charged for UNIQUE data uploads. Also consider that keeping data on your hard drive ISN’T free, because eventually you do run out of space and need to buy a bigger hard drive. This is an especially acute problem for those collecting large video files (karaoke anyone?)…

No, the user is charged even if an exact copy is already on the network. It has been confirmed that this will be the case, and it is another aspect of why the model will work: the more popular a file, the more often random people will also upload it. Reasons for it include keeping full anonymity and security, plus the benefits to the “economics”. The charge covers your upload causing the network to process the file, even if it is not finally stored (twice), and the expected downloads from a more popular file.

Okay, my mistake, but how does deduplication work then?

Yikes, that’s an interesting observation. I take it you mean that only new files are read much, and older files are read less and less, thereby generating fewer GETs. But couldn’t farmers be rewarded even for cached data? Cached data is continuously shuffled around the network, I guess, and generates GETs and farming rewards distributed pretty evenly among all the farmers.

Yes, but this won’t work; someone has to pay. If it’s like this then everyone can just do this and store everything for free. For everyday people, the amount of safecoin you farm is likely not going to cover what a normal person needs for storing data, unless you store only tiny amounts or a lot of people store new data on the network.

Why would you upload it and pay if this data has been uploaded already? You just need to access it. In fact there will be an app for you to check whether what you are about to upload has already been uploaded, across all the public files. If you post a popular meme or other popular photo then you just need to use the app to find the access link and not spend any safecoins to upload it. With an app like that, pretty much all uploaded data would be unique. And if you are talking about the case of two private files being exactly the same, that’s extremely unlikely. I think most photos people take are unique, and so is most of the data they upload.

Sorry if this has already been discussed, but I assume that the majority of GETs overall on the network will be from cached data. Rewarding farmers for those GETs will then ensure that the farmers keep earning safecoins at a steady rate. The cached data will not be counted as proof of resource though. Instead, the farming reward for a GET is determined by how much non-cached data the farmer has stored, and it doesn’t matter how old that data is; it still counts as the same resource as newer data.

So when a farmer earns safecoins for a GET for cached data, that reward will depend on the reputation level and the total amount of stored non-cached data (both old and new, it makes no difference) the farmer has. So there is no need for special archive nodes or something like that for old data.

The PUT cost is to store something forever (or at least for the lifetime of the network). Farmers know this and users know this. There is no surprise, so the cost will necessarily be baked into the farming reward.

You can argue that it will be prohibitively expensive to store something forever, but to claim it is garbage after X months/years/decades/centuries is impossible; the contract between the user and the network is to store the data forever, and the network cannot and should not judge the motivations for this.

Perhaps access details have been put into escrow as part of a will and testament. The data may not get accessed for decades. However, its value is obvious to those associated with it.

The SAFE network is all about persistent storage. It remains to be seen how the economics will pan out, but it is a key feature and the architecture pivots around this. To reiterate why:

  • anonymity: the network does not know who has access to what data, never mind for how long

  • security/permissions: the network does not know account balances, as it does not know who owns the safecoin. It cannot transfer them elsewhere either, as it cannot sign the transfer of them without the private key.

  • simplicity: the network does not need to track who has access to what and when. If the data exists, access is possible.

  • indirection: data maps just address data on the network. Data is shared by 0-n data maps. The network cannot read or write a data map, as they are encrypted (for private data at least), so cannot know who has access to what, even with access to the data map raw data.

  • account security: private data stored by a user can only be decrypted when a user provides their login credentials. The network doesn’t know these and nor should it for security reasons.

I am sure there are many other reasons too. It fundamentally changes the way the network works, due to its distributed nature. There are layers to separate concerns and limit what the network can do without the user’s permission, and for good reason too.

This is why persistence is so key to the network and why it has been discussed and analysed over and over for feasibility. It wasn’t adopted lightly on a whim, and we will not know the results until we get closer to release. I believe the arguments for it working are sound (as I have outlined at length on this thread), but be under no illusion that it will be easy to change without a lot of effort and compromise.

If you want a secure, distributed, autonomous network, some difficult design decisions and compromises need to be made. I am sure there will be forks that try to reach a different set of goals, but those designed in right now are Maidsafe’s.

The chunk’s address is a function of its hash. If the group goes to store the chunk and finds it already exists, then the actual store does not happen, since it’s a duplicate. Remember that self encryption will cause the exact same file to have the exact same chunks.
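A toy illustration of that idea in Python, not MaidSafe’s actual code: chunks are keyed by a hash of their content, so storing the same bytes twice leaves one copy, while every PUT is still counted as paid (as stated earlier in the thread). The fixed chunk size, SHA-256 and the `ChunkStore` class are assumptions for illustration; real self encryption also encrypts each chunk, which this sketch leaves out.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # illustrative chunk size, not the network's value


class ChunkStore:
    """In-memory stand-in for the network's content-addressed storage."""

    def __init__(self):
        self.chunks = {}     # address (hex digest) -> chunk bytes
        self.paid_puts = 0   # every PUT is charged, duplicate or not

    def put(self, chunk):
        """Charge for the PUT; store the chunk only if its address is new."""
        address = hashlib.sha256(chunk).hexdigest()
        self.paid_puts += 1
        if address not in self.chunks:
            self.chunks[address] = chunk
        return address

    def put_file(self, data):
        """Split data into chunks and return their addresses (a toy data map)."""
        return [self.put(data[i:i + CHUNK_SIZE])
                for i in range(0, len(data), CHUNK_SIZE)]


if __name__ == "__main__":
    store = ChunkStore()
    meme = b"same popular file" * 200_000  # about 3.4 MB
    first = store.put_file(meme)
    second = store.put_file(meme)  # a second uploader with the same file
    print(first == second, len(store.chunks), store.paid_puts)
```

Both uploads yield the same list of addresses, the store holds each chunk once, and the PUT counter still reflects two paid uploads.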

Well you are wrong.

Since when? That doesn’t even happen. The farmer is not retrieving cached data; it’s nonsensical, since cached data is data that has been fetched before.

No. What do you think caching is?

That example was also a gross oversimplification, to attempt to see its effects. The reality is not quite so extreme, and it varies between data types. See my post for an explanation.

Exactly, and if people try to make it different then it’s not the SAFE network anymore, it’s another network. Rental means you don’t own the data, but the one who rents it to you does. Bye bye ownership. Bye bye persistent data, bye bye valuable data.

This is one reason I and others have so defended SAFE’s model: it’s important to the world to change the way we are being railroaded into our data being owned by someone else.

If someone wants an example of people happy to supply free use of their computer, storage and phone line for the purposes of communications, downloading information and sharing information (messages and uploads), they can look at the 1980s BBSs (Bulletin Board Systems) that were prolific around that time.

While it doesn’t directly support any particular model, it does show how quite a number of people in each community were willing to donate their spare resources for others to use. The problems of SPAM in these systems weren’t the same as for a fully interconnected network like the internet or SAFE.

There is a video from the eighties, an episode of the “Computer Chronicles”, focused on modems and Bulletin Boards. And yes, I often used them for many things.

I didn’t mean a spatial distribution, but a temporal one depending on age, something like data is accessed 100 times in the first day, 50 times the second one, 20 times the third one, … with real figures provided by a Maidsafe study.

I meant growth of data.

Usage is like a Gauss curve, meaning that a file can never be considered not used any more. But below a certain usage threshold it should be considered not lucrative enough by farmers. I consider this happens after 3 months and IMO this delay is very optimistic for farmers. You don’t think so, but the problem is that Maidsafe didn’t provide any studies about it.

Yes, my Excel file is too simplistic to model them. There will be MDs that will be permanently updated, but not all. Also, to balance that, there will be Immutable Data that won’t be accessible at all and so cannot provide any rewards: if you update a file then the previous chunks are still present in the network but are no longer referenced by the MD entry that pointed to them. Previously this was true only for files > 3 KB, but now it is true for any file size.

Data deduplication lowers the ratio popular data / total data. Also popular data will be fetched from cache. All this means that the number of GETs coming from popular data won’t increase farming rewards that much.

It should be Maidsafe’s responsibility to provide such a study.

As an example, a popular video is downloaded by many people. The video data is stored permanently as chunks all over the network. When the video is downloaded many times, the chunks get cached: copied and held temporarily by many nodes other than the ones that hold the permanent chunks. And I was thinking that those extra GETs from the cached data could generate farming rewards in addition to ordinary GETs.

I wasn’t even born until 85 so I totally missed this era of the net. :smile:

Sorry if someone already responded - I didn’t see someone referring to it within the following 50 or so posts… And I wasn’t capable of reading everything here…

This is exactly how I thought it was planned.

You’ve got to stop thinking our data is being owned by someone else under other models. With the recurring fee model or other models, we are still the only ones who have our data. It’s not being owned by anyone. You could also keep the data there forever by paying for its access.