Data Recycling Incentives

This could be a serious exploit. I think I’m in favour of charging the full price for storing public data, just as for private data. The network rewards the uploader of popular public content anyway, and I expect donate/tip features will be very common on SAFE websites, so there’s plenty of opportunity to earn back any PUT costs from uploading valuable public data.

I think this is quite unattractive marketing-wise. With the current system you pay once, and then you can rest assured that your data won’t be lost. Automated deletion also opens up new weaknesses in the network: imagine this feature glitching or being exploited by an attacker.


Yes, and that’s a beautiful way to signal the farmers (via the MaidSafe network) that they’re not storing garbage.

I think that would be quite hard (how would they exploit that functionality?).
I believe it actually increases security because if you lose your pin/pass phrase/whatever, you can rest assured that your data will be wiped out (after a while).
But even assuming that security concerns outweigh benefits, I would argue that the cost of additional security (i.e. the act of not deleting disowned data) should be borne by the owner, not by the farmer.

Again, I agree. But (roughly) knowing the cost of data, one could prefund it and/or finance it with revenue from a Smart Contract that uses it. It’s the same principle - the risk and cost should fall on the owner of the data (or Smart Contract), not on the rest of the network.

One more freeloading scenario: let’s say there’s a private company similar to CERN (in any field of science). Imagine the cost of them dumping EBs of proprietary data on the public network. It’s not encrypted, but it’s saved in a proprietary format and compressed, and even if someone could figure out the format, they wouldn’t necessarily have any use for those archives. Even if those archives were saved in an open format, they could be useless to anyone but the owner.

On the other hand: if you misplace your keys even just for a while, you will lose your data, and the keys, when found again, will be useless.

  • Also, if you’re paying by the year, you’d have to lose your key at the end of the year and not find it for several weeks.

  • After a week or two at least 50% of people would probably want that data to get wiped out.

  • There was a discussion about password recovery here. The Devs plan to create a “have your trusted connection reset your password for you” type of feature.

  • Regardless, you could always send SAFE to the “refuel” address from another account to keep the data alive.

The data security of deleted data seems irrelevant to me.

If you don’t know about it, it doesn’t exist.

Hard drive space is cheap, and it will continue getting cheaper indefinitely.

A lot of arguing over nothing. It will cost what it will cost.


That’s a major point in the design, I think… who decides? Who would have thought in the ancient world that, thousands of years later, people would dig up their trash and find value in it?

This thinking reminds me of when digital audio first came out and the techies were so enamored with the format. They archived or remastered the magnetic masters to 16-bit/44.1 kHz Sony PCM and discarded the tapes.

No thought that in 10, 20, 30 years’ time technology would allow for much finer resolution… a total lack of foresight, born of the thinking that something was now redundant/useless, so let’s throw it away. So you throw away a perfectly linear signal format for what Neil Young would describe as akin to listening to Swiss cheese… there is no going back on that decision.

I think the answer should be that market decides, because in all other approaches there’s a problem of externalities.
The price for storage “renewal” doesn’t have to be high - it only has to be slightly higher than the network-estimated cost of capacity (including power, with say a 5% overhead for management).
As long as this is not the case, all sorts of garbage will end up on MaidSafe rather than on competing similar OSS projects, and ultimately cause its demise (because the ever-increasing cost of garbage would be passed on to all users, making MaidSafe’s price of capacity less and less competitive). In the long term, the percentage of garbage would approach 100% of all content.
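A back-of-the-envelope sketch of that pricing rule (the 5% management overhead comes from the paragraph above; the per-GB cost figure and the function name are invented for illustration):

```python
# Hypothetical renewal pricing: the network-estimated cost of capacity
# plus a small management overhead, per the 5% suggestion above.
MANAGEMENT_OVERHEAD = 0.05

def renewal_price(estimated_cost_per_gb_year: float, gb: float) -> float:
    """Price to renew `gb` of storage for one year, slightly above cost."""
    return estimated_cost_per_gb_year * gb * (1 + MANAGEMENT_OVERHEAD)

# With an invented cost estimate of $0.02 per GB-year, renewing 100 GB:
print(round(renewal_price(0.02, 100), 2))  # 2.1
```

The point isn’t the numbers but the shape: as long as the renewal price tracks slightly above the network’s own cost estimate, storing garbage is never free-riding on everyone else.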

The MaidSafe community could organize fundraising for public “data at risk” (tongue in cheek). If something looks remotely interesting to anyone in the whole world, they could pay $10 a year to sustain a few TB of public data.
OSS people have had that for about a decade (adopt an orphaned OSS project, etc.). For example, in the case of Debian:

So with the current design, the SAFE Network is going to be almost 100% “garbage”, which will cause its demise.

Tell me you’re having a laugh?

If there’s no way to charge for data space, MaidSafe will disproportionately (compared to other similar networks, or maybe its own forks) attract freeloaders.
If the cost of PUT operation contains the cost of storing data for say 5 years, then it won’t (because you’d pay in advance), but then people with short-term storage needs will be able to store data elsewhere cheaper.

Assuming there’s no deletion and the cost of PUT reflects only the current cost of storage, then:

  • Year 1: MaidSafe: 10% garbage; Competing Network A: 8% garbage
  • Year 2: MaidSafe: 20% garbage; Competing Network A: 8% garbage

Why wouldn’t it end up being 99% garbage?
In fact, I forgot to mention that since the cost of storage will be borne by all users, storing data on MaidSafe (in BTC terms) should gradually become more and more expensive vs. Competing Network A, which would somewhat slow down the growth and use of MaidSafe.
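The claim above can be made concrete with a toy model, assuming (purely for illustration) that useful data stays roughly constant while garbage accumulates and is never deleted:

```python
# Toy model of garbage accumulation without deletion: useful data stays
# roughly constant while garbage piles up year after year, so the
# garbage fraction creeps toward 100%. All numbers are invented.
def garbage_fraction(years: int, useful_gb: float = 900.0,
                     garbage_per_year_gb: float = 100.0) -> float:
    garbage = garbage_per_year_gb * years  # never deleted
    return garbage / (useful_gb + garbage)

for y in (1, 2, 10, 50):
    print(y, round(garbage_fraction(y), 2))
```

With these made-up inputs the fraction is 10% in year 1 and climbs past 80% by year 50, forever approaching (though never reaching) 100%.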

Of course I may be wrong, but I haven’t seen counter-arguments yet.

Probably so, but the competing network would also create the risk that your data is gone forever. They would also have to have some mechanism that knew where the data needing deletion was – a missing feature that is a feature in being missing from MaidSAFE.

If most data belongs in the landfill, just build a landfill. I pay once there too…

Blockchain-based guys plan on having time-based contracts, and it’s visible from the contract when it was concluded and when it expires.
MaidSafe could work without deletion, but I think it would require focus on long term storage and cost would have to be paid up front (since the landfill pricing strategy would be used).

MaidSafe never intended to be a for-profit entity – its plan all along was to give internet access to the world.

There is a plan to archive unused files – to combine many into one so that there is less overhead for file supervision – it isn’t like they haven’t thought things through.

I suspect most people are going to pay for their storage with their own hard drive space, and most folks don’t delete a ton of files off of their own hard drives either – You just buy more when they get full… Most folks have mass excess hard drive space. All of my machines are less than half full. It isn’t a scarce resource.

I have webpages up from the ’90s that I haven’t paid for in 10 or 12 years. Hard drive space is cheap, and it would cost more for my former ISP’s tech to go through and delete them than it would to leave them be…

I agree on that, I am just additionally arguing against creating incentives for waste (uneconomic use of resources), that’s all.

(Related to your comment on HDD space, I was thinking about this today: as people buy larger HDDs, many won’t migrate their worst-performing vaults, so those would be replicated elsewhere. Having multiple identities (say 1 per HDD) would allow farmers to realize better granularity in vault selection. From earlier discussions we know that there will be reputation-related penalties which may make this option uneconomical, but we’ll have to wait and see whether there will be zero-earning vaults (those can’t cause any damage when terminated).)

I think private storage should be re-purchased bi-annually or so, but public files should be paid once and available forever.

I agree fully with dirvine when he said that he wanted to do something to fix the 404 epidemic of www.

Ideally yes, but can we distinguish:

  • legitimate public data
  • encrypted (private data) stored publicly
  • garbage stored publicly

Or does the archiving mechanism make this unnecessary – i.e. can the network handle the level of waste from “parasitic public data”?

In which case, maybe it can handle private data too (i.e. without those charges), as appears to be the plan.

Waste will drive the cost of storage up and make safenet storage more expensive. We should certainly try to limit it through economic incentives.

I like the idea that all publicly linked data is retained. This allows everyone to have a stake in keeping data alive.

For private data, it is trickier to know whether it is useful or not, though. Some sort of recycling idea sounds like a brilliant concept to me!

I’ve tried not to read the threads on this subject, not sure why. So I won’t be aware of ground already covered, but I’m coming at this as someone new to the issue.

If you charge a subscription for the space and you take anything that looks like profit, doesn’t that mean that if SAFE hosts apps like Popcorn Time, claims of infringement will have more traction, even if action is still impractical?

I inevitably think of SAFE in mesh terms, such that any end user who contributes a portion of a device’s resources (such as a phone) should always expect to get at least that amount of resource in return, and probably can’t be expected to calculate the value of some of the network’s more abstract benefits.

Curiously, SAFE even on an end-user-owned net seems a bit like a fractional reserve system. If I am using all of the resources of my device, they won’t be accessible to others, and if others are doing the same, SAFE becomes inaccessible. Would there be any way to create a stable partition of contributed resources? For a while I thought there could be dedicated producers who would also be paid for reserve capacity or over-capacity to help in times of stress. But I see the SAFE data economy, at least for timely access, could crash like a current economy. No guarantees. Storing static data is one thing, but once more dynamic services come online… that’s why I couldn’t read these threads.

Say we created a large geographic mesh of Samsung Galaxy S6s, and say that no one on the mesh would ever have more compute power or storage than a single phone provides. Would it not still be possible to provide most of the advantages of SAFE? Maybe this pushes the issues of distributed computing, latency, and longer-distance meshes. Alternatively, a lot of current internet users with big resources could decide to leave the net and devote their current capacity and reserves to SAFE, but if that were possible, would it last?


Data never being deleted worries me. Even David and Nick have admitted that the majority of data uploaded on the internet is never retrieved again after a couple hours. Think of all the temporary data which is simply not needed after a short period of time. SafeX can have thousands of buy/sell orders a day that are fulfilled within an hour. Do we need an indefinite history of buy orders? If I want a car ride from an Uber app, does that car ride request and fulfillment need to be stored forever?

There have been a lot of ideas thrown around on how to be able to do this, and I think eventually one of them will need to be implemented. Another idea is for a flag to be provided on upload to automatically delete the data. I know that I won’t need an Uber request that lasts for more than a month – it can be deleted after that time. Maybe an “importance” flag can play a role in determining what data to recycle (and the cost associated with its upload).
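A minimal sketch of what such a delete-after flag could look like on an upload record; none of these names are real SAFE API, they are just an illustration of the idea:

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical upload record carrying an optional time-to-live, as
# suggested above. All names here are invented, not actual SAFE API.
@dataclass
class UploadRequest:
    data: bytes
    ttl_seconds: Optional[int] = None  # None = keep indefinitely
    created_at: float = field(default_factory=time.time)

    def is_expired(self, now: Optional[float] = None) -> bool:
        if self.ttl_seconds is None:
            return False  # permanent data never expires
        if now is None:
            now = time.time()
        return now - self.created_at >= self.ttl_seconds

# An Uber-style ride request that can be recycled after 30 days:
ride = UploadRequest(b"ride request", ttl_seconds=30 * 24 * 3600)
```

An “importance” flag could be modelled the same way – just another field the network consults when deciding what to recycle and what to charge on upload.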

Just throwing some ideas around.

David regards Junk/Garbage Data as “temporary data”:

A problem we need to solve in any case

Archive of old data, potentially removal of stale data (debatable) is an issue the network will have to tackle as time goes by. The network wide proof of resource means no or vastly reduced delete of data. This will mean data does accumulate and in some cases may be private or junk data, although we must keep in mind the real time de-duplication of even junk data (temporary data). There is no use the network spending resources looking after this data in the same manner as current valuable data. So we can look at archive personas. These are simpler than we imagine. It works like this.

Archive locations are inserted into the network, initially these will be the leading 6 bits of the 512 bit address. These archive chunks will be akin to sparse files and may contain no data to begin with.

As data is deemed redundant, i.e. not accessed over a calculated set of events (MaidSafe does not use time across the network, this is vital) then it is transferred to an archive store. These archive stores will be held in high ranked large disk space machines just as normal immutable data is. The process is as follows:

1. Data chunk ab764sdkl…(512) is marked for archive.
2. All data managers synchronise this judgement.
3. The data chunk move request is transferred to the data managers for ab764s0000…(512).
4. The PmidManagers (vault managers for storing vaults) are then issued with a transfer request to archive.
5. The PmidManagers then request that the PmidNode sends the chunks to the archive address.
6. The PmidManagers are then instructed by the data managers for the archive to delete the record of the chunk.
7. The store node itself will have deleted the chunk on transfer to the archive data managers.
This process can reduce the management of millions of chunks to a single chunk-management cost. This mechanism is fairly straightforward, and defeating it requires at least a doubly targeted attack – an attack of a size larger than the network population.

If data is requested from the network and not found then an archive retrieval request is carried out. The data is served and this process is reversed, putting the chunk back in the general network data store once more. This process increases network efficiency and allows older data to be maintained for an extended period.
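The address arithmetic behind those archive locations can be sketched as follows; the 6-bit prefix of a 512-bit address comes from the quote above, while the function itself is only an illustrative assumption:

```python
# Sketch of the archive-address idea: keep the leading bits of a
# 512-bit chunk address and zero the rest, so huge numbers of chunks
# map onto a handful of archive locations (at most 2**6 = 64 here).
ADDRESS_BITS = 512
PREFIX_BITS = 6  # "the leading 6 bits", per the quote above

def archive_address(chunk_address: int) -> int:
    """Zero all but the leading PREFIX_BITS of a 512-bit address."""
    shift = ADDRESS_BITS - PREFIX_BITS
    return (chunk_address >> shift) << shift
```

Every chunk sharing a 6-bit prefix lands at the same archive store, which is what lets millions of chunk records collapse into a single managed location.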

Fall back position

What if all of this did not work? Well, this is unlikely, but if the currency failed, then safecoin would simply become an internal market-value token for network-wide proof of resource. This would mean miners are paid in another currency for their resources, and an internal market system similar to a bidding system would be employed.

If resources could not keep up with demand, then clients can be restricted in data storage and forced to buy resources in a similar way as above. A simple mechanism is: one safecoin == 2× the network average; otherwise people can only store up to the network average. The management of this type of system is pretty much in place today. This would be winding back some code to today’s code base for client managers.
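The storage restriction in that last paragraph fits in a few lines (function and parameter names are invented for illustration):

```python
# Sketch of the fallback restriction above: holding a safecoin lets a
# client store up to 2x the network-average amount; otherwise storage
# is capped at the network average. Names are invented, not SAFE API.
def storage_allowance_gb(network_average_gb: float,
                         has_safecoin: bool) -> float:
    return 2 * network_average_gb if has_safecoin else network_average_gb

print(storage_allowance_gb(50.0, True))   # 100.0
print(storage_allowance_gb(50.0, False))  # 50.0
```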


That article was written 1 year ago. :smile:

If archiving works as described, then small vaults (mobile devices) will keep “active” data, while very large vaults (mega servers) accumulate “inactive” data.

As a future farmer, I was worried about “inactive” chunks piling up in my vault. But SAFE will autonomously move those chunks to a “landfill” vault… I’m borrowing @jreighley’s term. This means I don’t need to reset my vault in order to get more “active” chunks.

It may not be a perfect solution but it does make sense. The largest reliable vaults should store “inactive” data, making it the perfect home for these orphans. :wink:

Maybe “archived” data is not used to calculate the storage rate (1 Safecoin = GB). This keeps our storage rate updated with “active” data, and costs measured for the majority of the Network.

Why would anyone provide a landfill vault? We finally have a use for big server farms. They are the only ones able to provide “above average” storage capacity, as well as bandwidth. And if the Network does this autonomously, they really won’t have a choice. Our great fear becomes our solution.

ANT Tech for the win!

I hope I have not misunderstood this article, and maybe things have changed. TestNet3 will be a big eye-opener for many. If it works more efficiently, then recycling becomes less relevant. Keep in mind, we want a fast Network, not one that could get bogged down with user deletes – which, by the way, are a PITA to implement.