Editable data and rewarded deleting

I was wondering if you can edit the data that you uploaded. That would be extremely helpful for websites, and really for anything. Instead of uploading new data, could you just edit the existing data?

If that’s not possible, why don’t we introduce a small reward, taken when people store data? Maybe take 5% extra over what the normal price would be, and that 5% is permanently locked until that data gets deleted, at which point it gets returned. Or it could be consumed slowly based on how long the data is around, with a maximum cap of 10 years at a 1% remaining reserve. I think putting reserves in place to reward people to delete data would be good for the network: it makes it more efficient and would make sure that far more people do not store useless or spare data taking up useful space. So storage would be way cheaper for everyone.
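That reserve scheme can be sketched as a tiny model (all numbers are just the illustrative ones from this post, not anything from a SAFE RFC): a 5% surcharge is locked at upload, consumed linearly over 10 years down to a 1% floor, and whatever remains is refunded on deletion.

```python
def remaining_reserve(price, years_stored,
                      surcharge=0.05, floor=0.01, horizon_years=10):
    """Reserve locked at upload time, decaying linearly toward a floor.

    A 5% surcharge is held when data is stored; it is consumed linearly
    over `horizon_years`, never dropping below a 1% floor. Whatever
    remains is refunded when the data is deleted.
    """
    start = price * surcharge
    floor_amount = price * floor
    if years_stored >= horizon_years:
        return floor_amount
    # Linear interpolation between the full reserve and the floor.
    fraction_left = 1 - years_stored / horizon_years
    return floor_amount + (start - floor_amount) * fraction_left

# Store 100 coins' worth of data: 5 coins locked.
print(remaining_reserve(100, 0))    # 5.0  (delete immediately: full refund)
print(remaining_reserve(100, 5))    # 3.0  (halfway: half the decaying part left)
print(remaining_reserve(100, 12))   # 1.0  (past 10 years: only the 1% floor)
```

The earlier you delete, the bigger the refund, which is exactly the incentive being proposed.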

If you use the MutableData data type, you can (I believe, for free) modify the data as many times as you want.


I don’t have a reference, but I am sure each commit to the MD costs one PUT.


I wish this was the case. However, SPAM attacks would likely flourish without a PUT cost. In regards to the OP, I think a reward equal to the current PUT cost would be fair enough. Let’s say I store 1200GB as MD in 2020 at a ratio of 1GB/PUT. Then I realize I don’t need 1000GB of it in 2022. But technology has advanced, so now the ratio is 100GB/PUT. So the improvements in technology have naturally made my old data less valuable to redeem over time: it cost me 1200 PUTs in 2020, but I only get 10 PUTs back in 2022. This is only a simple example (it could be for PUTs or SC), and since it would be free-market driven there could be cases where I get more PUTs back because space on the network has become so valuable. This being said, I’m still not clear on what happens to an MD when you delete it. Does it really free up space on the network, or is the data just unrecoverable/destroyed?
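As a quick sanity check on that arithmetic, using the post’s hypothetical ratios:

```python
# Hypothetical ratios from the example above.
gb_per_put_2020 = 1      # 1 GB per PUT when the data was stored
gb_per_put_2022 = 100    # 100 GB per PUT after technology improved

gb_stored = 1200
cost_puts = gb_stored // gb_per_put_2020     # 1200 PUTs paid in 2020

gb_deleted = 1000
refund_puts = gb_deleted // gb_per_put_2022  # only 10 PUTs back in 2022

print(cost_puts, refund_puts)  # 1200 10
```

The refund is two orders of magnitude smaller than the original cost purely because the GB/PUT ratio moved.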

That was the model for SD, and when it changed to MD the charges apply to any update. Updating without charging is unsustainable and open to massive spamming/DoS attacks.

Beaten by @jlpell I must remember to read later posts before replying. I must remember to read later posts before replying. I must remember to read later posts before replying. I must remember to read later posts before replying. I must remember to read later posts before replying. I must remember to read later posts before replying.

I think that they are going with removing the data, but keeping the MD with a record of it being deleted to preserve the version number. A few KB kept instead of up to 1 MB. Or maybe they are just deleting it completely.

The reason to preserve the version number is that someone can upload a legal document (version 1), and the courts or whoever can rely on knowing whether it was ever altered by the fact that the version number changes.

If you delete it completely, then a person can modify the document: by deleting the MD first and then uploading the altered document, it has version 1 again and no one can prove it wasn’t the original.


Interesting use case. I know it’s just an example. However. Court documents fall into the category of “immutable data.” And: We already have a type for that.

Are there current plans that Immutable Data could be removed? I saw it being talked about.


Yes, that is likely. And it’s only an example.

Basically the reason is to prevent a rewrite of history (indexes, appendable documents of record, a ledger, or whatever) and the author claiming that since it’s version 1 it must be the unaltered version. Or version x, and everyone’s copies of version x are fake, since he/she can show the definitive copy is still version x. But he/she rewrote it and bumped it back to version x.

I have not heard plans of that. It’s very important to have immutable data and files.

What I have heard is having the ability to store temporary-style files as MDs so they can be removed later on. This would be an APP-level function, say using NFS with a file type of temporary.

Ah yes, the currently unsolved problem. That is over 18 months old with no progress in that area.

Look at David’s response: it’s still immutable data files, just allowing for non-persistence. So they are not wanting to get rid of immutable data at all.

I still reckon that for that, the user should be the one to specify that the data is temporary when storing it, and if no other user has uploaded that chunk (dedup) then it will expire.

But some ideas were put forth, they have been shown inadequate, and currently we are at “all immutable data is persistent”.


Yes. It says: Owner should be allowed to delete immutable data. If all owners delete it: It is really deleted. If it has many owners, it becomes really immutable forever.

This is like having multiple hardlinks on an EXT4 filesystem. You can have any number of hardlinks pointing to the same data on disk, and the underlying data stays on disk until the last hardlink is deleted.
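That hardlink behaviour is easy to demonstrate (a sketch using Python’s standard library on a POSIX filesystem such as EXT4):

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "data.bin")
link = os.path.join(workdir, "link.bin")

# Write a file, then create a second hardlink to the same on-disk data.
with open(original, "wb") as f:
    f.write(b"chunk contents")
os.link(original, link)

print(os.stat(original).st_nlink)  # 2: two names, one copy of the data

# Removing one name leaves the data intact...
os.remove(original)
with open(link, "rb") as f:
    print(f.read())                # b'chunk contents'

# ...and only removing the last link frees the underlying data.
os.remove(link)
```

The link count is the filesystem’s version of the “owner count” being discussed for chunks.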

That being said, I like @neo’s temp data marker.


That is for a different thing. You must know you will delete it later.

If you want to delete pictures of your ex: Deleting Immutable Data.
Yes: You can delete your link to it. But: The network stores unnecessary data.
Probably not a big problem. It would be elegant if such things could be removed.

As I mentioned there or elsewhere, there is a major problem even with that. People can share their datamaps, which is the same as copying a file to another person. But the network will not record that other person as being an owner, although in reality he/she is.

It could even be sharing a datamap as part of a paid project.

To me it’s unethical for the network to be able to delete immutable data that you “received” believing that it’s going to be immutable.

But it also requires the sections to have a record of who owns which chunk, which defeats a portion of the anonymity that is central to SAFE.

So there are some real problems with that. And I believe, from the talk, that the idea of recording owners against chunks in order to allow deletion has been abandoned for now.

Imagine a government running enough vaults and using that to record the owners of each chunk in every section they get a node into. Given time and resources, by adding/removing their nodes, such an entity could get into most sections every year. They don’t get any control, but they just record the owners of every chunk. Then they round up the people they can associate with an ID whose (public) data they don’t like.


Unethical is a strong word.
Immutable is a technical word.

People see data.

They already know: Data gets deleted. Why is it confusing?
If they received it, they can copy it: It won’t disappear.
20+ copies: Forever immutable. According to the RFC.

Why store the owner? Store just a one-time delete key. If Imgur can do it, Safe Network can do it too. Generate it on-the-fly as function of owner ID and chunk ID. Delete request has to be signed with right key.
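A minimal sketch of how such a delete key could work (this is my illustration, not anything from an RFC: the owner derives the key from a secret plus the chunk ID, and the network stores only a hash of it, never the owner ID):

```python
import hashlib
import hmac

def delete_key(owner_secret: bytes, chunk_id: bytes) -> bytes:
    """Deterministic delete key derived from the owner's secret and the
    chunk ID, so the owner can regenerate it on the fly and the network
    never learns the owner's identity."""
    return hmac.new(owner_secret, b"delete:" + chunk_id, hashlib.sha256).digest()

def stored_token(key: bytes) -> bytes:
    """What the network keeps at upload time: a hash of the key, not the
    key itself (and certainly not an owner ID)."""
    return hashlib.sha256(key).digest()

def authorize_delete(presented_key: bytes, token: bytes) -> bool:
    """A delete request presents the key; the network checks it against
    the stored token in constant time."""
    return hmac.compare_digest(hashlib.sha256(presented_key).digest(), token)

owner_secret = b"owner keypair seed (hypothetical)"
chunk = b"chunk-xor-address"
key = delete_key(owner_secret, chunk)
token = stored_token(key)
print(authorize_delete(key, token))             # True
print(authorize_delete(b"wrong key", token))    # False
```

Because the key is deterministic, the owner never has to store it anywhere; anyone without the secret cannot forge a delete request.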

Not a very important problem maybe.
But: It is possible, and those are not problems.

Because SAFE is now selling itself as a persistent store of files. See David’s blogs.

People are going to use SAFE thinking it is what it says it is. To then delete without warning would be a scam.

I know that they would not make it such, but the safeguards need to be in place before any deleting of immutable is done.

And doesn’t this seem convoluted, since dedup means nothing is actually stored, just a field updated to say you uploaded it?

BUT one of the touted features of SAFE was that to send someone a file you just send them the datamap and they then have a copy. So now we chuck away another promise of SAFE.

Just because we have been used to data being stored in a way that it may disappear at any time, and the only way to protect it is to back it up again and again and again, is no reason to build that into a system that is intended for persistent data. There is no need to back it up, since the network is already doing that.

Then you say to people, “Oh, you need to re-upload that file because that persistent data may just disappear, and you need to back it up on SAFE to ensure it remains.” That is just messed up.

I have lived with non-persistent data for many decades now and understand backing up and the need for it: paper tape, magnetic tape (incl. DECtape), 8-track, 9-track, even punched-card backup. And to reduce SAFE to a system in which every file you are sent by other SAFE users has to be “backed up” (re-uploaded) is just plain stupid and against the fundamentals of SAFE’s persistent data.

Any system to delete immutable data has to be at the request of the user when uploading, and NEVER at the discretion of the network or other users. And if I receive a copy of a datamap, then it needs to be marked as temporary if it can be deleted before or after I actually receive the datamap.


I understand it would be altering the profile.
But it would alter it in a way that is not new to people.
“You can’t delete your data”: This is new to people.
Not a bad thing. But new and unexpected.

Counter argument: Deleting the data map looks like deleting the data.
The chunks don’t disappear, but that is a technical detail hidden from users.

Truly deletable Immutable Data is there to avoid storing garbage the user says is garbage.
It may not be a big problem. But: I like that RFC.

It would be not that. It would be: If you want to own the data, copy it.

Dedup is invisible to users.
Technical detail to make the network more efficient.
Deleting unwanted chunks would be a similar thing.

Average Joe has his “file”. Not his “set of chunks that are shared by multiple copies of this and maybe other files.”

Over 50% of the people now using the internet do not understand backing up. Faceless and Google etc. do it for them. The loss of data they see is not due to system/storage problems but to the company deleting the data on them, and they direct their anger against the company. For SAFE they would be directing their anger against the network, and we’d lose users over it.

Not to mention governments forcing whistle blowers to delete their files.

It’s a new world since the last decade. The majority of users of online data today were not brought up on “you must back up your online data”.

So no, I reject that it’s the norm.

For the user they can “delete their data”. If they uploaded personal files then they can destroy (lose) their datamap and if no one else has uploaded it then that file is effectively deleted. Only the chunks remain. If they gave another person the datamap then that other person still has access to the file.

ALSO, I never said that we should not have an immutable type that is not persistent.


So you are actually arguing from the perspective that I say never ever allow immutable data to be deleted.

I just say let the user decide to upload it as temporary immutable files (i.e. deletable by them), and if others upload that file then lock the chunks, so a chunk is not deleted just because the original uploader “deleted” it.

Did you read through the comments? Not everyone agreed, and David even said RFCs are not the position of Maidsafe the company but of the individual writing the RFC.

The RFC had problems, and the problems were enough that it was not pursued up to now, even though some work had been done. MutableData incorporated many of the ideas when SDs were converted into MDs, and I suspect that it has become the current model for deletable data. Deleting immutable data is a back-burner issue.

But forcing users to do the unnatural thing of re-uploading what they can already see is there is not straightforward; it is counter-intuitive. And considering that the perception of online data has changed with the current generation of online (non-technical) people, who are now most of the users, this forcing of people to re-upload what they can see is already there is convoluted thinking.

I say that any immutable file that can be deleted should be marked as such, so that an APP and/or someone doing direct datamap sharing can identify the file as such and take appropriate action. Most likely a network function to mark the file as no longer temporary (costing one “PUT” charge, obviously).

None of this stupidity of needing to re-upload what people can clearly see is already there. If it was implemented this way, then imagine the howls from the non-technical user (>50% of internet users now) when a file disappears because their friend didn’t tell them that the cat video he/she sent them was about to be deleted. The user would be pissed off, to say the least, and if they had posted the file (datamap) to forums, SAFE social media, etc., then it’s worse; and of course there are worse situations that I can think of.

I mentioned one above. If a user can delete any immutable file (chunks and all), then they can be forced to delete their files. E.g. whistle blowers, websites the ABCs don’t want, etc.


Not really, if put in the right context. It is essentially the same as a DVD or a CD-R. If you don’t like the data you immutably wrote to a CD, then you can just cut the CD into pieces and spread them to the four corners of the earth. As neo pointed out, the binary data is still there on each piece, but it doesn’t matter. If you don’t go to such an extreme and just throw the CD pieces in the trash, your local landfill will keep those plastic chunks for a thousand years (maybe more for an M-Disk), but good luck trying to reassemble it and getting it to play in your DVD drive at that point. Seems to me that this is a decent analogy to deleting one’s datamap when explaining the mechanics of how deletion works to the average user, if they even care to know what’s happening behind the scenes, which they won’t. Or just explain to them that it’s no different from those physical hard-drive shredder services.

It seems like you guys are advocating for the same thing, just different ways to implement it. Norimi wants it more transparent, similar to the way hardlinks work on a typical disk filesystem. Neo’s way is more explicit, since you don’t need to keep track of ownership or the number of links.

I think @neo’s temp data marker plan (was it first described on this forum or in safedev?) would work really well for minimizing garbage from well-meaning users, as long as there is a decent incentive (in safecoin value) for people to actually use the marker, and there is also an easy way for the average Joe user to use it, such as it being automatically added to files uploaded to a “_tmp” folder on SAFE. The simplest/fairest incentive I can think of is that when you delete any “temp” data, you get PUTs/safecoin back at the network’s current market rate, no more no less, if the data chunks are actually freed/erased. (The use of the terms “freeing” or “erasing” the chunks might be better than “deleting” the chunks in this context, because from the point of view of the user, they think they “deleted” them, and it would be good not to get the terminology crossed. Semantics, I know…)

To make @norimi happy and achieve perfect storage “efficiency” (wasn’t he the one that reminded me of the value of mud?), it seems like a typical approach would be to keep a running tally of the number of “links” made to each chunk. I think this is the way Linux treats hardlinks; I would need to refresh my memory to be sure. Each chunk has an integer for the number of links. Each time a chunk is uploaded or “copied” via a datamap share, the number of links is increased by 1. Each time someone deletes their local map to it, the counter is decremented. When the counter gets to zero, you know that it is safe to remove the chunk. However, I see that @neo’s point is that the network will never be able to know if someone shares a datamap offline, nor if they actually delete a datamap, so this accounting method is impossible, which makes his approach the preferred method. I’m not sure if it’s really impossible, but there’s mostly no good SAFE way to do it. Maybe some brainstorming would find ways around the drawback, such as datamap versioning and communicating with chunks, but this adds a lot of complexity and performance loss, and could still be rather un-SAFE regarding other security concerns in the end, so why bother… etc.
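For illustration, that link-counting bookkeeping looks like this in miniature (a toy sketch; as noted, the real flaw is that the network cannot observe offline datamap shares or deletions):

```python
class ChunkStore:
    """Toy reference-counted chunk store, mimicking filesystem hardlinks."""

    def __init__(self):
        self.chunks = {}   # chunk_id -> (data, link_count)

    def put(self, chunk_id, data):
        if chunk_id in self.chunks:
            stored, links = self.chunks[chunk_id]
            self.chunks[chunk_id] = (stored, links + 1)   # dedup: bump the count
        else:
            self.chunks[chunk_id] = (data, 1)

    def delete(self, chunk_id):
        data, links = self.chunks[chunk_id]
        if links > 1:
            self.chunks[chunk_id] = (data, links - 1)     # others still own it
        else:
            del self.chunks[chunk_id]                     # last link: free it

store = ChunkStore()
store.put("abc", b"cat video chunk")
store.put("abc", b"cat video chunk")   # second uploader, deduplicated
store.delete("abc")                    # first owner deletes...
print("abc" in store.chunks)           # True: one link left
store.delete("abc")
print("abc" in store.chunks)           # False: really gone
```

The scheme only works if every link and unlink is visible to the store, which is exactly what datamap sharing breaks.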

Sound about right?

Not a bad summary. And yes, I was trying to point out that I agree with a form of deletable immutable data (non-persistent immutable), but it has to be the user’s choice.

The biggest issue is that we could see gaming of this occur.

So I upload 100TB when spare storage is extremely high, using 100 million million chunks. But chunks were 10 million million per safecoin, since space was plentiful, costing me 10 safecoin, and maybe they were $2 at the time.

So later on (6 or 12 months even) I delete the 100TB of files, get back those 10 million million PUTs, and sell the account to someone. Oh, did I mention? This is at a time when space is more normal and it’s 1 million PUTs per safecoin. Then I get 10 million safecoin (I discounted to 10% of the current PUT cost) and safecoin is now $3 each.

So for my $20 I get 30 million dollars. Scale it back if you feel no one would pay that for an account. The principle is still valid, and the figures are reasonable using the current RFC for safecoin.
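Running the numbers from this scenario (all figures are the hypothetical ones stated above):

```python
# All figures are the hypothetical ones from the scenario above.
chunks = 100e12            # "100 million million" chunks uploaded

puts_per_sc_cheap = 10e12  # space plentiful: 10 million million PUTs per safecoin
cost_sc = chunks / puts_per_sc_cheap              # 10 safecoin
cost_usd = cost_sc * 2                            # safecoin at $2 -> $20

puts_per_sc_normal = 1e6   # later: 1 million PUTs per safecoin
refund_sc = (chunks / puts_per_sc_normal) * 0.10  # refunded at 10% -> 10 million SC
refund_usd = refund_sc * 3                        # safecoin at $3 -> $30 million

print(cost_usd, refund_usd)   # 20.0 30000000.0
```

The asymmetry comes entirely from the PUT price moving between upload and deletion, which is what makes the refund gameable.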

A kind of black market, where opportunists waste network resources to make a buck (millions of them), and since it would be profitable, it will happen.

It’s like ground loops in power stations: you can waste megawatts of power, so you design away the potential for those ground loops. Likewise, in the SAFE network we need to not provide avenues for people to waste massive amounts of resources (traffic/temp storage) just to make money for themselves and cause the network to “suffer”.

That’s why I suggested elsewhere that the person deleting the file gets, say, 1 PUT refunded for 3 chunks (a file is at least 3 chunks), 2 PUTs refunded for up to 10 chunks, and 1 PUT refunded for every 100 chunks after that.

So deleting all those temp files (e.g. editing temp files, shopping lists, etc.) means they get enough back to be worthwhile, but it is much more difficult to game.
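That tiered refund could be sketched as follows (my reading of the tiers above; the breakpoints are from the suggestion, the function shape is mine):

```python
def delete_refund_puts(chunks: int) -> int:
    """Tiered deletion refund sketch: 1 PUT back for a minimal 3-chunk
    file, 2 PUTs for files up to 10 chunks, then 1 extra PUT per full
    100 chunks beyond the first 10. Deliberately sub-linear, so deleting
    is worthwhile but hard to game."""
    if chunks <= 3:
        return 1
    if chunks <= 10:
        return 2
    return 2 + (chunks - 10) // 100

print(delete_refund_puts(3))      # 1
print(delete_refund_puts(10))     # 2
print(delete_refund_puts(1010))   # 12
```

Note that 1010 chunks cost 1010 PUTs to store but refund only 12, so the black-market scenario above doesn’t pay under this schedule.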

I think his idea of recording a (throwaway-style) ID for the uploader in the chunk’s metadata, with the user supplying that ID to delete their entry in the metadata (deleting the chunk when all entries are gone), could work. But I’d still say it has to be marked as deletable so that APPs/people know it can be removed.

I’d also move away from storing temp files in a “special” directory style of thing and just have it as an attribute of the file. File explorers can use a colour or marker to show it’s deletable and not persistent.

But still, I have yet to be convinced that the rate of increase of single disk-drive sizes and the increased rate of production of disk drives will not outpace any need for deletion. So the actual deletion of chunks may not even be needed.

  • disk sizes increasing 10x every 5 years (rotating media)
  • SSD sizes increasing at a faster rate, with the potential for 10x in 2 years
  • the number of disk units produced each year increasing at 30+% per year
  • cost per unit slowly decreasing, and cost per TB decreasing faster
  • archive nodes will slowly absorb the chunks that are rarely accessed; this removes the data movement required as vaults go on/off line, since the more active chunks will remain in ordinary vaults
  • MDs can be used like immutable chunks to build files, with a mapping linking the MDs together to form an editable/deletable file

It is certainly an interesting area, the need (or not) for deletion of immutable files.

And then, if MDs can perform the function of temporary files, do we need deletable immutable data? That might solve it all. Maybe make deletion of MDs free to ease the burden of deleting.