Appendable Data discussion

appendable-data
immutable-data
mutable-data

#221

This is exactly what I’ve been thinking these last few days. OK, we’ll never be sure that some data has been erased, so we must deal with that. But we can take advantage of two universal laws that are related to each other: the optimization of resources and laziness.
A node may not take the trouble to remove flagged data, but it probably won’t bother to send it to a new neighboring node if it can avoid it. And a new node will probably refuse to save data that won’t provide any benefit.
I call it lazy delete. In a way this data will fade slowly as the network nodes of a section change.
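To make the idea concrete, here is a minimal sketch of lazy delete. Everything in it (the `Node` class, `flag_deleted`, `hand_over`) is a made-up illustration, not anything from the actual network code. The old node never bothers to erase the flagged chunk, it just skips it when relocating data to a new neighbour.

```python
class Node:
    """Hypothetical storage node, for illustration only."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}      # address -> data held by this node
        self.flagged = set()  # addresses flagged as deleted

    def store(self, address, data):
        self.chunks[address] = data

    def flag_deleted(self, address):
        # Lazy: only remember the flag, do not erase anything.
        self.flagged.add(address)

    def hand_over(self, new_node):
        # When a new neighbour joins, skip anything flagged as deleted.
        for address, data in self.chunks.items():
            if address in self.flagged:
                continue
            new_node.store(address, data)


old = Node("old")
old.store("a", b"keep")
old.store("b", b"fade")
old.flag_deleted("b")

new = Node("new")
old.hand_over(new)
```

After the handover the flagged chunk still sits on the old node but never reaches the new one, so it fades out as the old nodes leave the section.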


#222

I’d make a comment here that if you upload an immutable file then by definition you have uploaded perpetual data. Now if you “lose/destroy” the datamap then that file is effectively destroyed.

But I’m in agreement as far as the SD/MD/AD/whatever types are concerned.

This is solved by the fact that immutable data is where dedup occurs. Without the data map any chunk is effectively unreadable and just wasted space, until someone uploads the same thing; then dedup occurs and the network benefits.
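A rough sketch of why dedup falls out of immutable data, assuming chunks are addressed by a hash of their content. The `ChunkStore` class and the choice of SHA-256 are my assumptions for illustration only.

```python
import hashlib


class ChunkStore:
    """Hypothetical content-addressed store, for illustration only."""

    def __init__(self):
        self.chunks = {}  # address (hex digest) -> chunk bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content, so identical content
        # always lands on the same address and is stored only once.
        address = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(address, data)
        return address


store = ChunkStore()
first = store.put(b"same content")
second = store.put(b"same content")  # a second uploader, same bytes
```

Both uploads get the same address back and the store holds a single copy, so an orphaned chunk becomes useful again the moment anyone uploads the same thing.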


Of course the biggest issue is the question of public?/private?

It might be easier to step back a little and look at your earlier analogy of writing in the sand and how everyone who sees writing in the sand knows it’s temporary.

So if the SD/MD/AD/XD/whatever is marked as version-kept then it’s perpetual (private or public), and if not then it’s writing in the sand (private or public). This can be safeguarded against people marking data perpetual one day and later marking it as sand, by only allowing the marking to be changed from sand to perpetual, with no option to reverse that. Maybe allow for mistakes by letting it be reversed if the version has not changed AND no one has accessed it since.
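A minimal sketch of that marking rule, under the assumption that the data type can count versions and see every read. The `Entry` class and its method names are hypothetical, not a proposed API.

```python
class Entry:
    """Hypothetical entry with a one-way 'sand' -> 'perpetual' flag."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self.perpetual = False  # False means "writing in the sand"
        self._marked_at_version = None
        self._accessed_since_mark = False

    def read(self):
        self._accessed_since_mark = True
        return self.value

    def write(self, value):
        self.value = value
        self.version += 1

    def mark_perpetual(self):
        if not self.perpetual:
            self.perpetual = True
            self._marked_at_version = self.version
            self._accessed_since_mark = False

    def revert_to_sand(self):
        # Only allowed to fix a mistake: same version, nobody has read it.
        untouched = (
            self.perpetual
            and self.version == self._marked_at_version
            and not self._accessed_since_mark
        )
        if untouched:
            self.perpetual = False
        return untouched


entry = Entry("draft")
entry.mark_perpetual()
undone = entry.revert_to_sand()     # nothing happened in between

entry.mark_perpetual()
entry.read()                        # someone accessed it
locked_in = entry.revert_to_sand()  # too late to go back
```

The first revert succeeds and the second is refused, so the entry stays perpetual once anyone has seen it in that state.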

You can.

The forum though gives a 3-5 minute window so any edits in that window are lumped together.

So for 3-5 minutes after posting you can edit and it doesn’t show up as an edit.

OR if you make an edit there is that 3-5 minute window to make multiple edits and it shows up as one larger edit.

Thus it’s not like what AD will do, but halfway I guess.


If you can truly modify private data then the solution for public data is to do what most systems do (e.g. writing books, email, etc.): use apps that make draft copies “in sand” of your writing until you are satisfied it’s correct, then publish (make public) the work.


@dirvine David, maybe the solution to perpetual data is to allow true delete and then have a billion monkeys on “typewriters” to recreate the lost files as public. :thinking:


#223

Yes, because the network doesn’t know time.

But having a few minutes window during which an AD could be saved several times without creating intermediate versions would answer some objections raised in this topic.


#224

As you say. It did occur to me that would solve a few issues too, but didn’t suggest it because of the time factor.

Even the “if no one accessed it since” is problematic because of caching. But then again changing it to perpetual would invalidate any caches I suppose and could work. So rather than time have “not accessed since last change”.


#225

That’s right, good that you point it out, it is incoherent. I did mean the SD/MD/AD in the first part, but of course the second part about dedup is not applicable to those types (disregarding the possibility of AD just pointing to ImD).

So, to clarify, I am considering the ImD to still be immutable, and outside of the discussion.

And I might as well take the chance to point out that there was a bantering (if that’s the right word I’m seeking) tone to that long text of mine, it was late at night when I wrote it and I got “feeling” :slight_smile: I often source interesting (to varying degrees) ideas late at night, and get more vivid in my expression of them.

It’s not so much that I think that view is necessarily a full picture, or the right way to see things, it’s more like “oh, oh everyone, I can feel it…! …here is a way to see it” and then I enjoy going into that thoroughly … FWIW :joy:


#226

@dirvine

In all these discussions we may have glossed over an important question.

It’s about datamaps, where they are stored and how to “lose” them. This is in respect to private data, since once it’s public it cannot be reliably lost anyhow.

There was talk a while back that the datamap would be stored in the last chunk written out for the file and a pointer to that would be supplied.

Then there is talk above that it is stored in an AD.

But in both cases the datamap of a private file cannot be lost, because it is in perpetual storage and there is a remote chance it can be found by various means and a trawl through AD history.

Yes if you do not share your AD addresses it would be hard for a random person to search for it.

BUT I guess the real issue is that you can retrieve it easily enough by trawling back through your ADs, and if you can, then so can the authorities. Very bad for whistleblowers and even for ordinary folk.
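A toy sketch of the worry, assuming an AD behaves like an append-only list where a “delete” is just another appended entry. The `AppendableData` class here is my own simplification, not the real type.

```python
class AppendableData:
    """Hypothetical append-only container, for illustration only."""

    def __init__(self):
        self._history = []

    def append(self, value):
        self._history.append(value)

    def current(self):
        return self._history[-1] if self._history else None

    def history(self):
        # Every earlier entry is still there for whoever holds the address.
        return list(self._history)


ad = AppendableData()
ad.append("datamap-for-private-file")
ad.append(None)  # the owner "deletes" by appending an empty entry

latest = ad.current()
trawled = ad.history()
```

The latest entry is empty, yet the old datamap is still in the history, which is exactly what anyone with the owner’s credentials could trawl back to.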


#227

I second that. The fact that the network has been planned from the start to be efficient and not hog resources, using something useful such as PoR, deduplication, opportunistic caching, the anti-spam mechanisms planned for outboxes, etc., was one of the main reasons I was initially attracted to SAFE. Keeping private data after the owner wants it deleted would be a complete waste of resources.

The only arguments I could see against it atm would be if someone was left logged in and someone else deleted their data against the owner’s will, or perhaps the owner accidentally deleted data because he or she drank too many pints and would like to recover it. This could be an opportunity for an app to have a recently deleted folder that only actually deletes after, say, 30 days.
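Something like this is what I have in mind for the recently deleted folder. It is purely an app-side sketch, and the `TrashFolder` class and the 30-day `GRACE` constant are assumptions of mine. The app tracks time itself, since the network does not.

```python
from datetime import datetime, timedelta

GRACE = timedelta(days=30)  # assumed grace period before real deletion


class TrashFolder:
    """Hypothetical app-level folder with a 'recently deleted' area."""

    def __init__(self):
        self.files = {}  # name -> data
        self.trash = {}  # name -> (data, time it was deleted)

    def delete(self, name, now):
        # Move to trash instead of deleting straight away.
        self.trash[name] = (self.files.pop(name), now)

    def restore(self, name):
        data, _ = self.trash.pop(name)
        self.files[name] = data

    def purge(self, now):
        # Only now does the app actually let go of the data.
        expired = [n for n, (_, t) in self.trash.items() if now - t >= GRACE]
        for name in expired:
            del self.trash[name]
        return expired


folder = TrashFolder()
folder.files["notes.txt"] = b"late night ideas"
folder.files["photo.jpg"] = b"pub photo"

day0 = datetime(2000, 1, 1)
folder.delete("notes.txt", day0)
folder.delete("photo.jpg", day0)
folder.restore("notes.txt")  # regretted the next morning

early = folder.purge(day0 + timedelta(days=29))
late = folder.purge(day0 + timedelta(days=30))
```

Nothing is purged on day 29, the photo goes on day 30, and the restored notes are untouched.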
Just some thoughts.


#228

@Nigel Providing the capability to delete private data might well cost resources, and could have other effects - on performance, security etc - so these decisions are complex.


#229

It isn’t, because the previous versions in SAFE are open for anyone to fetch, whereas nowadays only Facebook can access the versions I thought were gone.

I’m quite sure that there are going to be easy tools to utilize all data that is openly available. For example a “show previous state” button in a browser etc. Perfect stuff for procrastination.


I understand the hurry to get the network published and I think it can be done with append-only data. Still I would like to see the possibility of true deletion of private and public data to be a goal, maybe for later development.

This whole privacy / publicity thing is actually quite complicated. I was thinking yesterday, walking down the street, that everybody sees me walking; I’m not hidden in any way. Still it would be awkward if everybody suddenly stopped and started to stare at me. Privacy is not only about “not seeing” something, it is also about the shared behaviour of “not saying out loud” something. Children often “publish” stuff that everybody knows but leaves unsaid. Sometimes it’s funny, sometimes awkward.

I hope I am not rambling too much…


#230

I’m on board with the idea of the ability to delete private data, I think that is desirable from a privacy perspective, but the ability to delete public data is fraught, and problematic.

Suddenly I can be compelled to take down information from the network that is being utilised by and informing the world. This could be through legal obligation, social pressure, or other forms of coercion. And we’re back into the realms of DMCA takedown notices, book-burnings etc.


#231

Only if they know who you are. There would still be the possibility to publish things permanently if done anonymously. Or if you throw away the keys.

I think that if deletable public data is technically possible, it will be done sooner or later, if not otherwise then by forking. It’s the way people are used to thinking the internet works, even though it does not. Maybe there is room for two networks, one with only permanent public data and the other with both permanent and impermanent.

Anyway SAFE with undeletable public data is much better than no SAFE at all.


#232

Yeah, but in many instances it’s desirable and right that I would want to publish my identity, or that of my organisation, alongside my work. Think about journalists, authors, and academics the world over. That’s part of the history and context of the work too, it’s important to inform the reader.

It is technically possible; but also most probably, as I’m arguing, undesirable.


#233

Exactly. I should say in this convo, actually selecting data to delete and doing that is a huge area for the network and one reason it took us so long to make sure it is possible. Now before everyone shouts, just do it, I think this convo shows many aspects of not doing that straight off the bat.

  1. Perpetual data (AD and ID) is valuable
  2. It can be used to allow data to evolve with history
  3. We can allow apps to show latest versions of evolving data etc.

All this applies if the data is published, i.e. the data map is available and open. We know private data does not have available and open data maps; they are stored with the user of the network, part in session packets and part in encrypted chunks on the network. So users can delete data at any time, simply by no longer having access to the data maps.

So what does delete data mean to us (right now)? It means removing the ability to read obfuscated and encrypted data that could have formed human-readable data. That is what we mean by delete: we mean remove the ability for anyone to read the data in an understandable form. Sort of like me saying here is a binary 0, it is part of some file; you can read the zero, but you still have nothing. In our case you don’t even know what file it belongs to or how many other zeros and ones you need, never mind how to arrange and decode them. So delete and keep it incomprehensible are very similar things.

Another thing this thread will show is: if we keep “you cannot delete published data” as the story, then what can be built? Well, everything the internet currently can do for a start, I would argue.

Then when we finish this debate and launch knowing we can have delete on the network, will we ever need to use it? I am not sure, but let’s see. I can see a ton of debates and I could see both sides, but I think poking this dragon enough will show that even with immutable chunks and append-only data lists of pointers we can have everything we need.
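As a toy illustration of delete meaning “lose the data map”, here is a small sketch. It is not the network’s self-encryption: the XOR keystream, the random per-chunk keys, the tiny chunk size and the `upload`/`download` helpers are all assumptions made up for the example.

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream built from SHA-256 in counter mode (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


network = {}  # address -> encrypted chunk; the network never forgets these


def upload(content: bytes, chunk_size: int = 8):
    data_map = []  # ordered (address, key) pairs, held only by the owner
    for i in range(0, len(content), chunk_size):
        key = os.urandom(32)
        encrypted = xor(content[i:i + chunk_size], key)
        address = hashlib.sha256(encrypted).hexdigest()
        network[address] = encrypted
        data_map.append((address, key))
    return data_map


def download(data_map):
    return b"".join(xor(network[addr], key) for addr, key in data_map)


secret = b"whistleblower notes"
data_map = upload(secret)
recovered = download(data_map)

data_map = None  # "delete": the chunks stay put, the map is gone
```

With the map the file comes back intact. Without it, what remains on the network is a handful of opaque chunks with no order, no keys and no hint of which file they belonged to.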

I caveat this by saying there may be a place for owned immutable data, that is never shared to anyone except the owner who can delete it, no sharing, no caching etc. But even there I am not convinced as it probably did not ever need to exist on the network.


#234

Maybe there is a misunderstanding? I am not wanting to get rid of permanent public data as a possibility. It would be really great for newspapers, books… that are even nowadays published with permanency as a goal. But I am against having to make all public data permanent, like personal websites, Facebook-type things etc. They should be deletable by default. Like they now are, effectively speaking. I know people who have deleted their Facebook account, and even though it is kept in the basement of Facebook, it is effectively deleted from their social context, from their realms of addictive behaviour etc. This should be possible in SAFE if it is planned to be a replacement for the current internet.

Scientific papers, magazines and newspapers are all proofread many times, and the bar for publishing is considerably higher than it is in this forum or on Twitter… I would not like to raise the bar of all public activity to the level of the printed word. It would kill creativity. I want a chance to be a bit foolish, and not to have to think through all my choices for decades to come.


#235

Firstly that would be very bad in itself don’t you agree?

Secondly, it isn’t true. ‘Deleted’ Facebook data is potentially available to anyone that Facebook knowingly shares it with (companies, employees, governments), and unknowingly too, through data breaches etc. We can hope it isn’t used, but I think where Facebook, Google et al are concerned it is clear we should assume the worst.


#236

Have I succeeded in a minor way in showing the importance of keeping ID in history? Or did you add that bit just to satisfy me so I quieten down a little :stuck_out_tongue::stuck_out_tongue_winking_eye:

Do these session packets only exist while one is logged in? Or are they longer term, in ADs? I’d expect they need to be a part of the account structure ADs/Immutable data. And since they can be deleted, does that imply ADs? If they are in ADs then history is kept, so a retrace of history will reveal that map again, won’t it? And thus the implications if account details are “rubber hosed” out of the user by a court or otherwise.


#237

You are definitely right. I am not arguing against that. I am just saying that my ability to control my “public persona” is better with the current model, than it would be in SAFE. Check my reply to @JimCollinson above.


#238

Except they aren’t, are they? Where are you hosting your personal website? Squarespace? Amazon Servers? Is it cached by a CDN? How can you be sure it’s really deleted… you can’t, it’s not in your control.

And let’s not forget about the Internet Archive.

And then there is of course anyone who cares to take a copy, or perhaps tries to put words in your mouth, or rewrite/distort what you said and then deleted, by modifying a screengrab.

With permanent versioned history of public data, then at least it is transparent, upfront, open for anyone to observe and be informed.


#239

These are binary blobs users create for login. They have little structure, but hold data maps (to a tree of other data maps).


#240

If they hold the data maps for private data then they have to survive across logins, don’t they? So where would the datamaps be held (or the pointer to the data map)?