Appendable Data discussion



I’ve read/skimmed all this and am still not clear what is being discussed. I’ll wait for the RFC :wink:


Your excellent post raises some very pertinent points about why this is a complex area that needs a lot of thought. The censorship game has changed. It’s more about creating multiple alternative ‘truths’ to sow mistrust in the genuine facts, diluting rather than deleting them. There’s nothing new about this tactic, but these days using trolls and bots to spread misinformation and disinformation is cheaper and more effective than banning outright. Even those arch-banners in the Chinese government are doing this now (although they ban plenty of stuff too). Coincidentally, I just read an article about that: “The new censors won’t delete your words — they’ll drown them out”.


It is not that anything published is true; the truth part is “this was published” (and perhaps by this authority), if that makes sense. So we do not, in the future, rewrite what was actually said and done now.


One of the goals here is to create accountability. This is one of the things that we lose at scale, but if the record is public and permanent, we can all be held to account for what we said in public in the past. Through that we can all have a better chance to filter out rubbish, and tune into a more reliable version of reality.


I think this is key. If we capture everything published and make it immutable, then I am sure there are/will be algorithms to find the most likely truth for anything published. Hopefully that includes studies, records and reports as well as the usual claimed truth. I would love to have records of what we do wherever all these wars are happening. When I was in the Middle East, the best place for western folks was considered to be Iraq, until we made them our enemy. There is a ton of this about, so capturing it all for our descendants to evaluate will be important.

So rather than history being written by the victor, we will hopefully have enough proof points for our descendants to write a better history than we have had. Well, we can hope so, but step 1 must be: do not allow published stuff to change.


How do you think this affects the will to publish in SAFE Network?


Do you really think so?

If we had a Facebook equivalent in SAFE Network, I don’t think all my posts would be copied somewhere. Why would they be? Who would do it?


Publishing will be forever, so set up a website and that is done forever, no renewal etc. People will like that and I feel there is an appetite. People who want to remove published stuff, though, will find it hard to accept the fact that they cannot, unlike today where we think we can. This is more honest.

Folk think they delete a Facebook post and it is gone. It is far from gone; it is already in the databases of many marketing companies and more. Or perhaps you can tell Google to invoke the right to be forgotten, but that only means they remove you (do they?) from the index; all the data is still there.

So today folk are fooled into believing they can delete published stuff, but it is not true. Worse, though, the original data is perhaps out of the reach of the common folk and in the hands of the powerful folk, lots of them, holding lots of people’s data.

It goes further: say I have a pic or a video of you that is produced by an AI and is indistinguishable (soon possible) from the real you. Then what do we do :wink:

Anyhow, we make everyone’s published data and private data last forever. Only your private data can be deleted, if you want that. You do this by throwing away the data map, and nobody can ever read your data again. It is that simple. The data map for public data is public, so you cannot just delete that unless we had signatures showing who put it up there so that they could delete it, etc. Then you have all sorts of edge cases and anonymity issues, never mind losing deduplication, caching etc.


Yes, but still, on my experiential level the situation is that if I write something slightly stupid, or make an embarrassing typo, I can remove or edit my post, and no one relevant in my social sphere will get back to it. Maybe it is somewhere in Facebook’s databases, but so what? It was nothing major. If something (edit: someone) digs it out from there, I don’t care that much. But at this moment I want to hide my mistake, and I can do it.

Of course it would be more mature to just admit your mistakes, but let’s be honest, how many really are? I see this as a barrier to adoption. Your promise is factually correct history, and I can see how that would really boost the image of a company etc., but the inability to mask your small mishaps will drive everyday folks away.


These two things are mutually exclusive

I agree with this as well :wink: (speaking as a typo king :smiley: :smiley: )


Of course if it is technically impossible I can’t argue against it.

But I see huge value for the user in the possibility to choose whether some public data will be up forever or not. And I also don’t find “You will be accountable forever” very appealing.


Considering most serious relational databases have an append-only transaction log, with the current state being the latest (head) of this, I think we can say there is plenty of precedent. Taken to the extreme, the transaction log is never truncated and you can return to any prior state.
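
To make the transaction log idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular database or the SAFE Network implements it; the class and method names (`AppendOnlyLog`, `state_at`) are made up for this example. Writes are only ever appended, the current state is what you get by replaying up to the head, and any earlier state is recovered by replaying fewer entries.

```python
class AppendOnlyLog:
    """Toy append-only log: entries are never modified or removed."""

    def __init__(self):
        self._entries = []  # (key, value) writes, oldest first

    def append(self, key, value):
        """Record a write and return the version number it produced."""
        self._entries.append((key, value))
        return len(self._entries)

    def state_at(self, version=None):
        """Replay the first `version` entries; None means the head."""
        if version is None:
            version = len(self._entries)
        state = {}
        for key, value in self._entries[:version]:
            state[key] = value
        return state


log = AppendOnlyLog()
log.append("post", "Helo world")    # version 1, with a typo
log.append("post", "Hello world")   # version 2, the "edit"

print(log.state_at())    # the head: what readers normally see
print(log.state_at(1))   # the earlier version is still recoverable
```

The "edit" never touches the first entry; it only moves the head.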


I agree some folk might not like to be accountable forever, even if they think they currently are not.

Remember though, regarding your earlier point about typos, even this forum keeps every version but shows your latest edit as the current one. I can still go through your edits. So it does not harm this forum.

Also, you can create a new ID. Twitter is a good example of accountable forever, but who is behind the account, etc.?

I am not sure it is as black and white as we imagine.


Those databases return the actual state, not the whole transaction history. Nobody cares how the DB works internally as long as it returns the current state fast, the client does not need gigabytes of RAM, the network does not need to send the whole backlog, and results can be displayed without additional processing on the client side. I can imagine append-only data as a combination of appending small changes and then storing the actual state, recalculated from those changes, after let’s say X appends. All done automatically by the network. So the client needs to download only the last checkpoint plus a few changes, and everything can be calculated on the client side from that checkpoint by client libs, so devs will work directly with the latest state of all data. This is an append-only strategy that acts like mutable data and is fast and easy to use.
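
A rough sketch of that checkpoint idea, in Python. Everything here is hypothetical: `CHECKPOINT_EVERY` stands in for the "X appends", and in this toy the log object itself writes the checkpoints, whereas the suggestion above is that the network would do it. The point is only that a reader replays the short tail after the last checkpoint instead of the whole history.

```python
CHECKPOINT_EVERY = 3  # hypothetical value for "X appends"


class CheckpointedLog:
    def __init__(self):
        self.deltas = []       # every change ever appended, never removed
        self.checkpoints = []  # (number of deltas covered, state snapshot)

    def append(self, key, value):
        self.deltas.append((key, value))
        if len(self.deltas) % CHECKPOINT_EVERY == 0:
            # Store the recalculated state as a new checkpoint.
            state, _ = self.latest()
            self.checkpoints.append((len(self.deltas), state))

    def latest(self):
        """Return (current state, number of deltas replayed to get it)."""
        covered, snapshot = self.checkpoints[-1] if self.checkpoints else (0, {})
        state = dict(snapshot)          # copy so the checkpoint stays untouched
        tail = self.deltas[covered:]    # only changes made after the checkpoint
        for key, value in tail:
            state[key] = value
        return state, len(tail)


log = CheckpointedLog()
for i in range(7):
    log.append("counter", i)

state, replayed = log.latest()
print(state, replayed)
```

With seven appends and a checkpoint every three, the reader starts from the checkpoint covering six changes and replays just one delta, while the full history stays available in `deltas`.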


I understand your prioritisation here and appreciate that you are open to the addition of true editing in the future.

Besides other important things, SAFE stands for freedom, and for me this means freedom of choice. I strongly believe that we should give people the opportunity to do with their data what they want (in both directions!):

  1. Give them the possibility to post something which can’t be removed (e.g. a politically critical post)
  2. Upload something (privately) which you might want to remove/edit at a later point (e.g. private photos)

As long as the choice for immutability can’t be reverted at a later point I don’t see a reason why we can’t have both possibilities. Both have legit use cases.

In order to not violate the recently published network principles I would therefore suggest to change #8 to:

The network is capable of storing data in perpetuity.


I am really glad this conversation is being had. The discussion about how MD is to be cached effectively has come up multiple times without a satisfactory answer, imo. This AD discussion, with each new element being immutable data, addressable in the same way as any other immutable data, makes a lot of sense in this regard.

I wonder how the gap between the impression and the reality of MD became so wide for so long. This brings a lot of clarity and removes most of the doubt about scalability of MD (content can be easily cached).
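
Here is how I picture it, as a small Python sketch. This is my own illustration and not the actual SAFE API: the `store` and `cache` dicts stand in for the network and a caching node, SHA-256 is just an assumed hash, and `AppendableData` is a made-up class. Each appended element is immutable data at a content-derived address, and the AD only holds links to those addresses.

```python
import hashlib

store = {}  # stands in for the network: address -> immutable bytes
cache = {}  # stands in for any caching node


def put_immutable(data: bytes) -> str:
    """Store data under the hash of its content and return that address."""
    address = hashlib.sha256(data).hexdigest()
    store.setdefault(address, data)  # same content, same address (dedup)
    return address


def get_immutable(address: str) -> bytes:
    """Content at an address never changes, so a cached copy never goes stale."""
    if address not in cache:
        cache[address] = store[address]
    return cache[address]


class AppendableData:
    """Holds only links; the data itself lives at the linked addresses."""

    def __init__(self):
        self.links = []

    def append(self, data: bytes):
        self.links.append(put_immutable(data))

    def latest(self) -> bytes:
        return get_immutable(self.links[-1])

    def version(self, index: int) -> bytes:
        return get_immutable(self.links[index])


page = AppendableData()
page.append(b"first draft")
page.append(b"second draft")
print(page.latest())
print(page.version(0))
```

Because the address is derived from the content, no cache invalidation is needed for the elements themselves; only the list of links changes over time.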

Obviously small changes to large chunks are going to be relatively expensive, but apps can be designed with this in mind: splitting changeable and non-changeable data, storing a delta in another location instead of a clone with the change, etc.

I think true temporal data storage may still be desirable, but that sounds like a future debate in light of this thread.


From what has been discussed above, getting the latest data should be done via client libs without heavy lifting being done by apps. Much like how a relational database exposes current state for queries.

The fact that you have the power to roll back to a prior version from the app is rather powerful though. Given that all immutable data is eminently cacheable, it should also scale well.

It will be interesting to discover how the AD records themselves can be effectively cached too, as I haven’t read much about this.


I can imagine this too, but the network itself still needs to construct the final data before presenting it. This would require multiple GETs internally, as well as some reconstruction operations. In fact, storing the result as a new element in an AD would make sense.

Perhaps client libs could be smart enough to facilitate a combination of the above to keep an optimal level of performance vs price. It is common practice to do similar things in relational databases (summary table for results of slow/expensive query), so I suspect similar techniques for AD optimisation will evolve.


I would say this is a place where I would fork and make it immutable :slight_smile: Of course we all have that choice, but I do like “warts and all” approaches to things. That is just my personal opinion though, of course, and folk might not want that, even though they currently have it.

I do very much appreciate what you are saying though, but this hill is one to climb.


ADs is the wrong term as well, I see now. It’s an ALD (append link data) object, and the current MD is really an MLD object from what has been said.

I am probably to blame for that, since much of my discussion was based on the MDs as described when they were introduced, but it seems that is not what is implemented as an MD. These are link objects linking to the actual data. @dirvine is that right? I never delved into the current implementation or wrote code against it, so I relied on the discussions from a while back.

Except David said a record of ownership is not kept when it’s changed. So I can publish a lot of bad things, then change ownership of my ALDs to you, and then you seem like the bad guy.

For public published things ownership must be a part of the kept data otherwise perpetual data is broken on that point.

Imagine reconstructing the history of events when ownership is changed. You cannot unless ownership history is kept. Imagine attributing KKK press releases to your neighbour (if you had one) by changing the ownership of those press releases to your neighbour’s blog site.

But if you do not keep ownership details when changed, then the victor only has to write their version of it and then change ownership to the loser, and then anyone researching that historical event is presented with views favourable to the victor (but falsely).

Yes, I learnt that MDs are really MLDs and not what I thought was being presented when MDs were introduced. And ADs are really ALDs (L === Link). The data is not stored in the MD or AD.
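
A tiny Python sketch of what I mean by keeping ownership as part of the data. The names (`OwnedRecord`, `owner_history`) are hypothetical and this is not how ALDs are implemented; it only shows that if ownership changes are appended rather than overwritten, the original publisher stays visible after a transfer.

```python
class OwnedRecord:
    """Published item whose ownership changes are appended, never overwritten."""

    def __init__(self, owner, content):
        self.content = content
        self.owner_history = [owner]  # append-only list of owners

    def transfer(self, new_owner):
        self.owner_history.append(new_owner)

    @property
    def current_owner(self):
        return self.owner_history[-1]

    @property
    def original_publisher(self):
        return self.owner_history[0]


record = OwnedRecord("victor", "press release")
record.transfer("neighbour")

print(record.current_owner)        # who holds it now
print(record.original_publisher)   # who actually published it
```

If only `current_owner` were stored, a researcher would have no way to tell that the neighbour never wrote the press release.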